
JP2813601B2 - Tabular document recognition device

Info

Publication number: JP2813601B2
Application number: JP2045548A
Authority: JP
Inventors: 浩司片野
Original assignee: Fuji Electric Co Ltd
Current assignee: Fuji Electric Co Ltd
Priority date: 1990-02-28
Filing date: 1990-02-28
Publication date: 1998-10-22
Anticipated expiration: 2013-10-22
Also published as: JPH03250279A

Description

DETAILED DESCRIPTION OF THE INVENTION [Industrial Application Field] The present invention relates to a tabular document recognition device that, when image data of a tabular document is input, cuts out the ruled lines and the inter-line ruled lines in the table and accurately recognizes the characters in the frames enclosed by these ruled lines.

[Conventional technology]

Conventional tabular document recognition cuts out only the ruled lines that are actually drawn, such as solid and dotted lines (hereinafter also called real ruled lines), and then recognizes the characters. As one such approach, the applicant has proposed the following (Japanese Patent Application No. Hei 1-214934; hereinafter, the previously proposed method).

In this method, for a tabular document such as that shown in FIG. 10, the initially specified processing area 11 is divided into strips of constant band width 12, and a projection in the vertical (y) direction is taken for each band. Next, portions that may be part of a ruled line (ruled line candidates) 13 are extracted from the projection data. Once candidates have been extracted for every band, the candidate with the largest overlap with a given candidate of interest is found in the adjacent band and linked to it as the same ruled line candidate. Starting from the linked candidate, the search for the most-overlapping candidate in the adjacent band is repeated, producing a connected candidate group 14 that is regarded as one ruled line. The horizontal projection of such a connected candidate group is defined here as the partial projection 15. Taking this projection, which is perpendicular to the band projection, determines the coordinates 16 of both ends and yields a ruled line. The ruled lines obtained in this way are then merged or integrated to give the final set of ruled lines. FIG. 10 shows the extraction of horizontal ruled lines, but vertical ruled lines are of course extracted in the same way.
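The band-wise extraction and linking described above can be sketched as follows. This is a minimal illustration in Python, not the applicant's implementation: the function names (`band_profiles`, `runs`, `overlap`, `link`), the pixel-count threshold, and the representation of the image as a list of 0/1 rows are all assumptions made for the example.

```python
def band_profiles(image, band_width):
    """Cut a binary image (rows of 0/1) into vertical strips and, for each
    strip, count the black pixels in every row (projection onto the y axis)."""
    width = len(image[0])
    return [[sum(row[x0:x0 + band_width]) for row in image]
            for x0 in range(0, width, band_width)]


def runs(profile, threshold):
    """Return (start, end) row intervals, end exclusive, where the profile
    reaches the threshold. Each interval is one ruled line candidate."""
    found, start = [], None
    for y, value in enumerate(profile):
        if value >= threshold and start is None:
            start = y
        elif value < threshold and start is not None:
            found.append((start, y))
            start = None
    if start is not None:
        found.append((start, len(profile)))
    return found


def overlap(a, b):
    """Number of rows shared by two intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def link(candidates_per_band, band, cand):
    """Starting from one candidate, repeatedly pick the candidate in the next
    band with the largest overlap, stopping when nothing overlaps."""
    chain = [(band, cand)]
    for nxt in range(band + 1, len(candidates_per_band)):
        best = max(candidates_per_band[nxt],
                   key=lambda c: overlap(cand, c), default=None)
        if best is None or overlap(cand, best) == 0:
            break
        chain.append((nxt, best))
        cand = best
    return chain


# Hypothetical 6x8 image: a 2-pixel-thick line that steps down by one row
# halfway across, imitating a slightly tilted ruled line.
image = [[0] * 8 for _ in range(6)]
for x in range(8):
    top = 2 if x < 4 else 3
    image[top][x] = image[top + 1][x] = 1

candidates = [runs(p, threshold=3) for p in band_profiles(image, band_width=4)]
chain = link(candidates, 0, candidates[0][0])
```

The sketch links only toward later bands; the text applies the same search in both directions and then takes a partial projection of the linked group to fix the end points, which is omitted here.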

Inter-line ruled lines, which are drawn between one character string and the next, are handled in the same way as real ruled lines. The processing area of the tabular document image is divided into bands, a projection is taken in the vertical (y) direction, and each blank portion that may be part of an inter-line ruled line is extracted as a candidate. For each candidate, the search for the most-overlapping candidate in the adjacent band is repeated to build a connected candidate group for the inter-line ruled line, and a partial projection of this group likewise determines the inter-line ruled line. Unlike real ruled lines, inter-line candidates are picked up from every blank portion where nothing was originally drawn, so their sizes vary widely. The inter-line candidates are therefore split as necessary to make the linking easier. This is shown in FIG. 11.

First, the slope is calculated from the real ruled lines already obtained, and the ratio of the band width to that slope is taken as the candidate shift width 20. If there are no real ruled lines, the candidate shift values are measured while actually splitting, and their average is used. Next, for an inter-line candidate of interest 18, the candidates that overlap it in the bands adjacent on the left and right are counted. The candidate of interest 18 is then split, using the candidate shift width 20, to match the side with more overlapping candidates, as shown in FIG. 1(b). The resulting candidates are used as split candidates 19 in the candidate linking process.

Once the real ruled lines or inter-line ruled lines have been obtained in this way, the coordinates of the frames enclosed by these ruled lines are sent to the character recognition unit, so that the characters in the frames are recognized and the structure of the table can be grasped.

[Problems to be solved by the invention]

However, with the method described above, the accuracy of cutting out inter-line ruled lines falls when the input image is tilted. In a method that divides the image into bands and takes a perpendicular projection for each band in order to extract inter-line ruled lines, the end point coordinates (y coordinates) of the connected candidates become misaligned. The candidate shift width used to split candidates is a value of the band width relative to the slope. The slope is a real number, whereas the candidate shift width must be expressed as an integer, so the slope is not reflected exactly. As a result, the slope of the inter-line ruled line 23 obtained from the candidate group 21, which is formed by linking split candidates as in FIG. 12, deviates from the slope of the real ruled line 22 obtained before the inter-line ruled lines are extracted, and measures such as correcting the slope of the inter-line ruled line become necessary.
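The effect of forcing the shift width to an integer can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that the slope is expressed as rise over run and that the shift per band is the slope multiplied by the band width; the source states only that the shift width relates the band width to the slope. The function name `shift_positions` and the numbers are hypothetical.

```python
def shift_positions(slope, band_width, n_bands):
    """Compare the integer shift actually applied per band with the exact
    shift implied by a real-valued slope (assumed model, see lead-in)."""
    exact_step = slope * band_width      # real-valued shift per band
    applied_step = round(exact_step)     # the shift width must be an integer
    return [(k * applied_step, k * exact_step) for k in range(n_bands + 1)]


# Hypothetical values: slope 0.02, bands 32 pixels wide, 10 bands.
positions = shift_positions(slope=0.02, band_width=32, n_bands=10)
applied, exact = positions[-1]
error = applied - exact   # accumulated y offset between the two after 10 bands
```

With these values the exact shift per band is 0.64 pixels but the applied shift is 1 pixel, so the difference grows with every band. This is the kind of slope mismatch between the inter-line ruled line and the real ruled line that the text describes.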

It follows that the accuracy of the ruled line, that is, exactly where in an originally empty blank portion the line is drawn, becomes a problem particularly when the image is tilted.

Accordingly, an object of the present invention is to make it possible to extract inter-line ruled lines accurately even when the image is tilted.

[Means for solving the problem]

The device is provided with an image input unit for inputting image data; a real ruled line extraction unit for cutting out solid ruled lines from the image data; a tilt detection unit for detecting the tilt of the image from the extracted ruled lines; a projection unit for obtaining a projection of the image in the detected tilt direction, using an image from which the ruled lines have been erased; an inter-line ruled line extraction unit for extracting inter-line ruled lines from the projection data; and a character recognition unit for recognizing the characters in the frames enclosed by the real ruled lines and the group of inter-line ruled lines.

[Action]

The real ruled lines are extracted and their slope is obtained, and the projection used to cut out inter-line ruled lines is taken in that tilt direction. This makes the cutting out of inter-line ruled lines independent of the image tilt, and the improved extraction accuracy of inter-line ruled lines in turn improves the recognition accuracy of the table structure.

[Embodiment]

FIG. 1 is a block diagram showing an embodiment of the present invention.

A tabular document 1 containing real ruled lines and inter-line ruled lines is converted into binary image data by the image input unit 3, which is controlled by the host CPU 2, and the data is stored in the image memory 4. The ruled line extraction unit 5 consists of a solid ruled line extraction unit 51, a tilt detection unit 52, a projection unit 53, and an inter-line ruled line extraction unit 54. It fixes the processing area on the image memory 4 and extracts the real ruled lines and the inter-line ruled lines. The coordinates of the frames enclosed by the resulting ruled lines are sent to the character recognition unit 6, where the characters in the frames are recognized.

A specific description follows with reference to FIGS. 2 to 9. The real ruled lines are extracted in the same way as in the conventional method, so the details are omitted.

Once the real ruled lines have been extracted, their slope is calculated and the result is stored in a predetermined memory. Inter-line ruled lines are extracted here using an image from which the vertical and horizontal real ruled lines have been erased, as shown in FIG. 2. (The dotted lines in FIG. 2 indicate where real ruled lines originally existed and have been erased.) Erasing them clearly separates the blank inter-line portions from the character portions, which makes it easier to extract inter-line ruled line candidates (portions that may be part of an inter-line ruled line). Next, the interior of the frame 11 enclosed by the outermost ruled lines is taken as the processing area for inter-line ruled lines (for a tabular document with no outermost ruled lines, the initially specified processing area is used as is). The projection method is determined from the stored slope value, the processing area 11 is divided into bands of constant width as shown in FIG. 3, and a projection is taken in the tilt direction. Then, to extract inter-line ruled line candidates from each band, the projection data is binarized again at a certain binarization level 24, as shown in FIG. 4. From the binarized data, each portion where "0" continues for at least a fixed width is extracted as an inter-line candidate 17, and each portion where "1" continues for at least a fixed width is extracted as a character candidate 25.

Because the inter-line candidates 17 obtained in this way cover every blank portion, their sizes vary, as shown in FIG. 5. Candidates are therefore split as necessary during linking. Specifically, an inter-line candidate is split when, looking at the bands adjacent on its left and right, several inter-line candidates connect to it. For example, as in FIG. 6, for the inter-line candidate of interest 18, the candidates in each adjacent band that overlap it are counted. The candidate is then split to match the direction with the larger count. In FIG. 6 the count in the forward direction (to the right of the candidate of interest) is 2 and the count in the backward direction (to the left of it) is 3, so the inter-line candidate 18 is split into three to match the backward direction. Because the split is based on the image tilt and the projection in the perpendicular direction, no positional shift arises, unlike the previously proposed method, which splits by the ruled line shift width.
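The splitting rule can be sketched as follows. The counting of overlapping neighbors and the choice of the side with the larger count follow the text; how each piece is cut is not spelled out in the source, so this sketch assumes each piece is the intersection of the candidate with one overlapping neighbor. The function names and the tie-breaking toward the left side are also assumptions.

```python
def overlap(a, b):
    """Number of rows shared by two (start, end) intervals, end exclusive."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def split_candidate(cand, left_band, right_band):
    """Split `cand` to match the adjacent band that has more candidates
    overlapping it. Returns [cand] unchanged if no split is needed."""
    left = [c for c in left_band if overlap(cand, c) > 0]
    right = [c for c in right_band if overlap(cand, c) > 0]
    guide = left if len(left) >= len(right) else right   # tie: assumed left
    if len(guide) < 2:
        return [cand]
    # Assumed cutting rule: clip the candidate to each overlapping neighbor.
    return [(max(cand[0], c[0]), min(cand[1], c[1])) for c in sorted(guide)]


# Hypothetical intervals imitating FIG. 6: 3 overlaps on the left, 2 on the right.
cand = (10, 40)
left_band = [(8, 14), (18, 24), (30, 42)]
right_band = [(9, 20), (26, 41)]
pieces = split_candidate(cand, left_band, right_band)
```

Here the left band has the larger count, so the candidate is split into three pieces, matching the example in the text.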

Next, the candidates obtained in this way are treated as split candidates 19, and linking is performed among the inter-line candidates and character candidates, including these split candidates. First, for the inter-line candidate of interest 18, the candidate that overlaps it most in the forward direction is found among the candidates of the adjacent band and linked to it as the same ruled line candidate. From the linked candidate, the most-overlapping candidate among the candidates in the next band is found, and this operation is repeated. The same processing is carried out in the opposite (backward) direction, and the connected candidate group 14 for a single ruled line is extracted as shown in FIG. 7.

Next, whether an inter-line ruled line may be extracted from this connected candidate group 14 is judged by examining what kind of candidate each member of the group is. For example, for a connected candidate group 14 such as that in FIG. 8, the numbers of inter-line candidates 17, split candidates 19, and character candidates 25 are counted, and the group is deemed suitable as an inter-line ruled line when the condition (number of inter-line candidates) + (number of split candidates) > (number of character candidates) × P is satisfied, where P is an arbitrary level. In other words, when the proportion of character candidates is high, the inter-line candidates in that group are often margins between one character string and another, and drawing a ruled line there would cause problems. For a connected candidate group that satisfies the condition, a partial projection is taken in the direction parallel to the tilt, for example as in FIG. 9. If a portion of "0" continues for at least a certain level, the coordinates of its two ends are determined, which gives the inter-line ruled line 23.
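The validity condition above is simple enough to state directly in code. In this sketch the string labels "line", "split", and "char" and the function name are assumptions for illustration; P is the arbitrary level from the text, and the source gives no recommended value for it.

```python
def is_valid_inter_line_group(kinds, p):
    """Apply the condition from the text:
    (inter-line candidates) + (split candidates) > (character candidates) * P."""
    n_line = kinds.count("line")
    n_split = kinds.count("split")
    n_char = kinds.count("char")
    return n_line + n_split > n_char * p


# Hypothetical groups: one mostly blank, one dominated by character candidates.
mostly_blank = ["line", "split", "char", "line"]
mostly_text = ["line", "char", "char"]
accept = is_valid_inter_line_group(mostly_blank, p=1.5)
reject = is_valid_inter_line_group(mostly_text, p=1.0)
```

A larger P makes the test stricter, so fewer groups that pass through character candidates are accepted as inter-line ruled lines.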

With the above, the table formed by the real ruled lines and the inter-line ruled lines has been recognized.

[Effect of the invention]

According to the present invention, the real ruled lines are extracted, their slope is obtained, and the projection used to cut out inter-line ruled lines is taken in that tilt direction, so that ruled lines can be drawn at accurate positions. Because the extraction accuracy of ruled lines is improved, the characters enclosed by the ruled lines can also be recognized correctly, which gives the advantage that a tabular document can be grasped accurately.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention. FIG. 2 is an explanatory diagram of an example image from which the solid ruled lines have been erased. FIG. 3 is an explanatory diagram of band projection. FIG. 4 is an explanatory diagram of the re-binarization level. FIG. 5 is an explanatory diagram of the candidate extraction method. FIG. 6 is an explanatory diagram of the candidate splitting method. FIG. 7 is an explanatory diagram of the candidate linking method. FIG. 8 is an explanatory diagram of the linked state of candidates. FIG. 9 is an explanatory diagram of partial projection. FIG. 10 is an explanatory diagram of the conventional method of extracting solid ruled lines. FIG. 11 is an explanatory diagram of the conventional candidate splitting method. FIG. 12 is an explanatory diagram of the error caused by image tilt.

Description of symbols: 1: tabular document; 2: host CPU; 3: image input unit; 4: image memory; 5: ruled line extraction unit; 6: character recognition unit; 11: processing area; 12: band width; 13: ruled line candidate; 14: connected candidate group; 15: partial projection; 16: ruled line end point; 17: inter-line candidate; 18: inter-line candidate of interest; 19: split candidate; 20: candidate shift width; 21: connected candidate group for an inter-line ruled line; 22: real ruled line; 23: inter-line ruled line; 24: re-binarization level; 25: character candidate; 51: solid ruled line extraction unit; 52: tilt detection unit; 53: projection unit; 54: inter-line ruled line extraction unit.

Claims

(57) [Claims]

A tabular document recognition device comprising: an image input unit for inputting image data; a real ruled line extraction unit for cutting out solid ruled lines from the image data; a tilt detection unit for detecting the tilt of the image from the extracted ruled lines; a projection unit for obtaining a projection of the image in the detected tilt direction, using an image from which the ruled lines have been erased; an inter-line ruled line extraction unit for extracting inter-line ruled lines from the projection data; and a character recognition unit for recognizing characters in the frames enclosed by the real ruled lines and the group of inter-line ruled lines.