JP4974367B2

JP4974367B2 - Region dividing method and apparatus, and program

Info

Publication number: JP4974367B2
Application number: JP2007239484A
Authority: JP
Inventors: 敏文山合
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2007-09-14
Filing date: 2007-09-14
Publication date: 2012-07-11
Anticipated expiration: 2027-09-14
Also published as: JP2009070242A

Description

The present invention relates to a region dividing method and apparatus that identify and extract regions such as character regions, table regions, and photograph regions from the image data of a document, and to a computer program for executing the method.

When image data generated by reading a document is processed and reused as text data, it is important to acquire layout information that indicates, for example, where text written in characters is located. For example, to perform character recognition on a document, the document is read as image data with an image scanner or the like, and the image is divided into character regions, graphic regions, and so on to obtain layout information. Character recognition is then applied to the character regions to convert them into text data. In this case, region division serves as preprocessing for character recognition. Region division is an important technique not only as preprocessing for character recognition but also when document image data is filed on an optical disc or transmitted by facsimile, for example to optimize how regions with different attributes in the document image are processed.

Various techniques for automatic region division have been proposed, such as a method using projection (Patent Document 1) and a method based on merging black pixels (Patent Document 2). With these automatic techniques, however, it is very difficult to divide documents with complex layouts or irregular formats accurately.

As a technique that performs region division using manually supplied information, there is a region dividing method and apparatus in which the user selects arbitrary regions in the document image and designates one point inside each selected region, and the region dividing apparatus uses the positions of the designated points as evidence to divide the document image (Patent Document 3). With this technique, actively using the user's region designation information can improve the accuracy of region division.

However, because the technique described in Patent Document 3 identifies regions by designating one point in each region to be divided, accurate identification is not possible unless almost all regions in the document image are designated. That is, even if only one of ten regions needs to be identified, accurate identification cannot be performed unless the remaining nine regions, which do not need identification, are also selected. The technique is therefore not suited to correcting the result of automatic region identification, or to cases where only one or two regions need to be identified. In addition, on a small screen such as the display of a PDA (personal digital assistant) or the operation panel of an MFP (multifunction printer), it is difficult to designate a point (position) accurately.

Patent Document 1: JP-A-5-266250
Patent Document 2: JP-A-5-274472
Patent Document 3: JP-A-9-128479

The present invention has been made to solve these problems. Its object is to enable accurate region identification from a rough position designation of the region that needs to be identified when region division is performed using manual input.

The region dividing method of the present invention comprises a step of displaying a document image; a step of acquiring position information of a user-designated region, which is designated by a user and is a part of a division target region of the displayed document image; and a step of extracting the division target region based on the document image data in the user-designated region. The extracting step comprises a feature extraction step of extracting information inside the user-designated region; an attribute classification step of classifying the user-designated region, based on the extracted information, as a character candidate, a table candidate, or a figure or photograph candidate; and a step of performing region extraction processing according to the classified candidate.
The program of the present invention causes a computer to execute each step of the region dividing method of the present invention.
The region dividing apparatus of the present invention has a computer on which the program of the present invention is installed.

According to the present invention, when region division is performed using manual input, a region can be extracted accurately from a rough position designation of the region that needs to be identified. Furthermore, by determining the attribute before region extraction and using the extraction method best suited to the determined attribute (character, table, or figure or photograph), an optimal extraction result can be obtained.

Embodiments of the present invention are described below with reference to the drawings.
[First Embodiment]
FIG. 1 is a schematic block diagram showing the configuration of the region dividing apparatus according to the first embodiment of the present invention. This apparatus is configured so that the region division processing can be executed by digital processing on a microprocessor or the like.

The region dividing apparatus includes a scanner 1 that optically reads a document and converts it into document image data in the form of an electrical signal; a CPU 2 that controls the entire apparatus and performs various processing; a memory 3 that stores region-divided document image data and the like; a display 4 that displays document image data input from the scanner 1, region-divided document image data, and the like; a printing device 5 that prints region-divided document image data and the like; a pointing device 6 such as a mouse or a tablet and pen; a program storage ROM/RAM 7 that stores the programs used when the CPU 2 operates; a work area RAM 8 that temporarily stores data and programs while the CPU 2 operates; and a CD-ROM/FD drive 9 for reading programs from a CD-ROM or FD. These components are connected to a bus 10.

FIG. 2 is a flowchart showing the operation of the region dividing apparatus of this embodiment.
First, an image is input in step S1. Here, the sample document 11 shown in FIG. 3 is set on the scanner 1. The document image data of the sample document 11 output from the scanner 1 is sent to the display 4 in step S2, and the image of the sample document 11 is displayed.

Next, in step S3, the user inputs a designated region by operating the pointing device 6 while viewing the image of the sample document 11 on the display 4. Here, as shown in FIG. 4, the user roughly designates the position of the table 11a by enclosing part of it with a circle 12. The region designated by the user in this way (hereinafter, the user-designated region) is displayed on the display 4 in step S4.

Next, in step S5, the CPU 2 converts the user-designated region into a form that is easy for the CPU 2 to handle. For example, the region may be handled as rectangle data based on the maximum and minimum coordinates of the area the user enclosed with the circle 12, or the user-designated region itself may be cut out and handled as is.
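The rectangle conversion mentioned for step S5 can be sketched as follows. This is a minimal illustration, assuming the freehand stroke arrives as a list of (x, y) pixel coordinates; the function name `bounding_rect` is hypothetical and does not come from the patent.

```python
def bounding_rect(stroke):
    """Return (x_min, y_min, x_max, y_max) for a freehand stroke.

    `stroke` is assumed to be a non-empty list of (x, y) points sampled
    from the pointing device; this representation is an assumption.
    """
    if not stroke:
        raise ValueError("stroke must contain at least one point")
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    # The rectangle is defined purely by the coordinate extremes.
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    circle_like = [(5, 9), (12, 3), (8, 14), (2, 7)]
    print(bounding_rect(circle_like))
```

Using the coordinate extremes slightly enlarges the designated area compared with the drawn circle, which is harmless here because the designation is only meant to be rough.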

Next, in step S6, information inside the user-designated region is extracted. One example is to binarize the image data inside the user-designated region and take projection histograms along both the X and Y axes (see Japanese Examined Patent Publication No. 7-95335). Consecutive near-zero values in the projection histogram of the user-designated region can be regarded roughly as inter-character and inter-line gaps, and in step S7 this information is used to perform region division. In this example (for a character region), where a run of zeros continues in the projection histogram, the vicinity can be judged to be a region boundary.
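As an illustration of the projection approach in steps S6 and S7, the sketch below computes the two histograms for a binary image and lists the stretches where a histogram stays at or near zero. It assumes the image is a list of rows with 1 for black pixels; `projection_histograms`, `zero_runs`, `min_len`, and `tol` are hypothetical names and parameters, and the cited publication may compute these differently.

```python
def projection_histograms(binary):
    """Return (column_counts, row_counts) of black pixels (value 1)."""
    row_counts = [sum(row) for row in binary]
    col_counts = [sum(col) for col in zip(*binary)]
    return col_counts, row_counts


def zero_runs(hist, min_len=1, tol=0):
    """Return [start, end) ranges where hist stays <= tol for >= min_len bins.

    Such ranges are candidate gaps (between characters or lines); a long
    one suggests a region boundary. Both thresholds are assumptions.
    """
    runs, start = [], None
    for i, value in enumerate(hist):
        if value <= tol:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(hist) - start >= min_len:
        runs.append((start, len(hist)))
    return runs


if __name__ == "__main__":
    img = [
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [1, 0, 0, 1],
    ]
    cols, rows = projection_histograms(img)
    print(cols, rows, zero_runs(rows, min_len=2), zero_runs(cols))
```

In practice `min_len` would be tuned so that ordinary inter-line spacing is ignored and only wider gaps are treated as region boundaries.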

Next, in step S8, the result of region division is displayed on the display 4. Here, the table 11a of the sample document 11 is extracted as a table region, and a frame 13 indicating the table region is displayed around it. If the extracted table region does not match the table region 11a of the sample document 11, the user corrects it with the pointing device 6 in step S9. The region division result, corrected as necessary, is output in step S10 for use by a character recognition device or the like.

Thus, with the region dividing apparatus of this embodiment, the user, while viewing the document image on the display 4, does not need to designate the entire region to be extracted accurately; roughly designating a part of it is enough for the whole region to be extracted automatically with high accuracy. Combined with the speed of the process, interactive correction is also possible.

[Second Embodiment]
FIG. 6 is a flowchart showing the operation of the region dividing apparatus according to the second embodiment of the present invention. In this figure, steps that are the same as or correspond to those in FIG. 2 are given the same reference signs as in FIG. 2. The schematic block diagram of the apparatus in this embodiment and in the third to fifth embodiments described later is the same as in the first embodiment (FIG. 1). Likewise, in this embodiment and in the third to fifth embodiments, the document read by the scanner 1 is the same sample document 11 as in the first embodiment, and the user-designated region is the same circle 12.

In this embodiment, before region extraction, the attribute of the user-designated region (character, table, or figure or photograph) is determined in step S11. Attribute determination can be handled with known techniques such as those in Japanese Patent Nos. 3344774 and 3215163, which are patents of the present applicant.

FIG. 7 is a flowchart showing an example of the attribute determination processing.
First, in step S21, the image data of the user-designated region is input, and then in step S22 the image data is binarized. If the image data of the user-designated region is already binary, this binarization is skipped.

Next, black-pixel connected components are extracted in step S23, and white-pixel connected components are extracted in step S24. Using the result of step S23, black-pixel ruled-line rectangles are extracted in step S25; using the result of step S24, white-pixel ruled-line rectangles are extracted in step S26. Black-pixel ruled-line rectangles are extracted using only long runs of connected black pixels, in the horizontal and vertical directions separately, and white-pixel ruled-line rectangles are extracted using only long runs of connected white pixels, again in each direction. Finally, in step S27, whether the region is a table is determined by a rule-based or feature-based method.
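The "long runs only" idea behind steps S25 and S26 can be sketched as below. This is a simplified illustration that finds long horizontal runs of a given pixel value and reuses the same routine on the transposed image for vertical runs; it does not merge runs into ruled-line rectangles as the patent's steps do, and the names `long_runs`, `transpose`, and `min_len` are assumptions.

```python
def long_runs(binary, value=1, min_len=3):
    """Return (row, x_start, x_end) for horizontal runs of `value`.

    Only runs at least `min_len` pixels long are kept; x_end is exclusive.
    Use value=1 for black runs and value=0 for white runs.
    """
    found = []
    for y, row in enumerate(binary):
        start = None
        # A trailing None sentinel closes any run that reaches the row end.
        for x, pixel in enumerate(list(row) + [None]):
            if pixel == value and start is None:
                start = x
            elif pixel != value and start is not None:
                if x - start >= min_len:
                    found.append((y, start, x))
                start = None
    return found


def transpose(binary):
    """Swap rows and columns so vertical runs can be found as horizontal ones."""
    return [list(col) for col in zip(*binary)]


if __name__ == "__main__":
    img = [
        [1, 1, 1, 1, 1],
        [1, 0, 1, 0, 1],
        [1, 1, 1, 1, 1],
    ]
    print(long_runs(img, value=1, min_len=4))             # horizontal black runs
    print(long_runs(transpose(img), value=1, min_len=3))  # vertical black runs
```

A rule-based check for step S27 could then count how many long black runs exist in each direction, but the specific rules are not given in the text and are left out here.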

FIGS. 8 to 10 are diagrams for explaining an example of rule-based determination.
As shown in FIG. 8, assume that part of a table consisting of four horizontal ruled lines 21 and four vertical ruled lines 22 lies within the user-designated region, and that of the nine cells (3 across by 3 down) bounded by those ruled lines, the cells in the top row contain the characters "XXX", "YYY", and "ZZZ".

When a table is drawn in black on a white background like this, the ruled lines are black pixels, so, as shown in FIG. 9, horizontal black-pixel ruled-line rectangles 23 and vertical black-pixel ruled-line rectangles 24 are detected in the user-designated region using only long runs of connected black pixels.

White-pixel ruled-line rectangles, on the other hand, can exist only between one ruled line and the next in each of the horizontal and vertical directions, so the white-pixel runs are not long and the rectangles tend to be thick. FIG. 10A shows the white-pixel ruled-line rectangles obtained from the table of FIG. 8 in step S26, inverted to black for clarity. No vertical ones are obtained in this case, because they are too thick to satisfy the limit on the rectangle aspect ratio. Also, because the cells in FIG. 8 contain characters, if a cell holds so many characters that the margins to their left and right are not long enough, for example when the margin lengths L1 and L2 to the left and right of the characters "XXX" in FIG. 10B are below the threshold for treating a white-pixel run as long (and likewise for "YYY" and "ZZZ"), the detected white-pixel ruled-line rectangles are as shown in FIG. 10C.

After the attribute is determined in step S11 in this way, a region range is created (region extraction processing) in step S12 according to the determined attribute.

According to this embodiment, by determining the attribute before region extraction and using the extraction method best suited to the determined attribute (character, table, or figure or photograph), an optimal extraction result can be obtained.

[Third Embodiment]
FIG. 11 is a flowchart showing the operation of the region dividing apparatus according to the third embodiment of the present invention. In this figure, steps that are the same as or correspond to those in FIG. 6 (second embodiment) are given the same reference signs as in FIG. 6.

In this embodiment, the internal characteristics of the user-designated region are classified in step S13, and a region range is created (region extraction processing) in step S14 according to the classification result. That is, the user-designated region is classified not according to the user's subsequent purpose, but strictly in a way that helps region division succeed, and the region division method inferred to be optimal from the internal information is selected.

This embodiment is effective, for example, when a region contains many long horizontal lines but is not a table region. FIG. 12 shows such an example, in which long horizontal lines 21 lie beneath the characters "○", "△", and "□". If the user encloses part of the region shown in this figure with a circle 22, then in the second embodiment the attribute determination means of step S11 may classify it as a "table region". If the region range creation method for table regions in step S13 is like that of the fourth embodiment described later, the output will be a range narrower than the region the user has in mind.

Considering that such cases can occur, there is also merit in having the attribute determination means output not the attribute that the user will use in later processing, but an attribute that is useful to the region extraction means. In an example such as FIG. 12, rather than outputting the attribute "table" merely because there are many long horizontal lines, outputting which of the subsequent region extraction methods is best suited gives a more user-friendly result. The true attribute can be determined again after the region has been fixed.

According to this embodiment, the optimal region extraction method is selected according to the classification of the internal characteristics of the user-designated region, so accurate region extraction is possible even in a case like FIG. 12.

[Fourth Embodiment]
FIG. 13 is a flowchart showing the operation of the region dividing apparatus according to the fourth embodiment of the present invention. In this figure, steps that are the same as or correspond to those in FIG. 6 (second embodiment) are given the same reference signs as in FIG. 6.

In this embodiment, the attribute of the user-designated region is determined in step S15, and when it is determined to be a table, the table region is extracted in step S16. Unlike step S11 of the second embodiment, step S15 determines only whether the region is a table and whether the background is white or black.

FIG. 14 shows a flowchart of a specific example of step S16.
In step S31, the document image data is read into the work area RAM 8, and in step S32 ruled lines in the foreground color are extracted from the entire document image. Any existing ruled-line extraction method may be used, such as one based on the Hough transform, or one that builds ruled-line candidates by finding connected components using only those runs of consecutive pixels (called runs) that are at least a threshold length.

Of the ruled-line rectangles thus obtained over the entire image, those included in the user-designated region are extracted in step S33. FIG. 15A shows an example of the result of step S33: the ruled lines 32 (two horizontal and two vertical, drawn as solid lines) included in the user-designated region 31 defined by a freehand curve have been extracted.

Next, in step S34, ruled lines that intersect or touch the ruled lines included in the user-designated region (those extracted in step S33) are extracted. FIG. 15B shows the result of extracting the ruled lines (broken lines 33 in FIG. 15A) that intersect or touch the ruled lines 32 extracted in step S33.

Finally, in step S35, the table region is determined from the maximum and minimum coordinates of the ruled lines extracted in step S34. As a result, the table region drawn with solid lines in FIG. 15B is extracted.
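Steps S33 to S35 can be sketched as follows, assuming that ruled lines have already been extracted as thin rectangles and that the user-designated region has been reduced to a rectangle. Two interpretations here are assumptions rather than statements from the patent: a ruled line counts as "included" if it overlaps the user rectangle, and two lines "intersect or touch" if their rectangles share at least one pixel. The names `overlaps` and `table_region` are hypothetical.

```python
def overlaps(a, b):
    """True if rectangles (x0, y0, x1, y1), inclusive coords, share a pixel."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def table_region(ruled_lines, user_rect):
    """Return the bounding box of the table hinted at by `user_rect`, or None.

    S33: keep ruled lines overlapping the user rectangle (assumed meaning
         of "included in the user-designated region").
    S34: add every ruled line that intersects one of those lines.
    S35: take the min/max coordinates of the collected lines.
    """
    seeds = [r for r in ruled_lines if overlaps(r, user_rect)]
    if not seeds:
        return None
    # Seeds overlap themselves, so they are retained in `linked`.
    linked = [r for r in ruled_lines if any(overlaps(r, s) for s in seeds)]
    return (
        min(r[0] for r in linked),
        min(r[1] for r in linked),
        max(r[2] for r in linked),
        max(r[3] for r in linked),
    )


if __name__ == "__main__":
    lines = [
        (0, 0, 30, 0), (0, 10, 30, 10), (0, 20, 30, 20),   # horizontal
        (0, 0, 0, 20), (15, 0, 15, 20), (30, 0, 30, 20),   # vertical
        (50, 50, 80, 50),                                  # unrelated line
    ]
    print(table_region(lines, (12, 8, 18, 12)))
```

A single pass of the S34 step is used, as in the description; a table whose outer frame is reachable only through several hops of intersecting lines would need the step repeated.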

Thus, according to this embodiment, a highly accurate extraction result is obtained when the user-designated region is a table region. In FIG. 14, ruled lines over the entire image are extracted first to keep the processing simple, but the search may instead proceed outward from the vicinity of the ruled lines already found; the choice of ruled-line extraction method does not greatly affect the result.

[Fifth Embodiment]
FIG. 16 is a diagram for explaining the operation of the region dividing apparatus according to the fifth embodiment of the present invention. The basic operation flow of this embodiment is the same as FIG. 6 (second embodiment); the embodiment concerns how the region range is obtained (corresponding to step S12) when the classification result (attribute determination in step S11) is a table.

Here, the connected components in the user-designated region are obtained, and those that touch the boundary of the user-designated range are examined. For a connected component touching the very edge of the designated region, the search range is widened, and if the pixels continue into the widened range, the connected component is grown (enlarged). Repeating this reveals all pixels connected to the pixels inside the designated range, and the region enclosing them is extracted.
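The growth of connected components beyond the designated range can be illustrated with a breadth-first flood fill. This sketch swaps in a plain flood fill for the incremental widening of the search range described above; the end result (all black pixels connected to those inside the designated rectangle) should be the same, but the traversal order differs. The function name `grow_region` and the use of 8-connectivity are assumptions.

```python
from collections import deque


def grow_region(binary, user_rect):
    """Return the bounding box of black pixels connected to the user rectangle.

    `binary` is a list of rows (1 = black). `user_rect` is
    (x0, y0, x1, y1) with inclusive coordinates. Returns None when the
    rectangle contains no black pixel.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    x0, y0, x1, y1 = user_rect
    # Seed with every black pixel inside the (clipped) user rectangle.
    queue = deque(
        (x, y)
        for y in range(max(y0, 0), min(y1, height - 1) + 1)
        for x in range(max(x0, 0), min(x1, width - 1) + 1)
        if binary[y][x]
    )
    seen = set(queue)
    while queue:
        x, y = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (0 <= nx < width and 0 <= ny < height
                        and binary[ny][nx] and (nx, ny) not in seen):
                    seen.add((nx, ny))
                    queue.append((nx, ny))
    if not seen:
        return None
    xs = [x for x, _ in seen]
    ys = [y for _, y in seen]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    img = [
        [1, 1, 1, 1, 1, 0, 0],
        [1, 0, 1, 0, 1, 0, 1],
        [1, 1, 1, 1, 1, 0, 1],
        [0, 0, 0, 0, 0, 0, 0],
    ]
    print(grow_region(img, (2, 1, 3, 2)))
```

The isolated stroke in the last column is not reached from the seed pixels, so it does not enlarge the extracted region.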

In the case of FIG. 16, the rectangular region 42 (shown with a broken line) enclosing the user-designated region 41 defined by a freehand curve is used, for simplicity, as the user-designated region. The black pixels within the region 42 (part of each of three horizontal ruled lines, part of one vertical ruled line, and eight circles) are treated as the pixels included in the user-designated region. The connected components containing these pixels are then traced, and the connected-component rectangle is finally extracted.

FIG. 17 shows an example of a rectangle extraction method using connected components of pixels. As shown in A of the figure, black runs (drawn as black rectangles) are extracted in the main scanning direction, and as shown in B, connected black runs are merged to create rectangular regions 51 to 53. Next, as shown in C, the rectangles are also grown in the sub-scanning direction to create rectangular regions 54 and 55. At this point, a rectangle whose runs are not connected (here, rectangular region 53) is not merged even if it overlaps another rectangle, and is treated separately.
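A run-based rectangle extraction in the spirit of FIG. 17 might look like the sketch below. It extracts black runs row by row, merges runs in adjacent rows that overlap horizontally (4-connectivity, an assumption), and reports one bounding rectangle per connected group. The union-find bookkeeping and the name `run_components` are illustrative choices, not the patent's procedure.

```python
def run_components(binary):
    """Return sorted bounding boxes (x0, y0, x1, y1) of connected black runs."""
    runs = []  # (y, x_start, x_end) with x_end exclusive, in scan order
    for y, row in enumerate(binary):
        start = None
        for x, pixel in enumerate(list(row) + [0]):  # 0 sentinel ends a run
            if pixel and start is None:
                start = x
            elif not pixel and start is not None:
                runs.append((y, start, x))
                start = None

    parent = list(range(len(runs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge runs on adjacent rows whose x-ranges overlap.
    for i, (yi, si, ei) in enumerate(runs):
        for j in range(i + 1, len(runs)):
            yj, sj, ej = runs[j]
            if yj > yi + 1:
                break
            if yj == yi + 1 and si < ej and sj < ei:
                parent[find(j)] = find(i)

    boxes = {}
    for i, (y, s, e) in enumerate(runs):
        root = find(i)
        x0, y0, x1, y1 = boxes.get(root, (s, y, e - 1, y))
        boxes[root] = (min(x0, s), min(y0, y), max(x1, e - 1), max(y1, y))
    return sorted(boxes.values())


if __name__ == "__main__":
    img = [
        [1, 0, 0, 0],
        [1, 0, 1, 0],
        [1, 0, 0, 0],
        [1, 1, 1, 1],
    ]
    print(run_components(img))
```

In the sample image the single pixel at (2, 1) lies inside the bounding rectangle of the L-shaped component, yet it is reported separately because its run is not connected to the others, which mirrors the handling of rectangular region 53.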

The description here simply assumes a black foreground on a white background, but multi-level images and lightness-inverted images may also need to be handled. In that case, a multi-level image is first binarized to normalize the conditions. Next, the foreground and background are discriminated using the processing flow shown in FIG. 7. The resulting foreground and background colors are then each assigned to white or black, and the processing of this embodiment described above is applied.

What the first to fourth embodiments above have in common is that the region the user intends can be extracted (cut out) quickly, and that an easily identified table (one with a simple structure) can be extracted even from a rough designation. For tables that are hard to identify automatically, a more accurate result can be obtained by making the designation closer to the region actually required. A further advantage is that the trade-off between table difficulty and automatic region extraction is easy for the user to learn and exploit.

FIG. 1 is a schematic block diagram showing a configuration example of the region dividing apparatus according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the region dividing apparatus according to the first embodiment of the present invention.
FIG. 3 is a diagram showing the sample document of the first embodiment of the present invention.
FIG. 4 is a diagram showing an example of how part of the table region of the sample document is designated in the first embodiment of the present invention.
FIG. 5 is a diagram showing the table region extracted from the sample document in the first embodiment of the present invention.
FIG. 6 is a flowchart showing the operation of the region dividing apparatus according to the second embodiment of the present invention.
FIG. 7 is a flowchart showing an example of the attribute determination processing in the second embodiment of the present invention.
FIG. 8 is a diagram showing an example of a table in the second embodiment of the present invention.
FIG. 9 is a diagram showing the black-pixel ruled-line rectangles extracted from the table of FIG. 8 by the attribute determination processing of FIG. 7.
FIG. 10 is a diagram showing the white-pixel ruled-line rectangles extracted from the table of FIG. 8 by the attribute determination processing of FIG. 7.
FIG. 11 is a flowchart showing the operation of the region dividing apparatus according to the third embodiment of the present invention.
FIG. 12 is a diagram showing a region for which the region dividing apparatus of the third embodiment of the present invention is well suited.
FIG. 13 is a flowchart showing the operation of the region dividing apparatus according to the fourth embodiment of the present invention.
FIG. 14 is a flowchart showing a specific example of the table extraction processing in FIG. 13.
FIG. 15 is a diagram showing the ruled lines and table extracted by the extraction processing of FIG. 14.
FIG. 16 is a diagram for explaining the operation of the region dividing apparatus according to the fifth embodiment of the present invention.
FIG. 17 is a diagram showing an example of a rectangle extraction method using connected components of pixels.

Explanation of symbols

1: scanner, 2: CPU, 4: display, 6: pointing device.

Claims

A region dividing method comprising: a step of displaying a document image; a step of acquiring position information of a user-designated region, which is designated by a user and is a part of a division target region of the displayed document image; and a step of extracting the division target region based on the document image data in the user-designated region,
wherein the extracting step comprises a feature extraction step of extracting information inside the user-designated region; an attribute classification step of classifying the user-designated region, based on the extracted information, as a character candidate, a table candidate, or a figure or photograph candidate; and a step of performing region extraction processing according to the classified candidate.

The region dividing method according to claim 1, wherein:
the feature extraction step includes a rectangle extraction step of extracting connected-component rectangles of black pixels or white pixels from a binary image, and a ruled-line extraction step of extracting ruled lines; and the attribute classification step obtains the attribute classification result according to which category the predetermined feature amounts based on these extraction results are closest to, or which category space they belong to.

The region dividing method according to claim 1, wherein:
the step of performing the region extraction processing, for a region classified as a table candidate in the attribute classification step, grows the connected components of the pixels included in the user-designated region up to the position where the connection is lost, and takes the largest resulting connected rectangle as the table region designated by the user.

A program for causing a computer to execute each step of the region dividing method according to any one of claims 1 to 3.

A region dividing apparatus having a computer on which the program according to claim 4 is installed.