JP5563390B2

JP5563390B2 - Image processing apparatus, control method therefor, and program

Info

Publication number: JP5563390B2
Application number: JP2010150264A
Authority: JP
Inventors: 誠榎本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2010-06-30
Filing date: 2010-06-30
Publication date: 2014-07-30
Anticipated expiration: 2030-06-30
Also published as: JP2012014430A

Description

The present invention relates to an image processing apparatus, a control method, and a program, and more particularly to preprocessing in document image processing.

In recent years, as networks have spread, documents are increasingly distributed electronically, and technology that scans paper documents into electronic documents for distribution has become widespread. However, some subjects (documents) are difficult to scan, such as posters on display, whiteboards used in meetings, and large sheets of paper. Techniques have therefore been developed for converting images taken with a camera into electronic documents. Because the positional relationship between the camera and the photographed document produces trapezoidal distortion in the resulting image, a technique for correcting the distortion is required.

For example, there is a technique that obtains edges from color differences, detects line segments of at least a certain length as the document frame, and corrects the distortion (see, for example, Patent Document 1). There is also a technique that, when a document placed on a pedestal is photographed, finds adjacent sides from the relative positions of detected line segment candidates on the captured image plane (see, for example, Patent Document 2).

Patent Document 1: JP 2003-058877 A
Patent Document 2: JP 2007-58634 A

When a rectangular subject such as a document or a whiteboard is photographed with a camera, it is difficult to position the camera exactly facing the subject, so the document in the captured image shows trapezoidal distortion caused by its three-dimensional tilt. To extract the document (whiteboard) from the captured image in an easily readable form, the document frame (the frame of the whiteboard) must therefore be extracted accurately. One method of extracting the document frame detects straight-line components using the Hough transform or a similar method and estimates the document frame from four straight lines. However, depending on the background of the document, many straight lines are extracted, the number of combinations that could form the document frame grows, and estimating the correct document frame becomes difficult.

In order to solve the above problem, the present invention has the following configuration. That is, an image processing apparatus that extracts, from image data obtained by photographing a subject having a rectangular area, a frame composed of the four sides of the rectangular area comprises: detection means for detecting a plurality of straight-line components from the input image data; calculation means for calculating, in the direction orthogonal to each straight-line component detected by the detection means, a gradient direction based on the level of pixel information; extraction means for selecting four sides from among the plurality of straight-line components detected by the detection means and extracting one or more frame candidates each consisting of the selected four sides; and narrowing means for excluding, from the frame candidates extracted by the extraction means, any frame candidate whose gradient directions on its four sides do not all point the same way with respect to the inside or the outside of the frame.

Document frame candidates can be narrowed down accurately even for a document image with a cluttered background in which many straight lines are detected.

FIG. 1 is a diagram showing an example of the input image acquisition environment of Embodiment 1.
FIG. 2 is a diagram showing an example of an input image in Embodiment 1.
FIG. 3 is a diagram showing a configuration example of Embodiment 1.
FIG. 4 is a flowchart of document area extraction in Embodiment 1.
FIG. 5 is a block diagram showing the operation of Embodiment 1.
FIG. 6 is a diagram showing an example of straight line detection processing in Embodiment 1.
FIG. 7 is a diagram showing an example of gradient direction calculation processing in Embodiment 1.
FIG. 8 is a diagram showing an example of opposite-side candidate creation processing in Embodiment 1.
FIG. 9 is a diagram showing an example of document frame coordinate calculation processing in Embodiment 1.
FIG. 10 is a diagram showing an example of a document frame candidate extraction result in Embodiment 1.
FIG. 11 is a diagram showing an example of a document frame correction result in Embodiment 1.
FIG. 12 is a flowchart of document area extraction in Embodiment 2.
FIG. 13 is a diagram showing an example of a vertical line / horizontal line discrimination result in Embodiment 2.

<Embodiment 1>
Hereinafter, a preferred embodiment of the present invention will be described in detail with reference to the drawings. FIG. 1 shows the environment in which an image input to the image processing apparatus according to the embodiment is acquired. The document medium 101 is a rectangular whiteboard, poster, paper document, or the like to be photographed; in the present invention these are collectively called a document. The photographing apparatus 102 is a device such as a digital camera that photographs the document medium 101. An image captured by the photographing apparatus 102 is the image to be processed (the input image). The display unit 103 is provided in the photographing apparatus 102 and displays the subject and other information. The operation unit 104 is provided in the photographing apparatus 102 and is used when the user operates the photographing apparatus 102.

FIG. 2 shows an example of the input image. The image 200 is displayed, for example, on the display unit 103. The image 200 captured by the photographing apparatus 102 consists of a document area 210 (the area showing the photographed document) and a background area 220, which is everything other than the document area. The document area 210 contains a character area 211. When the camera does not directly face the subject (that is, when the subject is photographed from an oblique direction), the document area 210 is tilted, which means linear distortion occurs. Here, the document area 210 is actually rectangular but appears as a trapezoid in the image because of the linear distortion.

[System configuration]
FIG. 3 shows a configuration example of an image processing apparatus 300 that implements the present invention. The image processing apparatus 300 comprises an image input unit 301 that inputs captured image data; a CPU 302 that executes and controls an image processing program applying the processing of the present invention to the image data; a RAM 303 used as work memory and for temporary storage of data while the program runs; and a storage unit 304 that stores the program and data.

The configuration of the image processing apparatus 300 shown here is only an example, and the apparatus may include components other than those shown. The image processing may also be executed on an external general-purpose computer or the like, or on an electronic circuit in a device such as the photographing apparatus 102.

FIG. 5 is a block diagram illustrating this embodiment as a whole. A captured image 501 is an image captured by the photographing apparatus 102. The image determination unit 502 determines whether the captured image 501 contains a document area. The document area extraction unit 503 extracts the frame of the document area from a captured image 501 that contains a document area. This frame is referred to here as the "document frame". The distortion correction unit 504 applies an inverse perspective transformation to the document area, using as parameters the coordinates on the captured image 501 of the document frame obtained by the document area extraction unit 503, and corrects the area to a rectangular shape. The electronic document generation unit 505 generates, from the image corrected by the distortion correction unit 504, an electronic document 506 that can be handled by external programs. The electronic document 506 is the electronic document generated by the electronic document generation unit 505.

[Process flow]
The processing is described below using the image 200 shown in FIG. 2 as an example of the captured image 501. In this embodiment, this processing is performed by the CPU 302 of the image processing apparatus 300 reading and executing a program stored in the RAM 303, the storage unit 304, or the like. The extracted straight line information and gradient information are held in storage means such as the RAM 303 or the storage unit 304.

The image determination unit 502 determines whether the captured image 501 contains a document. Characters can be extracted from image data using, for example, the method of Japanese Patent Application Laid-Open No. 2002-042055. As a result of this processing, the character area 211 is obtained from the image 200, and the image is determined to contain a document image. Alternatively, the user may switch the image type through a user interface.

The detailed processing of the document area extraction unit 503 will be described with reference to the flowchart of FIG. 4. In S401, the document area extraction unit 503 detects straight-line components from the image 200. Straight-line components can be detected with known techniques, for example as follows. First, an edge enhancement method using a Sobel filter, a Laplacian filter, or the like emphasizes the pixels that correspond to the boundary of the document in the image 200. Straight lines can then be detected by applying a known line extraction method, such as the Hough transform or a minimum approximation method, to the edge-enhanced image. For each detected straight line, for example, the coordinates of its two end points in the image data are held, and the direction of the line is obtained by calculating the vector between them. When obtaining the vector of a straight line in an image where, for example, the x coordinate increases from left to right and the y coordinate increases from top to bottom, the end point with the smaller x coordinate may be taken as the start point and the other end point as the end point. If both end points have the same x coordinate, the end point with the smaller y coordinate is taken as the start point.
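The end-point ordering rule described for S401 can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function names `orient_segment` and `direction_vector` are hypothetical, and points are assumed to be `(x, y)` tuples in image coordinates.

```python
from typing import Tuple

Point = Tuple[float, float]


def orient_segment(p: Point, q: Point) -> Tuple[Point, Point]:
    """Order two end points as (start, end).

    The start point is the one with the smaller x coordinate; when the x
    coordinates are equal, the one with the smaller y coordinate.
    """
    # Tuples compare lexicographically, so x is compared first, then y.
    return (p, q) if p <= q else (q, p)


def direction_vector(p: Point, q: Point) -> Point:
    """Vector from the start point to the end point of the oriented segment."""
    (sx, sy), (ex, ey) = orient_segment(p, q)
    return (ex - sx, ey - sy)
```

Fixing the orientation this way means the same segment always yields the same vector regardless of the order in which the line detector reports its end points, which the later gradient step relies on.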

FIG. 6 shows the result of applying the processing for extracting straight-line components. An image 600 in FIG. 6(a) is an example of applying edge enhancement to the image 200 shown in FIG. 2. An image 610 in FIG. 6(b) shows the straight-line components obtained by applying the Hough transform to the image 600. This line extraction yields a total of eight line segments, from line segment 611 to line segment 618.

In S402, the document area extraction unit 503 calculates, for every straight-line component obtained in S401, the gradient direction from the change in pixel information in the direction orthogonal to the line. For example, in the method shown in FIG. 7(a), a luminance histogram is acquired by scanning a region 700 containing the extracted straight line in the direction orthogonal to the line. Scanning in the orthogonal direction here means, for example, reading out the pixel values in order along the orthogonal direction and extracting those values. The direction of the luminance gradient is then obtained from the slope of the histogram at the intersection of the straight line and the orthogonal line. The luminance gradient information obtained as a by-product when the edge image is generated in S401 may be used instead.

Specifically, as shown in FIG. 7(a), scanning from left to right in the direction orthogonal to the straight line shows that the luminance is higher on the right side of the line than on the left side. In this case, the gradient is taken to point toward the side with the higher value relative to the line, and rightward gradient information is acquired. In detail, a vector orthogonal to the vector of the straight line is obtained. The pixel values (luminance information in this embodiment) are then scanned along this orthogonal vector to obtain a histogram. If the upper end of the straight line in FIG. 7(a) is the start point and the lower end is the end point, the vector of the line points downward. The direction in which the pixel values are scanned is determined uniquely from this direction. The gradient information (luminance gradient) of the line of interest is obtained from the histogram and the vector orthogonal to the line of interest. The previously extracted straight line information is then associated with the gradient information and held in the storage means. The data structure used to represent the gradient information is not particularly limited; for example, a flag of "1" may be assigned if the pixel values are higher on the right side when facing along the vector of the line, and "0" if they are higher on the left side.
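One way to derive such a flag is sketched below. It is a simplification under stated assumptions rather than the method of FIG. 7(a): instead of building a histogram along the orthogonal scan, it compares the mean luminance sampled a few pixels to either side of the line. The name `gradient_flag` and the parameters `offset` and `samples` are hypothetical. "Right" is assumed to mean the side of the normal `(-dy, dx)`, that is, the right-hand side of an observer moving along the vector in image coordinates with y increasing downward. The image is assumed to be a list of rows of luminance values.

```python
import math


def gradient_flag(image, start, end, offset=2, samples=5):
    """Return 1 if luminance is higher on the right of the vector, else 0.

    image   : list of rows, image[y][x] is a luminance value
    start   : (x, y) start point of the oriented line segment
    end     : (x, y) end point of the oriented line segment
    offset  : distance in pixels from the line at which to sample (assumed)
    samples : number of sample positions along the segment (assumed)
    """
    (x1, y1), (x2, y2) = start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("segment has zero length")
    # Unit normal pointing to the right-hand side of the direction vector
    # (image coordinates, y grows downward).
    nx, ny = -dy / length, dx / length
    height, width = len(image), len(image[0])
    right_sum = left_sum = 0.0
    used = 0
    for i in range(samples):
        t = (i + 0.5) / samples
        cx, cy = x1 + t * dx, y1 + t * dy
        rx, ry = round(cx + offset * nx), round(cy + offset * ny)
        lx, ly = round(cx - offset * nx), round(cy - offset * ny)
        # Use a sample only when both sides fall inside the image.
        if 0 <= rx < width and 0 <= ry < height and 0 <= lx < width and 0 <= ly < height:
            right_sum += image[ry][rx]
            left_sum += image[ly][lx]
            used += 1
    if used == 0:
        raise ValueError("no sample positions inside the image")
    # Equal sums fall through to 0 in this sketch.
    return 1 if right_sum > left_sum else 0
```

A real implementation would probably also record how strong the difference is, so that lines with almost no contrast can be treated separately; the text itself only requires the direction.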

An image 710 in FIG. 7(b) is the result of adding gradient information to the line segments shown in the image 600. The arrow attached to each line segment indicates the direction in which the luminance increases. For example, line segment 611 in FIG. 7(b) is shown with an upward arrow. This indicates that the luminance of the image increases from bottom to top across the position of line segment 611, that is, the area above line segment 611 is brighter. Although luminance information is used to calculate the gradient direction in this embodiment, the invention is not limited to this. Other information may be used as long as a change can be calculated from the pixel values in the vicinity of the straight line, taking the line as the reference.

In S403, the document area extraction unit 503 selects one of the line segments obtained in S401 as the line segment to be processed. Here, line segment 611 is assumed to be selected. In S404, the document area extraction unit 503 determines, for every unprocessed line segment, whether it forms an opposite side to the line segment selected in S403, based on the gradient information calculated in S402. All line segments determined to form an opposite side to the line segment being processed are taken as opposite-side candidates, and an opposite-side candidate list is created.

The method of determining opposite sides will be described using line segments 801 and 802 in FIG. 8(a). First, the "inside" is determined from the coordinates of line segments 801 and 802. A quadrangle 803 is drawn whose four vertices are the two end points of line segment 801 and the two end points of line segment 802, and the direction toward the interior of this quadrangle is defined as the "inside". Next, if the gradient directions in the gradient information of line segments 801 and 802 both point inward or both point outward, the pair is determined to be an opposite-side candidate. In other words, the gradient information of two line segments that form opposite sides has values in opposite directions. For example, the arrows 804 and 805 indicating the direction of increasing luminance both point inward, so line segments 801 and 802 are determined to be an opposite-side candidate. For convenience, an opposite-side candidate whose luminance increases toward the inside formed by the two line segments is called a "mountain-type opposite side", and one whose luminance decreases toward the inside is called a "valley-type opposite side". For example, the opposite side in the image 800 is a mountain-type opposite side, and the opposite side in the image 810 is a valley-type opposite side.
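The opposite-side test can be sketched as follows. This is an illustrative sketch under assumptions, not the patent's implementation: the gradient of each segment is assumed to be given as a vector pointing toward higher luminance, and the "inside" of the quadrangle is approximated by the direction from a segment's midpoint toward the centroid of the four end points. The name `classify_pair` is hypothetical.

```python
def classify_pair(seg_a, grad_a, seg_b, grad_b):
    """Classify two segments as 'mountain', 'valley', or None.

    seg_a, seg_b   : ((x1, y1), (x2, y2)) end points of each segment
    grad_a, grad_b : (gx, gy) vectors pointing toward higher luminance
    """
    points = [seg_a[0], seg_a[1], seg_b[0], seg_b[1]]
    # Centroid of the quadrangle spanned by the four end points.
    cx = sum(p[0] for p in points) / 4.0
    cy = sum(p[1] for p in points) / 4.0

    def points_inward(seg, grad):
        mx = (seg[0][0] + seg[1][0]) / 2.0
        my = (seg[0][1] + seg[1][1]) / 2.0
        # A positive dot product means the gradient heads toward the centroid.
        return grad[0] * (cx - mx) + grad[1] * (cy - my) > 0

    a_in = points_inward(seg_a, grad_a)
    b_in = points_inward(seg_b, grad_b)
    if a_in and b_in:
        return "mountain"  # brighter toward the inside on both sides
    if not a_in and not b_in:
        return "valley"    # darker toward the inside on both sides
    return None            # mixed directions: not an opposite-side candidate
```

For example, a top edge with a downward gradient and a bottom edge with an upward gradient are classified as a mountain-type pair, because both gradients point into the region between them.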

An image 820 in FIG. 8(b) shows, for the line segment 611 being processed, the straight lines among the unprocessed line segments (that is, all lines other than line segment 611) that can form an opposite side. Line segments 612, 614, 616, and 618 were obtained as opposite-side candidates, all as valley-type opposite sides. Using the naming convention opposite-side type (line segment a, line segment b), these are written as valley-type opposite side (611, 612), valley-type opposite side (611, 614), valley-type opposite side (611, 616), and valley-type opposite side (611, 618). In the actual data structure, the information of the straight lines is associated with each other and held in the storage means. Alternatively, a table may be created to hold the information relating the corresponding straight lines.

In S405, the document area extraction unit 503 determines whether the processing of S404 has been performed on all the line segments obtained in S401. If not, the process returns to S403 and the other line segments are processed. If all line segments have been processed, the process proceeds to S406. That is, an opposite-side candidate list is created in S404 in the same way for the remaining unprocessed line segments 612 to 618. This processing yields an opposite-side candidate list with 15 entries: valley-type opposite side (611, 612), valley-type opposite side (611, 614), valley-type opposite side (611, 616), valley-type opposite side (611, 618), mountain-type opposite side (612, 613), mountain-type opposite side (612, 615), mountain-type opposite side (612, 617), valley-type opposite side (613, 614), mountain-type opposite side (613, 615), mountain-type opposite side (613, 617), valley-type opposite side (614, 616), valley-type opposite side (614, 618), valley-type opposite side (615, 616), mountain-type opposite side (615, 617), and valley-type opposite side (616, 618). After all the straight lines have been processed, the process proceeds to S406.

In S406, the document area extraction unit 503 selects one of the opposite-side candidates obtained in S404 as the candidate to be processed. Here, the valley-type opposite side (611, 612) is assumed to be selected.

In S407, the document area extraction unit 503 creates a document frame candidate list by combining the opposite-side candidate being processed with each unprocessed opposite-side candidate of the same mountain or valley type in the opposite-side candidate list obtained in S404. Opposite-side candidates that share a side with the candidate being processed cannot form a document frame and are therefore excluded. A document frame candidate composed of two valley-type opposite sides is called a "valley frame", and a document frame candidate composed of two mountain-type opposite sides is called a "mountain frame". In each of these document frames, the gradient directions of the four sides point the same way with respect to the inside and outside of the frame. That is, in a mountain frame the gradient directions of all four sides point toward the inside of the document frame, and in a valley frame they all point toward the outside.

For the valley-type opposite side (611, 612) being processed, the opposite-side candidates that are of the same valley type and contain neither line segment 611 nor line segment 612, namely the valley-type opposite sides (613, 614), (614, 616), (614, 618), (615, 616), and (616, 618), are obtained as valley frame candidates.

Using the naming convention frame type (line segment a, line segment a', line segment b, line segment b'), these are written as valley frame (611, 612, 613, 614), valley frame (611, 612, 614, 616), valley frame (611, 612, 614, 618), valley frame (611, 612, 615, 616), and valley frame (611, 612, 616, 618).
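The pairing rule of S407 (same type, no shared line segment) can be sketched as below. The tuple layout `(kind, a, b)` and the name `frame_candidates` are assumptions made for illustration; the text only says that the line information is associated and held in storage means or a table.

```python
from itertools import combinations


def frame_candidates(pairs):
    """Combine opposite-side candidates into document frame candidates.

    pairs : list of (kind, a, b), where kind is 'valley' or 'mountain' and
            a, b are identifiers of the two line segments of the pair.
    Returns a list of (kind, a, a2, b, b2) tuples.
    """
    frames = []
    for (kind1, a1, a2), (kind2, b1, b2) in combinations(pairs, 2):
        if kind1 != kind2:
            continue  # a frame needs two pairs of the same type
        if {a1, a2} & {b1, b2}:
            continue  # pairs sharing a line segment cannot form a frame
        frames.append((kind1, a1, a2, b1, b2))
    return frames
```

Applied to the 15 opposite-side candidates listed for S405, the frames that start from the pair (611, 612) are exactly the five valley frames named above.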

In S408, the document area extraction unit 503 determines whether the processing of S407 has been performed on all the opposite-side candidates obtained in S404. If not all opposite-side candidates have been processed, the process returns to S406 and the processing is applied to an unprocessed opposite-side candidate. If all opposite-side candidates have been processed, the process proceeds to S409.

In this way, a document frame candidate list is created in S407 for the remaining unprocessed opposite sides as well. This processing yields a document frame candidate list with 15 entries: valley frame (611, 612, 613, 614), valley frame (611, 612, 614, 616), valley frame (611, 612, 614, 618), valley frame (611, 612, 615, 616), valley frame (611, 612, 616, 618), valley frame (611, 614, 615, 616), valley frame (611, 614, 616, 618), valley frame (611, 616, 613, 614), valley frame (611, 618, 613, 614), valley frame (611, 618, 614, 616), valley frame (613, 614, 615, 616), valley frame (614, 618, 615, 616), mountain frame (612, 613, 615, 617), mountain frame (612, 615, 613, 617), and mountain frame (612, 617, 613, 615). After all have been processed, the process proceeds to S409.

In S409, the quadrangle that actually serves as the document frame is calculated from the four line segments of each document frame candidate. Because of lens distortion, noise, and similar effects, the line segments obtained by line detection usually do not coincide exactly with the actual sides of the frame. The line segments are therefore extended, the intersection of each pair of adjacent line segments is calculated, and these intersections are taken as the vertices.

As an example, consider the following four line segments. Here, x and y denote the x coordinate and the y coordinate of an end point of a line segment.

Line segment A: (Ax1, Ay1)-(Ax2, Ay2)
Line segment a: (ax1, ay1)-(ax2, ay2) (the side opposite line segment A)
Line segment B: (Bx1, By1)-(Bx2, By2)
Line segment b: (bx1, by1)-(bx2, by2) (the side opposite line segment B)
The vertices of the quadrangle formed by these four line segments are found by computing the intersections of the adjacent sides: A and B, A and b, a and B, and a and b. Formula 900 in FIG. 9(a) gives the coordinates (ABx, ABy) of the intersection of line segment A (Ax1, Ay1)-(Ax2, Ay2) and line segment B (Bx1, By1)-(Bx2, By2).

Here, if
(-Ay1 + Ay2) * (Bx1 - Bx2) - (Ax1 - Ax2) * (-By1 + By2) = 0
holds, the adjacent sides are parallel and no solution exists, so the candidate is excluded from the document frame candidates.
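The vertex calculation and the parallel check can be sketched as follows. Formula 900 itself appears only in FIG. 9(a) and is not reproduced in this text, so the sketch uses the standard intersection formula for two infinite lines, each given by two points; its denominator is written in the same form as the condition above. The name `intersect` and the tolerance `eps` are assumptions.

```python
def intersect(seg_a, seg_b, eps=1e-9):
    """Intersection of the infinite lines through two segments.

    seg_a, seg_b : ((x1, y1), (x2, y2))
    Returns (x, y), or None when the lines are parallel.
    """
    (ax1, ay1), (ax2, ay2) = seg_a
    (bx1, by1), (bx2, by2) = seg_b
    # Same expression as the parallel condition in the text.
    den = (-ay1 + ay2) * (bx1 - bx2) - (ax1 - ax2) * (-by1 + by2)
    if abs(den) < eps:
        return None  # parallel adjacent sides: drop this frame candidate
    cross_a = ax1 * ay2 - ay1 * ax2
    cross_b = bx1 * by2 - by1 * bx2
    x = (cross_a * (bx1 - bx2) - (ax1 - ax2) * cross_b) / den
    y = (cross_a * (by1 - by2) - (ay1 - ay2) * cross_b) / den
    return (x, y)
```

Calling `intersect` for the four adjacent pairs (A, B), (A, b), (a, B), and (a, b) gives the four vertices; a `None` result for any pair means the candidate has no valid quadrangle.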

A candidate is also excluded from the list when the line segments do not lie on the sides of the quadrangle, as with the quadrangle in the image 910 formed by the four vertices 915, 916, 917, and 918 calculated from line segments 911, 912, 913, and 914. Of the 15 document frame candidates created through S408, 13 candidates are excluded because their four sides do not form a document frame: valley frame (611, 612, 613, 614), valley frame (611, 612, 614, 616), valley frame (611, 612, 614, 618), valley frame (611, 612, 615, 616), valley frame (611, 612, 616, 618), valley frame (611, 614, 615, 616), valley frame (611, 614, 616, 618), valley frame (611, 616, 613, 614), valley frame (611, 618, 613, 614), valley frame (613, 614, 615, 616), valley frame (614, 618, 615, 616), mountain frame (612, 613, 615, 617), and mountain frame (612, 615, 613, 617). The candidates are finally narrowed down to the two frames shown in the image 1000 of FIG. 10(a), the valley frame (611, 614, 616, 618) 1001 and the mountain frame (612, 617, 613, 615) 1002, and all the processing in the flowchart of FIG. 4 is complete. Although this flowchart processes opposite sides pair by pair for ease of explanation, four line segments may instead be combined exhaustively from the start and the document frame candidates determined from the gradient information.

Image 1010 shown in FIG. 10B is an example in which the result of narrowing down the document frame candidates is displayed on the display unit 103. The original image display unit 1011 overlays the document frame candidates on the captured image. The candidate thumbnail unit 1012 displays, as thumbnails, the result of distortion correction for each document frame candidate. By referring to the image displayed on the display unit 103 and performing a selection operation on the operation unit 104, the user can determine the document frame candidate to be used for correction. Here, it is assumed that the user selects mountain frame (612, 617, 613, 615) 1002 as the document frame used for correction. Note that the configuration of image 1010 displayed on the display unit 103 is only an example: the selection may be made on a personal computer, or distortion correction may be applied to every one of the one or more document frame candidates and each result converted into an electronic document.

The distortion correction unit 504 corrects distortion using the vertex information of the document frame obtained by the document area extraction unit 503. Distortion correction here means the so-called inverse perspective transformation: an operation that restores the original rectangular area from the irregular quadrilateral that results when a rectangular area is projected onto a two-dimensional plane at a three-dimensional angle. The parameters of the transformation matrix can be obtained, as disclosed for example in Japanese Patent Laid-Open No. 2003-288588, by substituting the coordinates of the four vertices into the inverse perspective transformation equations and solving the resulting simultaneous equations. Any other method may be used as long as it is applicable to the present invention.
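As an illustration of solving simultaneous equations for the transformation parameters from four vertices, the following sketch uses the common eight-parameter projective form u = (ax + by + c) / (gx + hy + 1), v = (dx + ey + f) / (gx + hy + 1). This parameterization, the function names, and the sample coordinates are assumptions made here; the formulation in the cited publication may differ.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate vertex configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_params(src, dst):
    """Return (a, b, c, d, e, f, g, h) mapping each src vertex to its dst vertex."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per vertex pair, eight equations in total.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def apply_params(p, x, y):
    """Map one point of the distorted image into the corrected rectangle."""
    a, b, c, d, e, f, g, h = p
    w = g * x + h * y + 1
    return ((a * x + b * y + c) / w, (d * x + e * y + f) / w)


# Hypothetical frame vertices in the captured image and the target rectangle.
quad = [(10, 20), (200, 40), (220, 180), (5, 150)]
rect = [(0, 0), (100, 0), (100, 50), (0, 50)]
params = perspective_params(quad, rect)
```

A complete correction would additionally resample every pixel of the output rectangle, typically by using the inverse mapping with interpolation; that step is omitted here.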

The electronic document generation unit 505 generates and outputs the corrected image as an electronic document 506. Here, the electronic document 506 is output in JPEG format. JPEG output is only an example: the image may be converted into whatever format the target electronic device can handle, or into a reusable electronic document such as a word processing document or a presentation document.

Image 1100 shown in FIG. 11A is an example of the electronic document 506 output as a result of applying the first embodiment. The portion that was distorted into a trapezoid by the tilt relative to the subject is converted into an electronic document as a rectangle viewed head-on.

Image 1110 in FIG. 11B is an example of the case where gradient information is not used. Superfluous document frames, such as document frame 1111 consisting of line segments 612, 614, 615, and 618, are also detected. In this case, the document frame must be selected from a total of 16 candidates.

As described above, applying the present invention reduces the number of document frame candidates, compared with the case where it is not applied, while retaining the suitable candidates, and thereby reduces the load on subsequent processing.

<Embodiment 2>
In the first embodiment, opposite-side candidates were searched for simply by brute force over all straight lines. In an actual document image, however, the load increases when many straight lines are detected in the background area and elsewhere. The narrowing-down process can therefore be made faster by additionally classifying the line segments into vertical and horizontal ones.

FIG. 12 is a flowchart of the document area extraction process performed by the document area extraction unit 503 in this embodiment. The process is described below using image 200 shown in FIG. 2 as an example of the input captured image 501. This processing flow is realized, for example, by the CPU 302 reading and executing programs and data stored in the RAM 303 or the storage unit 304. In S1201, the document area extraction unit 503 detects straight line components. The detailed processing is the same as in S401 and is omitted. Image 610 shown in FIG. 6B is the result of the straight line detection. Here, eight straight lines, line segments 611 to 618, are detected. In S1202, the document area extraction unit 503 calculates the gradient direction of each straight line. The detailed processing is the same as in S402 and is omitted. Image 710 shown in FIG. 7B is the result of adding the gradient information.

In S1203, the document area extraction unit 503 determines whether each line segment is a vertical line or a horizontal line, based on the angle of the line segment relative to the horizontal on the image plane. Here, a line segment is judged to be a horizontal line if its angle relative to the horizontal is at least 0 degrees and less than 45 degrees, or at least 135 degrees and less than 180 degrees, and a vertical line if the angle is at least 45 degrees and less than 135 degrees. The criteria for distinguishing vertical and horizontal lines are not limited to these values and may be changed as necessary. Other methods may also be used as long as they are applicable to the present invention. Image 1300 shown in FIG. 13 is the result of the determination. Here, vertical lines are drawn as solid lines and horizontal lines as broken lines.
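The angle thresholds above can be sketched as follows. The function name `classify` and the segment representation as a pair of endpoints are assumptions for illustration; the angle is folded into the range from 0 to 180 degrees so that the direction in which a segment is traversed does not matter.

```python
import math


def classify(segment):
    """Classify a segment ((x1, y1), (x2, y2)) as 'horizontal' or 'vertical'."""
    (x1, y1), (x2, y2) = segment
    # Angle relative to the horizontal, folded into [0, 180).
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    # 45 <= angle < 135 is vertical; everything else is horizontal.
    return "vertical" if 45.0 <= angle < 135.0 else "horizontal"


# Hypothetical segments (coordinates are illustrative only).
nearly_flat = ((0, 0), (10, 1))
nearly_upright = ((0, 0), (1, 10))
flat_reversed = ((0, 0), (-10, 1))
upright_reversed = ((0, 0), (1, -10))
```

Changing the two threshold constants is enough to apply a different criterion, which the description explicitly permits.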

In S1204, the document area extraction unit 503 selects a line segment to be processed. Here, it is assumed that line segment 611 is selected. In S1205, the document area extraction unit 503 branches depending on whether the line segment being processed is a vertical line or a horizontal line: the process proceeds to S1206 for a vertical line and to S1207 for a horizontal line. Since line segment 611 is a horizontal line, the process proceeds to S1207.

In S1207, the document area extraction unit 503 creates upper/lower opposite-side candidates from the unprocessed horizontal lines. Here, line segments 612, 617, and 618 are the unprocessed horizontal lines. The determination of opposite-side candidates itself is the same as in S404 and is omitted. As a result, line segments 612 and 618 are obtained as opposite-side candidates, both forming valley-type opposite sides in the horizontal direction. These are denoted upper/lower valley-type opposite sides (611, 612) and upper/lower valley-type opposite sides (611, 618), respectively. Upper/lower valley-type and upper/lower mountain-type opposite sides are "upper/lower opposite sides", while left/right valley-type and left/right mountain-type opposite sides are "left/right opposite sides".

In accordance with the end determination in S1208, the processing is repeated for line segments 612 to 618. Assume that line segment 613 (a vertical line) is selected in S1204. It is determined to be a vertical line in S1205, and the process proceeds to S1206. In S1206, left/right opposite-side candidates are created from the unprocessed vertical lines, line segments 614, 615, and 616. As a result, line segment 614 is obtained as a valley-type opposite-side candidate and line segment 615 as a mountain-type opposite-side candidate. These are denoted left/right valley-type opposite sides (613, 614) and left/right mountain-type opposite sides (613, 615), respectively.

S1204 to S1208 are repeated in the same way for the remaining line segments. As a result, a list of eight opposite-side candidates is created: upper/lower valley-type opposite sides (611, 612), (611, 618), and (617, 618); upper/lower mountain-type opposite sides (612, 617); left/right valley-type opposite sides (613, 614), (614, 616), and (615, 616); and left/right mountain-type opposite sides (613, 615). After all line segments have been processed, the process proceeds to S1209.

In S1209, the document area extraction unit 503 selects upper/lower valley-type opposite sides (611, 612) as the opposite sides to be processed, and the process proceeds to S1210. In S1210, the document area extraction unit 503 branches depending on whether the selected opposite sides are upper/lower or left/right opposite sides. Since upper/lower valley-type opposite sides (611, 612) are "upper/lower opposite sides", the process proceeds to S1211.

In S1211, the document area extraction unit 503 creates document frame candidates from the unprocessed left/right opposite sides. Here, left/right valley-type opposite sides (613, 614), (614, 616), and (615, 616) and left/right mountain-type opposite sides (613, 615) are the targets. The detailed processing is the same as in S407 and is omitted. As a result, left/right valley-type opposite sides (613, 614), (614, 616), and (615, 616) yield document frame candidates. These are denoted valley frame candidate (611, 612, 613, 614), valley frame candidate (611, 612, 614, 616), and valley frame candidate (611, 612, 615, 616), respectively.

In accordance with the end determination in S1213, the unprocessed opposite sides are processed. Assume that left/right valley-type opposite sides (613, 614) are selected in S1209. They are determined to be left/right opposite sides in S1210, and the process proceeds to S1212. In S1212, document frame candidates are created from the unprocessed upper/lower opposite sides. Here, upper/lower valley-type opposite sides (611, 618) and (617, 618) and upper/lower mountain-type opposite sides (612, 617) are the targets. As a result, valley frame candidate (611, 618, 613, 614) and valley frame candidate (617, 618, 613, 614) are obtained.

S1209 to S1213 are repeated in the same way for the remaining opposite sides. As a result, a list of ten document frame candidates is created: valley frame candidates (611, 612, 613, 614), (611, 612, 614, 616), (611, 612, 615, 616), (611, 618, 613, 614), (611, 618, 614, 616), (611, 618, 615, 616), (617, 618, 613, 614), (617, 618, 614, 616), and (617, 618, 615, 616), and mountain frame candidate (612, 617, 613, 615). After all opposite sides have been processed, the process proceeds to S1214.
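The combination of upper/lower and left/right opposite sides into frame candidates can be sketched as follows, using the eight opposite-side candidates listed for this embodiment. The rule applied here, that an upper/lower pair is combined only with a left/right pair of the same type (valley with valley, mountain with mountain), is an assumption inferred from the results listed above; the actual determination follows S407, whose details are not reproduced in this section. The function and variable names are also illustrative.

```python
from itertools import product

# Opposite-side candidates from the walkthrough: (type, (segment, segment)).
UPPER_LOWER = [
    ("valley", (611, 612)),
    ("valley", (611, 618)),
    ("valley", (617, 618)),
    ("mountain", (612, 617)),
]
LEFT_RIGHT = [
    ("valley", (613, 614)),
    ("valley", (614, 616)),
    ("valley", (615, 616)),
    ("mountain", (613, 615)),
]


def frame_candidates(upper_lower, left_right):
    """Pair every upper/lower candidate with every left/right one of the same type."""
    return [
        (kind_ud, ud + lr)
        for (kind_ud, ud), (kind_lr, lr) in product(upper_lower, left_right)
        if kind_ud == kind_lr  # assumed rule: gradient types must agree
    ]


candidates = frame_candidates(UPPER_LOWER, LEFT_RIGHT)
```

Under this assumption the sketch produces nine valley frame candidates and one mountain frame candidate, matching the ten candidates listed above.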

In S1214, the document area extraction unit 503 calculates the vertices of the document frame from the four line segments. The detailed processing is the same as in S409 and is omitted. As a result, the list is finally narrowed down to two frame candidates, valley frame (611, 614, 616, 618) 1001 and mountain frame (612, 617, 613, 615) 1002, and all processing in the flowchart of FIG. 12 is complete. The processing of the distortion correction unit 504 and the electronic document generation unit 505 is the same as in the first embodiment and is omitted.

As described above, compared with the first embodiment, the number of opposite-side candidates is reduced from 15 to 8 and the number of document frame candidates from 15 to 10. When creating opposite-side candidates, the first embodiment searches all 28 combinations of two line segments out of eight, whereas the second embodiment needs only 12 searches in total: the combinations of two out of the four vertical lines plus the combinations of two out of the four horizontal lines. Thus, in addition to the effects of the first embodiment, the processing cost can be reduced further.
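The search counts quoted above follow directly from counting unordered pairs. The short sketch below reproduces the arithmetic; the function name `pair_search_count` is an illustrative assumption.

```python
from math import comb


def pair_search_count(group_sizes):
    """Number of unordered segment pairs examined when pairs are formed only within each group."""
    return sum(comb(n, 2) for n in group_sizes)


# Embodiment 1: all eight segments form a single group.
embodiment1 = pair_search_count([8])
# Embodiment 2: four vertical and four horizontal segments are searched separately.
embodiment2 = pair_search_count([4, 4])
```

Because the count grows quadratically with the size of each group, splitting the segments into two classes becomes more beneficial as the number of detected lines increases.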

<Other embodiments>
The present invention can also be realized by executing the following processing: software (a program) that realizes the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of the system or apparatus reads and executes the program.

Claims

An image processing apparatus for extracting, from image data obtained by photographing a subject having a rectangular area, a frame composed of the four sides of the rectangular area, the apparatus comprising:
detecting means for detecting a plurality of linear components from the input image data;
calculating means for calculating, for each linear component detected by the detecting means, a gradient direction according to the level of pixel information in a direction orthogonal to the linear component;
extraction means for selecting four sides from the plurality of linear components detected by the detecting means and extracting one or more frame candidates each composed of the selected four sides; and
narrowing-down means for excluding, from the frame candidates extracted by the extraction means, any frame candidate whose four sides do not all have gradient directions oriented the same way with respect to either the inside or the outside of the frame.

The image processing apparatus according to claim 1, wherein the extraction means further includes:
classifying means for classifying the linear components, from their inclination relative to the input image data, into horizontal lines that serve as the upper and lower sides of a frame and vertical lines that serve as the left and right sides of a frame; and
selection means for selecting, as left and right opposite sides, a vertical line and another vertical line having the opposite gradient direction, and selecting, as upper and lower opposite sides, a horizontal line and another horizontal line having the opposite gradient direction,
and wherein the frame candidates are extracted from four sides obtained by combining the upper and lower opposite sides with the left and right opposite sides.

The image processing apparatus according to claim 1, wherein the calculation unit calculates a gradient direction using a luminance value as the pixel information.

The image processing apparatus according to claim 1, wherein the frame is a frame of a whiteboard, a poster, or a paper document.

A control method of an image processing apparatus for extracting, from image data obtained by photographing a subject having a rectangular area, a frame composed of the four sides of the rectangular area, the method comprising:
a detecting step in which detecting means of the image processing apparatus detects a plurality of linear components from the input image data;
a calculation step in which calculating means of the image processing apparatus calculates, for each linear component detected in the detecting step, a gradient direction according to the level of pixel information in a direction orthogonal to the linear component;
an extraction step in which extraction means of the image processing apparatus selects four sides from the plurality of linear components detected in the detecting step and extracts one or more frame candidates each composed of the selected four sides; and
a narrowing-down step in which narrowing-down means of the image processing apparatus excludes, from the frame candidates extracted in the extraction step, any frame candidate whose four sides do not all have gradient directions oriented the same way with respect to either the inside or the outside of the frame.

A program for causing a computer to function as:
detecting means for detecting a plurality of linear components from input image data;
calculating means for calculating, for each linear component detected by the detecting means, a gradient direction according to the level of pixel information in a direction orthogonal to the linear component;
extraction means for selecting four sides from the plurality of linear components detected by the detecting means and extracting one or more frame candidates each composed of the selected four sides; and
narrowing-down means for excluding, from the frame candidates extracted by the extraction means, any frame candidate whose four sides do not all have gradient directions oriented the same way with respect to either the inside or the outside of the frame.