JP2013041315A

JP2013041315A - Image recognition device and image recognition method

Info

Publication number: JP2013041315A
Application number: JP2011175879A
Authority: JP
Inventors: Hiroaki Takebe; 浩明武部; Yoshinobu Hotta; 悦伸堀田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2011-08-11
Filing date: 2011-08-11
Publication date: 2013-02-28
Anticipated expiration: 2031-08-11
Also published as: JP5712859B2

Abstract

PROBLEM TO BE SOLVED: To provide a device and a method for accurately extracting, from an image, a region corresponding to an object that forms a specific geometric figure. SOLUTION: An image recognition device comprises: an edge extraction unit that extracts edge segments from an image; an acquisition unit that acquires combinations of candidates for a predetermined geometric figure formed from the extracted edge segments; a calculation unit that calculates, for each acquired combination, a recall indicating the degree to which the extracted edge segments cover the outer periphery of the figure candidates and a precision indicating the degree to which the extracted edge segments are used in the figure candidates; and an image extraction unit that extracts the regions corresponding to the figure candidates included in the combination with the highest evaluation value determined from the recall and the precision.

Description

The present invention relates to an image recognition apparatus and an image recognition method for recognizing, in an image, a region corresponding to a predetermined geometric figure.

There is a need to recognize and extract, from an image, a region corresponding to an object that forms a specific geometric figure. One example is photographing a whiteboard carrying rectangular note stickers with a digital camera and extracting the regions corresponding to the note stickers from the resulting image. In this case, performing character recognition on the extracted image, for example, allows the characters written on a note sticker to be saved as electronic data. For such uses, a known method extracts edges from the image and recognizes a region enclosed by the edges, thereby extracting the region corresponding to the target object.

As a related technique, the following figure cutout method has been proposed. The method is used in an image recognition apparatus that reads an image containing a rectangular two-dimensional code figure (or a similar figure) in which data is arranged in a matrix and at least two sides of the periphery are straight lines, and that cuts out and recognizes the figure from the read image. The method includes a step of detecting, by the Hough transform and least-squares approximation, the positions of two mutually intersecting straight lines on the periphery of the figure; a step of detecting the lengths of the two detected lines; and a step of detecting, from the positions and lengths of those two lines, the positions of the remaining two mutually intersecting peripheral lines. The two-dimensional code figure or similar figure is cut out through these steps (see, for example, Patent Document 1).

As another related technique, the following image matching method has been proposed. In this method, an image is processed to obtain building candidate regions, represented as a binary image in which pixels belonging to a building region have the value 1 and all other pixels have the value 0. The vertical and horizontal dimensions of each region are tested against a set of prototype building dimensions, and a region that is too large or too small is judged not to be a building. An X-Y pixel list of the contour of each building candidate is obtained from the binary image, the principal axis of each region's contour is aligned with the pixel grid of the binary image, and a direction histogram of the horizontal and vertical edge portions of the contour list is calculated. If the peak concentration of the histogram is smaller than the current threshold, the region is judged not to be a building. Peaks in the edge histogram are assumed to be corner candidates in the coordinate list of the region pixels, and the combination that validates the most corner candidates is selected as the full perimeter of the building (see, for example, Patent Document 2).

Patent Document 1: Japanese Laid-Open Patent Publication No. 7-220081
Patent Document 2: Japanese Laid-Open Patent Publication No. 5-101183

In the prior art, when regions corresponding to objects forming a specific geometric figure are extracted from an image and several of the objects overlap one another, it is difficult to determine which object each extracted edge belongs to. In this case, the accuracy of extracting the region corresponding to each object decreases. In addition, when the color of an object is similar to the background color, a single edge may be extracted as several separate pieces. In this case too, it is difficult to extract the region corresponding to each object correctly.

An object of the present invention is to provide an apparatus and a method for accurately extracting, from an image, a region corresponding to an object that forms a specific geometric figure.

An image recognition apparatus according to one aspect of the present invention includes: an edge extraction unit that extracts edge segments from an image; an acquisition unit that acquires combinations of candidates for a predetermined geometric figure formed from the edge segments extracted by the edge extraction unit; a calculation unit that calculates, for each combination acquired by the acquisition unit, a recall representing the degree to which the outer periphery of the figure candidates is covered by the extracted edge segments and a precision representing the degree to which the extracted edge segments are used in the figure candidates; and an image extraction unit that extracts the regions corresponding to the figure candidates included in the combination for which an evaluation value determined from the recall and the precision is largest.

According to the above aspect, a region corresponding to an object that forms a specific geometric figure can be accurately extracted from an image.

A block diagram showing the functions of the image recognition apparatus of the embodiment.
A flowchart showing the image recognition method of the embodiment.
A diagram showing the Sobel filter.
A diagram showing an example of a binarized edge image generated from an input image.
A diagram explaining the direction decomposition of an image.
A diagram explaining the direction decomposition process.
A diagram showing examples of binarized edge images generated by direction decomposition.
A diagram explaining labeling and circumscribed rectangles.
A diagram explaining the overlap integration of black pixel connected components.
A diagram showing an example of edge segments extracted by the edge extraction unit.
A flowchart showing the process of extracting rectangular region candidates.
A diagram (part 1) explaining the conditions under which edge segments form a rectangular region.
A diagram (part 2) explaining the conditions under which edge segments form a rectangular region.
A diagram (part 3) explaining the conditions under which edge segments form a rectangular region.
A diagram (part 4) explaining the conditions under which edge segments form a rectangular region.
A diagram explaining the graph and cliques used to acquire rectangular region candidates.
A flowchart showing the process of acquiring combinations of rectangular region candidates.
A diagram explaining the graph and cliques used to acquire combinations of rectangular region candidates.
A diagram showing extracted edge segments.
(a) is a graph of the edge segments, and (b) is a diagram showing the extracted cliques.
A diagram showing rectangular region candidates.
(a) is a graph of the rectangular region candidates, and (b) is a diagram showing the extracted cliques.
A diagram showing combinations of rectangular region candidates.
A diagram explaining the direction decomposition used to extract equilateral triangle regions.
A diagram (part 1) explaining the conditions under which edge segments form an equilateral triangle region.
A diagram (part 2) explaining the conditions under which edge segments form an equilateral triangle region.
A diagram (part 3) explaining the conditions under which edge segments form an equilateral triangle region.
A diagram showing the hardware configuration of a computer system for implementing the image recognition apparatus.

FIG. 1 is a block diagram showing the functions of the image recognition apparatus of the embodiment. The image recognition apparatus 1 of the embodiment includes an image data storage unit 2, a processing unit 3, an extraction result storage unit 8, and an output unit 9.

The image data storage unit 2 stores image data obtained by a digital camera, a scanner, or the like. The image recognition apparatus 1 may include an interface for receiving image data from a digital camera, a scanner, or the like. Alternatively, the image recognition apparatus 1 may be built into a digital camera or the like. In this embodiment, the image data is color image data. In the following description, image data may be referred to simply as an "image".

The processing unit 3 extracts a predetermined geometric figure (a rectangle in this embodiment) from the image stored in the image data storage unit 2. To do so, the processing unit 3 includes an edge extraction unit 4, an acquisition unit 5, a calculation unit 6, and an image extraction unit 7. The operations of these units are described later.

The extraction result storage unit 8 stores the image data of the regions corresponding to the figures extracted by the processing unit 3. The output unit 9 outputs the image data of the extracted regions stored in the extraction result storage unit 8. The output unit 9 outputs the image data to a display device, for example. Alternatively, the output unit 9 may output the image data to an external storage device.

FIG. 2 is a flowchart showing the image recognition method of the embodiment. The processing of this flowchart is executed by the processing unit 3 when, for example, an extraction instruction is given to the image recognition apparatus 1. In this embodiment, the extraction instruction is an instruction to extract rectangular regions from an image, and it is input to the image recognition apparatus 1 by a user, for example. When the extraction instruction is given, the processing unit 3 acquires a color image from the image data storage unit 2.

In step S1, the edge extraction unit 4 converts the color image acquired from the image data storage unit 2 to grayscale. In step S2, the edge extraction unit 4 applies a Sobel filter operation to the gray image, which yields an image with enhanced edges (hereinafter, an edge image). In step S3, the edge extraction unit 4 binarizes the edge image to generate a binarized edge image. In step S4, the edge extraction unit 4 uses the result of the Sobel filter operation to decompose the binarized edge image into a plurality of predetermined directions. In step S5, the edge extraction unit 4 extracts edge segments from each of the binarized edge images obtained for the respective directions.

In step S6, the acquisition unit 5 lists candidates for the geometric figure formed from the edge segments extracted in steps S1 to S5. That is, the acquisition unit 5 obtains one or more rectangular region candidates by picking out, from the extracted edge segments, those that may form the outer periphery (that is, the sides) of a rectangular region. In step S7, the acquisition unit 5 acquires combinations of rectangular region candidates. Here, the acquisition unit 5 selects only those combinations in which the relationships among the rectangular regions are not contradictory and the relationships between the rectangular region candidates and the edge segments are not contradictory.

In step S8, the calculation unit 6 calculates a recall and a precision for each combination obtained in step S7. In this embodiment, the recall represents the degree or proportion to which the sides of the rectangular region candidates are covered by edge segments. The precision represents the degree or proportion of all edge segments extracted by the edge extraction unit 4 that are used as sides of rectangular region candidates.

In step S9, the image extraction unit 7 identifies the combination for which an evaluation value determined from the recall and the precision is largest. The evaluation value is, for example, the F-measure. The image extraction unit 7 then extracts the rectangular region candidates included in the identified combination as the rectangular regions to be acquired. When a monochrome image is input to the processing unit 3, the graying process of step S1 is omitted.
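The selection in step S9 can be sketched as follows. This is a minimal illustration that assumes the evaluation value is the balanced F-measure (the harmonic mean of recall and precision); the text only names the F-measure as one example. The function names and the sample scores are hypothetical.

```python
def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (0.0 when both are zero)."""
    if recall + precision == 0.0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


def best_combination(scored):
    """Return the key of the combination with the largest F-measure.

    `scored` maps a (hypothetical) combination name to (recall, precision).
    """
    return max(scored, key=lambda name: f_measure(*scored[name]))


# Made-up scores for three candidate combinations.
scores = {"A": (0.9, 0.5), "B": (0.8, 0.8), "C": (0.4, 1.0)}
winner = best_combination(scores)
```

A combination with high recall but low precision (many unused edge segments) or the reverse scores lower than one that balances both.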

Next, the processing of each step of the flowchart shown in FIG. 2 is described in detail with reference to the drawings. In the following description, the image recognition apparatus 1 is assumed to extract rectangular regions from an input image.

<Step S1: Graying>
Graying a color image corresponds to projecting the pixel value of each pixel onto an arbitrary straight line passing through the origin of the RGB space. Various kinds of graying are therefore possible, depending on how the direction vector in the RGB space is chosen. For example, graying that represents each pixel by its lightness is widely used in image processing and is calculated by the following equation, where the pixel value of each pixel is given by its coordinates (R, G, B) in the RGB space.
$$\text{Lightness} = 0.299R + 0.587G + 0.114B$$
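The lightness formula can be applied per pixel as in the following sketch. The representation of an image as a list of rows of (R, G, B) tuples is an assumption made for illustration.

```python
def lightness(r, g, b):
    """Weighted sum of the RGB components, as in the lightness equation."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_gray(image):
    """Convert an image given as rows of (R, G, B) tuples to rows of gray values."""
    return [[lightness(*pixel) for pixel in row] for row in image]


# A 1x2 image: one pure red pixel and one white pixel.
gray = to_gray([[(255, 0, 0), (255, 255, 255)]])
```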

The edge extraction unit 4 may convert the color image to grayscale by another method. For example, the edge extraction unit 4 can gray the color image using color differences.

<Steps S2 to S3: Sobel filter and binarization>
The edge extraction unit 4 performs a Sobel filter operation on the gray image obtained in step S1. The Sobel filter is an edge operator that enhances the edges of an image; it applies an X-direction filter operation and a Y-direction filter operation to each pixel of the gray image. The X-direction filter and the Y-direction filter are as shown in FIG. 3. That is, the result $S_x(x,y)$ of the X-direction filter operation for pixel $(x,y)$ is given by
$$S_x(x,y) = g(x+1,y+1) + 2g(x,y+1) + g(x-1,y+1) - g(x+1,y-1) - 2g(x,y-1) - g(x-1,y-1)$$
and the result $S_y(x,y)$ of the Y-direction filter operation for pixel $(x,y)$ is given by
$$S_y(x,y) = g(x+1,y+1) + 2g(x+1,y) + g(x+1,y-1) - g(x-1,y+1) - 2g(x-1,y) - g(x-1,y-1)$$
Here, $g(i,j)$ denotes the density value of pixel $(i,j)$ calculated by the graying process.
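The two filter operations can be written out directly from the formulas above. In this sketch the names `sx` and `sy` follow the labels Sx and Sy used in the text, and the row-major indexing `gray[y][x]` is an assumption; border pixels are not handled.

```python
def sobel(gray, x, y):
    """Return (Sx, Sy) at interior pixel (x, y), term by term as in the text."""
    def g(i, j):
        # Density value of pixel (i, j); rows are indexed by j (assumed layout).
        return gray[j][i]

    sx = (g(x + 1, y + 1) + 2 * g(x, y + 1) + g(x - 1, y + 1)
          - g(x + 1, y - 1) - 2 * g(x, y - 1) - g(x - 1, y - 1))
    sy = (g(x + 1, y + 1) + 2 * g(x + 1, y) + g(x + 1, y - 1)
          - g(x - 1, y + 1) - 2 * g(x - 1, y) - g(x - 1, y - 1))
    return sx, sy


# Density changes only between rows: only Sx responds.
rows_step = [[0, 0, 0], [0, 0, 0], [1, 1, 1]]
# Density changes only between columns: only Sy responds.
cols_step = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
```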

Subsequently, the edge extraction unit 4 uses the result of the Sobel filter operation to calculate an intensity and a direction for each pixel. The intensity and direction of pixel $(x,y)$ are calculated by the following equations.
$$\text{Intensity} = \sqrt{S_x(x,y)^2 + S_y(x,y)^2}$$
$$\text{Direction} = \arctan\left(\frac{S_y(x,y)}{S_x(x,y)}\right)$$
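A minimal sketch of the intensity and direction calculation follows. It uses the two-argument arctangent `math.atan2` rather than a plain arctangent of the ratio; this is an assumption made so that the direction covers the full circle and remains defined when Sx is zero.

```python
import math


def intensity_and_direction(sx, sy):
    """Return (intensity, direction in radians) for one pixel's Sobel results."""
    intensity = math.hypot(sx, sy)   # sqrt(sx**2 + sy**2)
    direction = math.atan2(sy, sx)   # full-circle angle; assumed variant of arctan(sy/sx)
    return intensity, direction


strength, theta = intensity_and_direction(3, 4)
```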

If the intensity value obtained for each pixel is regarded as a density value, the output of the Sobel filter can be treated as a gray image. The edge extraction unit 4 generates a binarized edge image by binarizing this gray image. Otsu's binarization method can be used for this purpose, for example.
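Otsu's method chooses the threshold that maximizes the between-class variance of the two resulting pixel groups. The following sketch assumes that the intensities have already been quantized to integers in the range 0 to `levels - 1`; the function names are hypothetical.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class and pixels with value > t the other.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of the values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2   # proportional to between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(values, t):
    """Map values above the threshold to 1 and all others to 0."""
    return [1 if v > t else 0 for v in values]


samples = [0, 1, 2, 10, 11, 12]
t = otsu_threshold(samples, levels=16)
edges = binarize(samples, t)
```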

FIG. 4 shows an example of a binarized edge image generated from an input image. In this example, the input image was obtained by photographing a whiteboard with a digital camera. Four note stickers are affixed to the photographed whiteboard.

As shown in FIG. 4(a), the input image contains a region corresponding to the whiteboard 11 and regions corresponding to the note stickers 12a to 12d. The note stickers 12a to 12d have a color different from that of the whiteboard 11 and are drawn as hatched regions in FIG. 4(a). In this example, the note stickers 12a and 12b partially overlap each other, and the note stickers 12b and 12c also partially overlap each other. Characters and the like are written on the note stickers 12a to 12d, but they are omitted here to keep the drawing easy to read.

FIG. 4(b) shows the binarized edge image generated from the input image shown in FIG. 4(a). In this binarized edge image, the density value (or pixel value) is "1" for pixels in the regions corresponding to the edges of the whiteboard 11, the edges of the note stickers 12a to 12d, and the characters and the like written on the note stickers 12a to 12d, and "0" for pixels in all other regions. That is, edges exist in the regions corresponding to the edges of the whiteboard 11, the edges of the note stickers 12a to 12d, and the characters and the like written on the note stickers 12a to 12d.

<Step S4: Direction decomposition>
As described above, the edge extraction unit 4 uses the result of the Sobel filter operation to calculate an intensity and a direction for each pixel. The intensity has been binarized by the binarization process described above; that is, a binarized edge image such as the one shown in FIG. 4(b) has been generated. The edge extraction unit 4 then decomposes the binarized edge image into a plurality of predetermined directions.

In this embodiment, the binarized edge image is decomposed into the eight directions dir0 to dir7 shown in FIG. 5(a). The following angle ranges are set for the decomposition directions dir0 to dir7, respectively.
Dir0: -π/8 < θ ≤ π/8
Dir1: π/8 < θ ≤ 3π/8
Dir2: 3π/8 < θ ≤ 5π/8
Dir3: 5π/8 < θ ≤ 7π/8
Dir4: 7π/8 < θ ≤ 9π/8 (9π/8 is equivalent to -7π/8)
Dir5: -7π/8 < θ ≤ -5π/8
Dir6: -5π/8 < θ ≤ -3π/8
Dir7: -3π/8 < θ ≤ -π/8

FIG. 5(b) shows the relationship between the orientation of image regions and the decomposition directions. In the example shown in FIG. 5(b), two rectangular regions 12e and 12f are formed in the image. The lower side of the rectangular region 12e belongs to the angle range of the decomposition direction dir0, and its right, upper, and left sides belong to the angle ranges of the decomposition directions dir2, dir4, and dir6, respectively. Similarly, the sides of the rectangular region 12f belong to the angle ranges of the decomposition directions dir1, dir3, dir5, and dir7.

FIG. 6 is a diagram explaining the direction decomposition process. In FIG. 6, each cell corresponds to one pixel. In the binarized edge image shown in FIG. 6(a), the value in the upper part of each pixel is the intensity obtained from the result of the Sobel filter operation; this intensity has been binarized. The value in the lower part of each pixel is the direction obtained from the result of the Sobel filter operation. The direction value is omitted for pixels whose intensity is zero.

For each of the decomposition directions dir0 to dir7, the edge extraction unit 4 extracts the pixels whose intensity is 1 and whose direction belongs to the angle range of that decomposition direction. For the decomposition direction dir4, for example, it extracts the pixels whose intensity is 1 and whose direction lies between 7π/8 and 9π/8 (that is, 157.5 to 202.5 degrees). As a result, five pixels are extracted from the binarized edge image shown in FIG. 6(a), and the image shown in FIG. 6(b) is obtained as the binarized edge image for direction dir4. Binarized edge images are generated in the same way for the other decomposition directions.
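The decomposition can be sketched as follows, assuming the eight angle ranges given for dir0 to dir7 (each range is open at its lower bound and closed at its upper bound, with dir4 wrapping around ±π). Images are represented as lists of rows, and the function names are hypothetical.

```python
import math


def direction_index(theta):
    """Map an angle in radians to the index k of the decomposition direction dir k."""
    # dir k covers (k*pi/4 - pi/8, k*pi/4 + pi/8]; the modulo handles the wrap-around.
    return (math.ceil((theta + math.pi / 8) / (math.pi / 4)) - 1) % 8


def decompose(binary, direction):
    """Split a binarized edge image into eight per-direction binary images."""
    height, width = len(binary), len(binary[0])
    planes = [[[0] * width for _ in range(height)] for _ in range(8)]
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 1:
                planes[direction_index(direction[y][x])][y][x] = 1
    return planes


# One row with two edge pixels pointing in opposite directions.
planes = decompose([[1, 1, 0]], [[math.pi, 0.0, 0.0]])
```

Each edge pixel appears in exactly one of the eight output images, so the union of the images reproduces the original binarized edge image.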

FIG. 7 shows examples of binarized edge images generated by direction decomposition. FIG. 7(a) shows the binarized edge image for the decomposition direction dir2 obtained from the binarized edge image shown in FIG. 4(b); it contains the edges corresponding to the right ends of the note stickers 12a to 12d. FIG. 7(b) shows the binarized edge image for the decomposition direction dir6 obtained from the binarized edge image shown in FIG. 4(b); it contains the edges corresponding to the left ends of the note stickers 12a to 12d. Similarly, FIG. 7(c) shows the binarized edge image for the decomposition direction dir4, which contains the edges corresponding to the upper ends of the note stickers 12a to 12d, and FIG. 7(d) shows the binarized edge image for the decomposition direction dir0, which contains the edges corresponding to the lower ends of the note stickers 12a to 12d.

<Step S5: Extraction of edge segments>
The edge extraction unit 4 extracts edge segments from the binarized edge image of each direction. An edge segment is an element that makes up an edge. In this example, an edge segment is a region enclosed by four points and is represented by the coordinates of those four points. The extraction of edge segments includes the labeling, overlap integration, noise removal, and integration processes described below.

(1) Labeling
The edge extraction unit 4 assigns a label to each black pixel connected component in the binarized edge image. A black pixel connected component is a region in which more than a predetermined number of black pixels are connected. A black pixel is a pixel whose binarized pixel value (or density value) is 1. A label is an identification number that identifies a black pixel connected component. In the example shown in FIG. 8(a), the labels L1 and L2 are assigned to the black pixel connected components.
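Labeling can be sketched with a breadth-first flood fill. The use of 8-connectivity and the parameter name `min_count` (the predetermined number of pixels a component must exceed) are assumptions; the text does not specify either.

```python
from collections import deque


def label_components(binary, min_count=0):
    """Return {label: [(x, y), ...]} for each black pixel connected component.

    Pixels with value 1 are black pixels. Components with no more than
    `min_count` pixels are discarded. Neighbors are 8-connected (assumed).
    """
    height, width = len(binary), len(binary[0])
    seen = set()
    components = {}
    for y0 in range(height):
        for x0 in range(width):
            if binary[y0][x0] != 1 or (x0, y0) in seen:
                continue
            seen.add((x0, y0))
            queue = deque([(x0, y0)])
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and binary[ny][nx] == 1
                                and (nx, ny) not in seen):
                            seen.add((nx, ny))
                            queue.append((nx, ny))
            if len(pixels) > min_count:
                components[len(components) + 1] = pixels
    return components


image = [[1, 1, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 0, 1]]
found = label_components(image)
```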

The edge extraction unit 4 projects each black pixel connected component in the coordinate system of the binarized edge image being processed. This coordinate system is an orthogonal coordinate system rotated, relative to the coordinate system of the input image, by the angle of the decomposition direction. For example, the coordinate system of the binarized edge image for the decomposition direction dir1 is rotated by π/4 relative to the coordinate system of the input image. As shown in FIG. 8(b), the edge extraction unit 4 projects the black pixel connected component onto each projection axis of this coordinate system and obtains the maximum and minimum of the projected values on each axis.

The edge extraction unit 4 then finds the intersections of the straight lines that pass through these maximum and minimum values and are orthogonal to the corresponding projection axes. In FIG. 8(b), let a and b be the maximum and minimum values on one projection axis and c and d be the maximum and minimum values on the other projection axis; the four intersection coordinates (a,c), (a,d), (b,c), and (b,d) are then obtained. These four intersections are the vertices of the smallest rectangle enclosing the black pixel connected component (that is, its circumscribed rectangle). As the result of the labeling process, the edge extraction unit 4 outputs, for each black pixel connected component, the label identifying the component and the four intersection coordinates representing its circumscribed rectangle.
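The projection step can be sketched as follows. The component is given as a list of (x, y) pixel coordinates in the input image, and `angle` is the rotation of the coordinate system for the decomposition direction being processed. Returning the corners in the rotated coordinate system is an assumption made for this illustration.

```python
import math


def circumscribed_rectangle(pixels, angle):
    """Return the corners (a,c), (a,d), (b,c), (b,d) in the rotated frame."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Project every pixel onto the two axes of the rotated coordinate system.
    u = [x * cos_a + y * sin_a for x, y in pixels]
    v = [-x * sin_a + y * cos_a for x, y in pixels]
    a, b = max(u), min(u)   # maximum and minimum on the first projection axis
    c, d = max(v), min(v)   # maximum and minimum on the second projection axis
    return [(a, c), (a, d), (b, c), (b, d)]


# With no rotation the result is the ordinary axis-aligned bounding box.
box = circumscribed_rectangle([(1, 2), (4, 3), (2, 5)], 0.0)
# A diagonal component in a frame rotated by pi/4 (the dir1 case in the text).
diag = circumscribed_rectangle([(0, 0), (1, 1)], math.pi / 4)
```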

(2) Overlap integration
For any two black-pixel connected components in the binarized edge image, the edge extraction unit 4 determines whether their circumscribed rectangles overlap. In the example shown in FIG. 9(a), the circumscribed rectangles of the black-pixel connected components L3 and L4 overlap. In this case, the edge extraction unit 4 integrates L3 and L4 into a single black-pixel connected component; that is, the same label is assigned to both. In FIG. 9(b), both components are given the label L3. The vertex coordinates of the smallest rectangle enclosing both components (the circumscribed rectangle of the merged L3 and L4) are also calculated. The edge extraction unit 4 repeats the overlap integration until no overlapping black-pixel connected components remain.
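As a rough sketch of the overlap integration, circumscribed rectangles can be merged repeatedly until none overlap. The tuple layout `(min_u, min_v, max_u, max_v)` in the rotated frame, the function names, and the choice to treat touching rectangles as overlapping are assumptions, not details given in the source.

```python
def rects_overlap(r, s):
    """True if two rectangles (min_u, min_v, max_u, max_v) intersect or touch."""
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]

def union_rect(r, s):
    """Smallest rectangle enclosing both r and s."""
    return (min(r[0], s[0]), min(r[1], s[1]), max(r[2], s[2]), max(r[3], s[3]))

def merge_overlapping(rects):
    """Merge rectangles pairwise until no two of them overlap."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if rects_overlap(rects[i], rects[j]):
                    # The merged component keeps the earlier position (label).
                    rects[i] = union_rect(rects[i], rects[j])
                    del rects[j]
                    merged = True
                    break
            if merged:
                break
    return rects
```

Restarting the scan after each merge is simple rather than fast; it mirrors the "repeat until nothing overlaps" wording of the text.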

(3) Noise removal
The edge extraction unit 4 performs noise removal on the set of black-pixel connected components obtained after the overlap integration. For example, a component whose size is smaller than a predetermined value is judged to be noise and is removed from the set. The size of a black-pixel connected component is defined, for example, by the length of the long side of its circumscribed rectangle.
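The noise-removal rule can be sketched as a simple filter on the long side of each circumscribed rectangle. The rectangle layout `(min_u, min_v, max_u, max_v)` and the parameter name `min_length` are hypothetical; the source only says that components smaller than a predetermined value are discarded.

```python
def long_side(rect):
    """Length of the longer side of a rectangle (min_u, min_v, max_u, max_v)."""
    min_u, min_v, max_u, max_v = rect
    return max(max_u - min_u, max_v - min_v)

def remove_noise(rects, min_length):
    """Keep only rectangles whose long side is at least `min_length`."""
    return [r for r in rects if long_side(r) >= min_length]
```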

(4) Integration
The edge extraction unit 4 integrates black-pixel connected components that are close to each other in the binarized edge image; that is, neighboring components are merged into a single black-pixel connected component. The distance between two components is expressed, for example, by the difference between their projection values when each component is projected onto the projection axes described above. In this case, if the difference between the projection values on at least one projection axis is smaller than a preset threshold, the components are judged to be candidates for integration. When two black-pixel connected components are integrated, the same label is assigned to both, as in the overlap integration described above, and the vertex coordinates of the smallest rectangle enclosing the two integrated components are calculated.

The edge extraction unit 4 repeats the integration until no neighboring black-pixel connected components remain. Each black-pixel connected component obtained by this integration (or its circumscribed rectangle) is extracted as an edge segment.

FIG. 10 shows examples of edge segments extracted by the edge extraction unit 4. FIG. 10(a) shows the edge segments extracted from the binarized edge image for decomposition direction dir2 shown in FIG. 7(a). In this example, edge segments E1 to E5 are extracted. Edge segments E1 to E4 correspond to the right edges (or parts of them) of the note stickers 12a to 12d shown in FIG. 4(a), respectively. Edge segment E5 corresponds to an edge of the whiteboard 11. FIG. 10(b) shows, superimposed, all edge segments extracted from the binarized edge images for decomposition directions dir0 to dir7. In this example, edge segments E1 to E18 are extracted.

In this way, the edge extraction unit 4 extracts edge segments from the binarized edge image of each decomposition direction dir0 to dir7. Each edge segment is identified by a label. The position and shape of each edge segment are represented by the coordinates of the four vertices of the circumscribed rectangle of the black-pixel connected component in that edge segment.

<Step S6: Acquisition of Rectangular Area Candidates>
The acquisition unit 5 lists all rectangular area candidates based on the edge segments extracted in steps S1 to S5. A rectangular area candidate is represented by a set of edge segments that may form a rectangular area.

FIG. 11 is a flowchart of the process for extracting rectangular area candidates. The acquisition unit 5 executes this flowchart after the edge extraction unit 4 has extracted the edge segments as described above. The acquisition unit 5 receives edge segment information from the edge extraction unit 4. The edge segment information includes the number of edge segments, the coordinates of the circumscribed rectangle of each edge segment, and the decomposition direction (dir0 to dir7) from which each edge segment was extracted.

In step S11, the acquisition unit 5 creates a graph from the input edge segment information. In step S12, the acquisition unit 5 extracts cliques from this graph to obtain, as rectangular area candidates, sets of edge segments that may form a rectangular area. In step S13, the acquisition unit 5 regards rectangular area candidates larger than a predetermined maximum size, and candidates smaller than a predetermined minimum size, as noise and removes them. The final rectangular area candidates are thereby obtained. The acquisition unit 5 then outputs the number of rectangular area candidates and the identification numbers (that is, the labels) of the edge segments constituting each candidate.

(1) Graph creation
In step S11, the acquisition unit 5 creates a graph from the input edge segment information. The graph consists of nodes and paths connecting the nodes. In this example, each node corresponds to one edge segment, and a path connecting two nodes indicates that the two corresponding edge segments may form a rectangular area.

The graph is created by determining, for each edge segment, whether that edge segment and each of the other edge segments satisfy the condition for forming a rectangular area. In the example shown in FIG. 10(b), edge segment E1 is paired with each of the edge segments E2 to E18, and each pair is tested against the condition. For example, in the determination for edge segments E1 and E2, edge segment E1 is assumed to correspond to one side of a rectangular area, and it is checked whether edge segment E2 can correspond to any side of the same rectangular area. The acquisition unit 5 creates the graph by making this determination for all combinations of edge segments.

An example of the conditions under which two edge segments can form a rectangular area is given below. An edge segment of decomposition direction dir2 shown in FIG. 5(a) is used as the example. An edge segment of decomposition direction dir2 corresponds to the right side of a rectangular area.

In the following description, the centroid coordinates of an edge segment L are written (L.ave_x, L.ave_y). The centroid coordinates of an edge segment are calculated from the four vertex coordinates of the circumscribed rectangle that specifies the shape of the edge segment. This circumscribed rectangle is the one described in connection with the overlap integration and the integration performed when extracting edge segments. For the four vertices of this circumscribed rectangle, the maximum x coordinate is written L.max_x, the maximum y coordinate L.max_y, the minimum x coordinate L.min_x, and the minimum y coordinate L.min_y.

The acquisition unit 5 sets a virtual rectangular area and assumes that one of the edge segments of decomposition direction dir2 corresponds to the right side (or a part of the right side) of that virtual rectangular area. In the examples shown in FIGS. 12 to 15, a virtual rectangular area 21 is set, and for edge segment L1 of decomposition direction dir2, the other edge segments that can constitute a rectangular area candidate (hereinafter, search target edge segments) are searched for.

When the search target edge segment (L2) was extracted from decomposition direction dir0, the acquisition unit 5 determines that edge segments L1 and L2 may form a rectangular area if the following condition is satisfied. In this case, as shown in FIG. 12, edge segments L1 and L2 correspond to the right side and the lower side of the rectangular area 21, respectively.
L1.ave_x >= L2.max_x and L1.max_y <= L2.ave_y
In the second inequality, L2.min_y may be used instead of L2.ave_y.

When the search target edge segment (L3) was extracted from decomposition direction dir2, the acquisition unit 5 determines that edge segments L1 and L3 may form a rectangular area if the following condition is satisfied. In this case, as shown in FIG. 13, edge segments L1 and L3 both correspond to the right side of the rectangular area 21. TH1 is a predetermined threshold.
|L1.ave_x - L3.ave_x| < TH1

When the search target edge segment (L4) was extracted from decomposition direction dir4, the acquisition unit 5 determines that edge segments L1 and L4 may form a rectangular area if the following condition is satisfied. In this case, as shown in FIG. 14, edge segments L1 and L4 correspond to the right side and the upper side of the rectangular area 21, respectively.
L1.ave_x >= L4.max_x and L1.min_y >= L4.ave_y
In the second inequality, L4.max_y may be used instead of L4.ave_y.

When the search target edge segment (L5) was extracted from decomposition direction dir6, the acquisition unit 5 determines that edge segments L1 and L5 may form a rectangular area if the following condition is satisfied. In this case, as shown in FIG. 15, edge segments L1 and L5 correspond to the right side and the left side of the rectangular area 21, respectively.
L1.ave_x >= L5.ave_x

When the search target edge segment was extracted from a decomposition direction other than dir0, dir2, dir4, or dir6, the acquisition unit 5 determines that edge segment L1 and that search target edge segment cannot form a rectangular area. The determination conditions above, described with reference to FIGS. 12 to 15, cover the case where one edge segment is the right side of the rectangular area; the conditions for the cases where one edge segment is the left side, the upper side, or the lower side can be derived in the same way.
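The four conditions for a right-side (dir2) edge segment can be collected into one predicate, as in the following minimal sketch. The `Segment` class, the integer encoding of directions 0 to 7, and the computation of the centroid as the center of an axis-aligned circumscribed rectangle are assumptions made for illustration. Image coordinates with y increasing downward are assumed, which matches the worked example later in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    direction: int  # decomposition direction index, 0..7 (dir0..dir7)
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    @property
    def ave_x(self):
        return (self.min_x + self.max_x) / 2

    @property
    def ave_y(self):
        return (self.min_y + self.max_y) / 2

def may_form_rectangle_with_right_side(l1, other, th1):
    """l1 is assumed to be a dir2 segment, i.e. the right side of the rectangle."""
    if other.direction == 0:  # other is the lower side
        return l1.ave_x >= other.max_x and l1.max_y <= other.ave_y
    if other.direction == 2:  # other is also on the right side
        return abs(l1.ave_x - other.ave_x) < th1
    if other.direction == 4:  # other is the upper side
        return l1.ave_x >= other.max_x and l1.min_y >= other.ave_y
    if other.direction == 6:  # other is the left side
        return l1.ave_x >= other.ave_x
    return False  # diagonal directions never pair with a right side
```

For example, a right side spanning y = 10 to 90 at x ≈ 101 pairs with an upper side whose centroid lies at y = 5 and whose right end is at x = 95.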

In this way, the acquisition unit 5 determines, for each edge segment, whether it may form a rectangular area together with each other edge segment. When the total number of extracted edge segments is n, the graph created by this determination is therefore represented by an n × n matrix. When the combination of the i-th and j-th edge segments satisfies the condition for forming a rectangular area, the acquisition unit 5 sets the (i,j) and (j,i) elements of this matrix to 1; when the combination does not satisfy the condition, it sets the (i,j) and (j,i) elements to 0. FIG. 16(a) shows an example of the created graph.
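A minimal sketch of building such a symmetric 0/1 matrix follows. The function name `build_graph` and the caller-supplied predicate `compatible` are hypothetical; the predicate stands in for the pairwise rectangle-forming test, and the diagonal is left at 0 as an assumption, since the source does not say how (i,i) is set.

```python
def build_graph(items, compatible):
    """Return an n x n list-of-lists adjacency matrix of 0/1 values."""
    n = len(items)
    graph = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if compatible(items[i], items[j]):
                # Set both (i, j) and (j, i), as described in the text.
                graph[i][j] = graph[j][i] = 1
    return graph
```

Because each unordered pair is tested once, the predicate is assumed to give the same answer regardless of argument order.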

(2) Clique extraction
The acquisition unit 5 extracts cliques from the graph created as described above. A clique corresponds to a maximal complete subgraph of the graph. A graph is complete when every node in it is connected by a path to every other node. A maximal complete subgraph is a complete subgraph that is not properly contained in any other complete subgraph. Therefore, each edge segment in the set constituting a clique may form a rectangular area with every other edge segment in that set. FIG. 16(b) shows an example of the cliques extracted from the graph shown in FIG. 16(a). In FIG. 16(b), "-1" marks the end of the elements of a clique.

In the example shown in FIG. 16(b), clique 1, for instance, has L25, L24, L23, L18, and L1 as its elements, forming a set of edge segments that may constitute a rectangular area. In this case, any two edge segments chosen from L25, L24, L23, L18, and L1 always satisfy the condition for forming a rectangular area described above.
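The source does not name the algorithm used to enumerate cliques. As one possible illustration, the basic Bron-Kerbosch procedure (without pivoting) lists all maximal complete subgraphs of a 0/1 adjacency matrix; the function name `maximal_cliques` and the list-of-lists input are assumptions.

```python
def maximal_cliques(graph):
    """Enumerate maximal cliques of a symmetric 0/1 adjacency matrix."""
    n = len(graph)
    neighbors = [{j for j in range(n) if j != i and graph[i][j]} for i in range(n)]
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already-explored nodes.
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & neighbors[v], x & neighbors[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(range(n)), set())
    return sorted(cliques)
```

An isolated node is reported as a clique of size one. For larger graphs a pivoting variant would usually be preferred, but this version keeps the idea visible.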

In this way, the acquisition unit 5 creates a graph from the edge segment information and then extracts cliques from that graph. Each clique is a set of edge segments that may form a rectangular area. That is, the acquisition unit 5 acquires one or more rectangular area candidates, each expressed as a set of edge segments.

<Step S7: Acquisition of Combinations of Rectangular Area Candidates>
The acquisition unit 5 lists combinations of rectangular area candidates based on the rectangular area candidates extracted in step S6. A combination of rectangular area candidates is represented by a set of mutually compatible rectangular area candidates.

FIG. 17 is a flowchart of the process for acquiring combinations of rectangular area candidates. The acquisition unit 5 executes this flowchart after the rectangular area candidates have been extracted as described above. In doing so, the acquisition unit 5 uses rectangular area candidate information, which includes the number of rectangular area candidates, a number identifying each candidate, and the numbers of the edge segments constituting each candidate.

In step S21, the acquisition unit 5 creates a graph from the rectangular area candidate information. In step S22, the acquisition unit 5 extracts cliques from this graph to obtain combinations of rectangular area candidates. The acquisition unit 5 then outputs the number of combinations and the identification numbers of the rectangular area candidates constituting each combination.

(1) Graph creation
In step S21, the acquisition unit 5 creates a graph from the rectangular area candidate information. As described above, the graph consists of nodes and paths connecting the nodes. When obtaining combinations of rectangular area candidates, however, each node corresponds to one rectangular area candidate, and a path connecting two nodes indicates that the two corresponding candidates can be compatible with each other.

The graph is created by determining, for each rectangular area candidate, whether that candidate and each of the other candidates satisfy the conditions for being mutually compatible. The conditions under which two rectangular area candidates are compatible are, for example, the following two.
Condition 1: Neither rectangular area candidate is completely contained in the other
Condition 2: The two rectangular area candidates do not share an edge segment

For example, suppose that rectangular area candidate 1 is formed inside rectangular area candidate 2. In this case, candidate 1 is completely contained in candidate 2, so condition 1 is not satisfied. That is, rectangular area candidates 1 and 2 are determined to be incompatible.

Also, suppose that rectangular area candidate 1 consists of edge segments L1, L2, and L3, and rectangular area candidate 3 consists of edge segments L3, L5, and L6. In this case, candidates 1 and 3 share edge segment L3, so condition 2 is not satisfied. That is, rectangular area candidates 1 and 3 are determined to be incompatible.
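Conditions 1 and 2 can be sketched as a single compatibility test. The `Candidate` class, its axis-aligned `rect` field `(min_x, min_y, max_x, max_y)`, and the label strings are hypothetical; the source does not specify how the outline of a candidate is derived from its edge segments, so the rectangle is simply taken as given here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    rect: tuple          # (min_x, min_y, max_x, max_y), assumed axis-aligned
    segments: frozenset  # labels of the edge segments forming the candidate

def contains(outer, inner):
    """True if rectangle `inner` lies completely within rectangle `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def compatible(a, b):
    # Condition 1: neither candidate is completely contained in the other.
    if contains(a.rect, b.rect) or contains(b.rect, a.rect):
        return False
    # Condition 2: the candidates do not share any edge segment.
    if a.segments & b.segments:
        return False
    return True
```

The two worked cases in the text (candidate 1 inside candidate 2, and candidates 1 and 3 sharing L3) both evaluate to incompatible under this sketch.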

In this way, the acquisition unit 5 determines, for each rectangular area candidate, whether it can be compatible with each other candidate. When the total number of extracted rectangular area candidates is m, the graph created by this determination is therefore represented by an m × m matrix. When the i-th and j-th rectangular area candidates can be compatible, the acquisition unit 5 sets the (i,j) and (j,i) elements of this matrix to 1; when they cannot, it sets the (i,j) and (j,i) elements to 0. FIG. 18(a) shows an example of the created graph.

(2) Clique extraction
The acquisition unit 5 extracts cliques from the graph created as described above. As described above, a clique corresponds to a maximal complete subgraph of the graph. Each clique is therefore a set of rectangular area candidates that can be compatible with one another. FIG. 18(b) shows an example of the cliques extracted from the graph shown in FIG. 18(a).

In this way, the acquisition unit 5 creates a graph from the rectangular area candidate information and then extracts cliques from that graph. Each clique is a set of mutually compatible rectangular area candidates. That is, the acquisition unit 5 acquires one or more combinations of rectangular area candidates, each expressed as a set of one or more rectangular area candidates.

<Steps S8 to S9: Evaluation and Extraction>
For each combination of mutually compatible rectangular area candidates, the calculation unit 6 calculates a recall and a precision, and then calculates an evaluation value determined from the recall and the precision. The evaluation value is the so-called F value. The image extraction unit 7 then identifies the combination of rectangular area candidates with the highest evaluation value and extracts the images of the rectangular areas included in that combination.

(1) Calculation of recall
The calculation unit 6 calculates the recall for each combination of rectangular area candidates. Recall expresses how well the combination of rectangular area candidates is explained by the extracted edge segments. In this embodiment, recall represents the degree or proportion to which the outer periphery of each rectangular area included in the combination is covered by the extracted edge segments.

(2) Calculation of precision
The calculation unit 6 calculates the precision for each combination of rectangular area candidates. Precision expresses how well the combination of rectangular area candidates explains the extracted edge segments. In this embodiment, precision represents the degree or proportion to which all the edge segments extracted by the edge extraction unit 4 are used as sides of the rectangular areas included in the combination.
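This passage defines recall and precision only qualitatively. One plausible reading, offered purely as an assumption, measures both in terms of edge segment lengths: recall as the used segment length divided by the total perimeter of the rectangles in the combination, and precision as the used segment length divided by the length of all extracted segments. The function name, the dictionary of lengths, and the cap at 1.0 are hypothetical.

```python
def recall_and_precision(segment_lengths, used_labels, total_perimeter):
    """Length-based sketch of recall and precision for one combination.

    segment_lengths: dict mapping every extracted segment label to its length
    used_labels:     labels of the segments used as sides in the combination
    total_perimeter: summed perimeter of the rectangles in the combination
    """
    used = sum(segment_lengths[label] for label in used_labels)
    total = sum(segment_lengths.values())
    # Recall: share of the rectangle outlines covered by edge segments.
    recall = min(1.0, used / total_perimeter) if total_perimeter else 0.0
    # Precision: share of all extracted edge length that the combination uses.
    precision = used / total if total else 0.0
    return recall, precision
```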

(3) F value
The calculation unit 6 calculates the F value for each combination of rectangular area candidates. The F value is an evaluation measure that takes both recall and precision into account, and is obtained as the harmonic mean of the recall and the precision (including a value obtained by multiplying the harmonic mean by a constant). That is, when the recall is denoted R and the precision is denoted P, the F value is calculated by the following formula.
F value = 2 × R × P / (R + P)
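The formula translates directly into a small function. The name `f_value` and the choice to return 0.0 when both inputs are zero (to avoid division by zero) are assumptions not stated in the source.

```python
def f_value(recall, precision):
    """Harmonic mean of recall R and precision P: 2 * R * P / (R + P)."""
    if recall + precision == 0:
        return 0.0  # assumed convention for the degenerate case
    return 2 * recall * precision / (recall + precision)
```

Because the harmonic mean is dominated by the smaller of the two values, a combination scores well only when both recall and precision are reasonably high.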

(4) Image extraction
The image extraction unit 7 identifies the combination of rectangular area candidates with the highest evaluation value and extracts the images of the one or more rectangular areas included in that combination. The extracted image data is stored in the extraction result storage unit 8. The extracted image data stored in the extraction result storage unit 8 is then output by the output unit 9, for example in response to an instruction from the user.
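Selecting the combination with the highest evaluation value can be sketched as follows. The function name `select_best` and the parallel-list inputs are hypothetical, and the tie-breaking rule (the first maximum wins) is an assumption, since the source does not address ties.

```python
def select_best(combinations, scores):
    """Return (combination, score) for the highest score; first one wins ties."""
    best = max(range(len(combinations)), key=lambda i: scores[i])
    return combinations[best], scores[best]
```

Here each combination would be a list of rectangular area candidate numbers and each score the F value computed for that combination.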

<Example>
In the following example, as shown in FIG. 19, nine edge segments L1 to L9 are assumed to have been extracted from the input image. The extraction of the edge segments is realized by steps S1 to S5 of the flowchart shown in FIG. 2.

In FIG. 19, a "direction" and a "length" are given for each of the edge segments L1 to L9. The "direction" corresponds to the angle calculated from the output of the Sobel filter and, in this example, is expressed as one of dir0 to dir7 shown in FIG. 5(a). The "length" is the length of the long side of the rectangle forming the edge segment and is expressed, for example, as a number of pixels.

The acquisition unit 5 first refers to the edge segments L1 to L9 and acquires rectangular area candidates. To do so, the acquisition unit 5 determines whether any two edge segments taken from L1 to L9 may form a rectangular area, and makes this determination for all pairs. As a result, the graph shown in FIG. 20(a) is created.

Edge segment L3 is taken as an example. That is, for each pair consisting of edge segment L3 and one of the other edge segments, it is determined whether the pair may form a rectangular area. The direction of edge segment L3 is dir2.

(1) Edge segment L1
The direction of edge segment L1 is dir4. Therefore, for edge segments L3 and L1 to form a rectangular area, the following condition must be satisfied.
L3.ave_x >= L1.max_x and L3.min_y >= L1.ave_y
Here, edge segment L3 lies to the right of edge segment L1, and the X coordinate of the centroid of L3 is larger than the maximum X coordinate of L1. Edge segment L3 also lies below edge segment L1, and the minimum Y coordinate of L3 is larger than the Y coordinate of the centroid of L1. Both inequalities are therefore satisfied, and edge segments L3 and L1 can form a rectangular area. Accordingly, "1" is set for the pair L3, L1 in the graph shown in FIG. 20(a).

(2) Edge segment L2
The direction of edge segment L2 is dir4, the same as that of edge segment L1. The positional relationship between edge segments L3 and L2 is also the same as that between edge segments L3 and L1. Therefore, edge segments L3 and L2 can form a rectangular area, and "1" is set for the pair L3, L2.

(3) Edge segment L4
The direction of edge segment L4 is also dir4, the same as that of edge segment L1. The condition for edge segments L3 and L4 to form a rectangular area is therefore analogous to the condition for edge segments L3 and L1 described above, and is as follows.
L3.ave_x >= L4.max_x and L3.min_y >= L4.ave_y
However, edge segment L3 lies to the left of edge segment L4, and the X coordinate of the centroid of L3 is smaller than the maximum X coordinate of L4. The condition is therefore not satisfied, and edge segments L3 and L4 cannot form a rectangular area. Accordingly, "0" is set for the pair L3, L4 in the graph shown in FIG. 20(a).

（４）エッジセグメントＬ５
エッジセグメントＬ５の方向は、エッジセグメントＬ３と同じであり、dir2である。よって、エッジセグメントＬ３、Ｌ５が矩形領域を構成するためには、下記の条件を満たす必要がある。
|L3.ave_x - L5.ave_x| < TH1
閾値ＴＨ１は、２つのエッジセグメントがほぼ同一の直線上に配置されるような小さい値であるものとする。ここで、エッジセグメントＬ３はエッジセグメントＬ５よりも左側に位置しており、エッジセグメントＬ３のＸ方向の重心座標とエッジセグメントＬ５のＸ方向の重心座標との差分は、閾値ＴＨ１よりも大きい。すなわち、上記条件は満たされず、エッジセグメントＬ３、Ｌ５は矩形領域を構成できない。よって、図２０（ａ）に示すグラフにおいて、エッジセグメントＬ３、Ｌ５に対して「０」が設定される。 (4) Edge segment L5
The direction of the edge segment L5 is the same as the edge segment L3 and is dir2. Therefore, in order for the edge segments L3 and L5 to form a rectangular area, the following conditions must be satisfied.
|L3.ave_x - L5.ave_x| < TH1
The threshold TH1 is a small value, chosen so that two edge segments satisfying the condition lie on substantially the same straight line. Here, the edge segment L3 is located on the left side of the edge segment L5, and the difference between the barycentric coordinate in the X direction of the edge segment L3 and that of the edge segment L5 is larger than the threshold TH1. That is, the above condition is not satisfied, and the edge segments L3 and L5 cannot form a rectangular area. Therefore, “0” is set for the edge segments L3 and L5 in the graph shown in FIG. 20A.

（５）エッジセグメントＬ６
エッジセグメントＬ６の方向は、dir0である。よって、エッジセグメントＬ３、Ｌ６が矩形領域を構成するためには、下記の条件を満たす必要がある。
L3.ave_x >= L6.max_x かつ L3.max_y <= L6.ave_y
ここで、エッジセグメントＬ３はエッジセグメントＬ６の右先端部よりも左側に位置しており、エッジセグメントＬ３のＸ方向の重心座標は、エッジセグメントＬ６のＸ方向の最大座標よりも小さい。すなわち、上記条件は満たされず、エッジセグメントＬ３、Ｌ６は矩形領域を構成できない。よって、図２０（ａ）に示すグラフにおいて、エッジセグメントＬ３、Ｌ６に対して「０」が設定される。 (5) Edge segment L6
The direction of the edge segment L6 is dir0. Therefore, in order for the edge segments L3 and L6 to form a rectangular area, the following conditions must be satisfied.
L3.ave_x >= L6.max_x and L3.max_y <= L6.ave_y
Here, the edge segment L3 is located on the left side of the right end of the edge segment L6, and the barycentric coordinate of the edge segment L3 in the X direction is smaller than the maximum coordinate of the edge segment L6 in the X direction. That is, the above condition is not satisfied, and the edge segments L3 and L6 cannot form a rectangular area. Therefore, “0” is set for the edge segments L3 and L6 in the graph shown in FIG. 20A.

（６）エッジセグメントＬ７〜Ｌ８
エッジセグメントＬ７の方向も、dir0である。よって、エッジセグメントＬ３、Ｌ７が矩形領域を構成するためには、下記の条件を満たす必要がある。
L3.ave_x >= L7.max_x かつ L3.max_y <= L7.ave_y
ここで、エッジセグメントＬ３はエッジセグメントＬ７の右先端部よりも右側に位置しており、エッジセグメントＬ３のＸ方向の重心座標は、エッジセグメントＬ７のＸ方向の最大座標よりも大きい。また、エッジセグメントＬ３はエッジセグメントＬ７よりも上側に位置しており、エッジセグメントＬ３のＹ方向の最大座標は、エッジセグメントＬ７のＹ方向の重心座標よりも小さい。すなわち、上記２つの条件は満たされており、エッジセグメントＬ３、Ｌ７は矩形領域を構成することができる。したがって、図２０（ａ）に示すグラフにおいて、エッジセグメントＬ３、Ｌ７に対して「１」が設定される。エッジセグメントＬ３、Ｌ８に対しても同様に「１」が設定される。 (6) Edge segments L7 to L8
The direction of the edge segment L7 is also dir0. Therefore, in order for the edge segments L3 and L7 to form a rectangular area, the following conditions must be satisfied.
L3.ave_x >= L7.max_x and L3.max_y <= L7.ave_y
Here, the edge segment L3 is located on the right side of the right end of the edge segment L7, and the barycentric coordinate of the edge segment L3 in the X direction is larger than the maximum coordinate of the edge segment L7 in the X direction. The edge segment L3 is also positioned above the edge segment L7, and the maximum coordinate in the Y direction of the edge segment L3 is smaller than the barycentric coordinate in the Y direction of the edge segment L7. That is, the above two conditions are satisfied, and the edge segments L3 and L7 can form a rectangular area. Therefore, “1” is set for the edge segments L3 and L7 in the graph shown in FIG. 20A. Similarly, “1” is set for the edge segments L3 and L8.

（７）エッジセグメントＬ９
エッジセグメントＬ９の方向は、dir6である。よって、エッジセグメントＬ３、Ｌ９が矩形領域を構成するためには、下記の条件を満たす必要がある。
L3.ave_x >= L9.ave_x
ここで、エッジセグメントＬ３はエッジセグメントＬ９の右側に位置しており、エッジセグメントＬ３のＸ方向の重心座標は、エッジセグメントＬ９のＸ方向の重心座標よりも大きい。すなわち、上記条件は満たされており、エッジセグメントＬ３、Ｌ９は矩形領域を構成することができる。したがって、図２０（ａ）に示すグラフにおいて、エッジセグメントＬ３、Ｌ９に対して「１」が設定される。 (7) Edge segment L9
The direction of the edge segment L9 is dir6. Therefore, in order for the edge segments L3 and L9 to form a rectangular area, the following conditions must be satisfied.
L3.ave_x >= L9.ave_x
Here, the edge segment L3 is located on the right side of the edge segment L9, and the barycentric coordinate in the X direction of the edge segment L3 is larger than the barycentric coordinate in the X direction of the edge segment L9. That is, the above condition is satisfied, and the edge segments L3 and L9 can form a rectangular area. Accordingly, in the graph shown in FIG. 20A, “1” is set for the edge segments L3 and L9.

同様に、取得部５は、すべてのエッジセグメントのペアについて矩形領域を構成し得るか判定する。この結果、図２０（ａ）に示すグラフが作成される。 Similarly, the acquisition unit 5 determines, for every pair of edge segments, whether the pair can form a rectangular area. As a result, the graph shown in FIG. 20A is created.
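The pairwise tests walked through in (1) to (7) can be sketched as a single predicate for a dir2 segment such as L3. This is only an illustrative sketch: the `Seg` record, the function name, the value of `TH1`, and the sample coordinates are assumptions made here and do not come from the figures. The Y axis is assumed to grow downward, as the "positioned below" wording in the text implies.

```python
from dataclasses import dataclass

TH1 = 3.0  # hypothetical collinearity threshold (pixels); the text only says "small"


@dataclass
class Seg:
    direction: int  # decomposition direction index (0, 2, 4 or 6 in this sketch)
    min_x: float
    max_x: float
    ave_x: float    # X coordinate of the centroid
    min_y: float
    max_y: float
    ave_y: float    # Y coordinate of the centroid


def compatible_with_dir2(a: Seg, b: Seg) -> bool:
    """Return True if dir2 segment `a` and segment `b` may bound one rectangle."""
    if b.direction == 4:   # condition used for L1, L2 and L4
        return a.ave_x >= b.max_x and a.min_y >= b.ave_y
    if b.direction == 2:   # condition used for L5 (same direction as `a`)
        return abs(a.ave_x - b.ave_x) < TH1
    if b.direction == 0:   # condition used for L6, L7 and L8
        return a.ave_x >= b.max_x and a.max_y <= b.ave_y
    if b.direction == 6:   # condition used for L9
        return a.ave_x >= b.ave_x
    return False


# Hypothetical coordinates chosen only to reproduce the 1/0 outcomes in the text.
L3 = Seg(2, 90, 90, 90, 20, 40, 30)
L1 = Seg(4, 10, 45, 27.5, 10, 10, 10)
L4 = Seg(4, 120, 180, 150, 10, 10, 10)
L5 = Seg(2, 200, 200, 200, 20, 78, 49)
L7 = Seg(0, 30, 55, 42.5, 70, 70, 70)
L9 = Seg(6, 10, 10, 10, 15, 70, 42.5)

row = {name: int(compatible_with_dir2(L3, seg))
       for name, seg in [("L1", L1), ("L4", L4), ("L5", L5), ("L7", L7), ("L9", L9)]}
print(row)
```

Each value in `row` corresponds to one entry of the L3 row of the compatibility graph; a full implementation would need the analogous conditions for segments of the other directions, which this excerpt does not spell out.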

続いて、取得部５は、上述のようにして作成したグラフからクリークを抽出する。すなわち、図２０（ａ）に示すグラフから極大完全部分グラフが抽出される。この結果、図２０（ｂ）に示す４つのクリークＣ１〜Ｃ４が抽出される。 Subsequently, the acquisition unit 5 extracts cliques from the graph created as described above. That is, maximal complete subgraphs are extracted from the graph shown in FIG. 20A. As a result, the four cliques C1 to C4 shown in FIG. 20B are extracted.

各クリークは、それぞれ１つの矩形領域候補を表す。例えば、クリークＣ１は、５個のエッジセグメントＬ１、Ｌ２、Ｌ３、Ｌ８、Ｌ９が外周（すなわち、辺）の構成要素として使用される矩形領域候補を表す。このように、この実施例では、４個の矩形領域候補が得られる。 Each clique represents one rectangular area candidate. For example, the clique C1 represents a rectangular area candidate in which five edge segments L1, L2, L3, L8, and L9 are used as components on the outer periphery (ie, sides). Thus, in this embodiment, four rectangular area candidates are obtained.

図２１（ａ）〜図２１（ｄ）は、それぞれ、クリークＣ１〜Ｃ４に相当する矩形領域候補を示している。例えば、図２１（ａ）において破線で表されている矩形領域候補ＲＥＣ１は、クリークＣ１の要素であるエッジセグメントＬ１、Ｌ２、Ｌ３、Ｌ８、Ｌ９によって形成されている。同様に、図２１（ｂ）〜図２１（ｄ）においてそれぞれ破線で表されている矩形領域候補ＲＥＣ２〜ＲＥＣ４は、クリークＣ２〜Ｃ４の要素によって形成されている。 FIGS. 21A to 21D show rectangular area candidates corresponding to the cliques C1 to C4, respectively. For example, the rectangular area candidate REC1 represented by a broken line in FIG. 21A is formed by edge segments L1, L2, L3, L8, and L9 that are elements of the clique C1. Similarly, rectangular area candidates REC2 to REC4 represented by broken lines in FIGS. 21 (b) to 21 (d) are formed by elements of cliques C2 to C4.

なお、この実施例では、取得部５は、要素として３以上のエッジセグメントを有するクリークのみを抽出する。すなわち、極大完全部分グラフであっても、要素数（すなわち、エッジセグメントの個数）が２以下である場合は、取得部５は、そのようなクリークを抽出しない。例えば、エッジセグメントＬ４、Ｌ７は、矩形領域を構成する可能性がある。ところが、２つのエッジセグメントで矩形領域の形状を特定することは困難である。したがって、エッジセグメントＬ４、Ｌ７は、クリークとして抽出されない。エッジセグメントＬ４、Ｌ８も同様に、クリークとして抽出されない。ただし、取得部５は、要素数が２であるクリークを抽出するようにしてもよい。 In this embodiment, the acquisition unit 5 extracts only cliques having three or more edge segments as elements. That is, even in the maximal complete subgraph, when the number of elements (that is, the number of edge segments) is 2 or less, the acquisition unit 5 does not extract such a clique. For example, the edge segments L4 and L7 may constitute a rectangular area. However, it is difficult to specify the shape of the rectangular area with two edge segments. Therefore, the edge segments L4 and L7 are not extracted as cliques. Similarly, the edge segments L4 and L8 are not extracted as cliques. However, the acquisition unit 5 may extract a clique having two elements.
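The clique extraction and the three-element minimum described above can be sketched as follows. The patent does not name an enumeration algorithm; the basic Bron-Kerbosch procedure is used here as one common choice. The adjacency is an assumption: it is rebuilt from the clique members and the two-element pairs named in the text, not copied from FIG. 20A.

```python
from itertools import combinations

# Groups of mutually compatible segments named in the text.
# Rebuilding the graph from them is an assumption, not FIG. 20A itself.
GROUPS = [
    {1, 2, 3, 8, 9},  # C1
    {4, 5, 6},        # C2
    {1, 2, 5, 6, 9},  # C3
    {1, 2, 3, 7, 9},  # C4
    {4, 7},           # pair mentioned as possible but too small
    {4, 8},           # pair mentioned as possible but too small
]

adj = {v: set() for v in range(1, 10)}
for group in GROUPS:
    for u, v in combinations(group, 2):
        adj[u].add(v)
        adj[v].add(u)


def maximal_cliques(r, p, x, out):
    """Basic Bron-Kerbosch enumeration of maximal cliques (no pivoting)."""
    if not p and not x:
        out.append(set(r))
        return
    for v in list(p):
        maximal_cliques(r | {v}, p & adj[v], x & adj[v], out)
        p = p - {v}
        x = x | {v}


found = []
maximal_cliques(set(), set(adj), set(), found)

# Keep only cliques with three or more edge segments, as in this embodiment.
candidates = sorted(sorted(c) for c in found if len(c) >= 3)
print(candidates)
# [[1, 2, 3, 7, 9], [1, 2, 3, 8, 9], [1, 2, 5, 6, 9], [4, 5, 6]]
```

With this reconstructed graph the enumeration finds six maximal cliques; dropping the two-element ones {L4, L7} and {L4, L8} leaves the four rectangular area candidates.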

さらに、取得部５は、上述のようにして得られる矩形領域候補ＲＥＣ１〜ＲＥＣ４について、矩形領域候補どうしの組合せが両立可能であるか否かを判定する。ここで、各エッジセグメントは、それぞれ１つの矩形領域候補に属するものであって、複数の矩形領域候補に共有されることはない。 Furthermore, the acquisition unit 5 determines whether the combinations of the rectangular area candidates can be compatible with each other for the rectangular area candidates REC1 to REC4 obtained as described above. Here, each edge segment belongs to one rectangular area candidate, and is not shared by a plurality of rectangular area candidates.

例えば、矩形領域候補ＲＥＣ１に属する要素はエッジセグメントＬ１、Ｌ２、Ｌ３、Ｌ８、Ｌ９であり、矩形領域候補ＲＥＣ３に属する要素はエッジセグメントＬ１、Ｌ２、Ｌ５、Ｌ６、Ｌ９である。すなわち、矩形領域候補ＲＥＣ１、ＲＥＣ３は、エッジセグメントＬ１、Ｌ２、Ｌ９を共有している。したがって、矩形領域候補ＲＥＣ１、ＲＥＣ３が両立することはない。同様に、矩形領域候補ＲＥＣ１、ＲＥＣ４、矩形領域候補ＲＥＣ２、ＲＥＣ３、矩形領域候補ＲＥＣ３、ＲＥＣ４もそれぞれ両立することはない。 For example, the elements belonging to the rectangular area candidate REC1 are edge segments L1, L2, L3, L8, and L9, and the elements belonging to the rectangular area candidate REC3 are edge segments L1, L2, L5, L6, and L9. That is, the rectangular area candidates REC1 and REC3 share the edge segments L1, L2, and L9. Therefore, the rectangular area candidates REC1 and REC3 are not compatible. Similarly, the rectangular area candidates REC1 and REC4, the rectangular area candidates REC2 and REC3, and the rectangular area candidates REC3 and REC4 are not compatible.

換言すれば、矩形領域候補ＲＥＣ１〜ＲＥＣ４においては、矩形領域候補ＲＥＣ１、ＲＥＣ２の組合せ、および矩形領域候補ＲＥＣ２、ＲＥＣ４の組合せのみが両立し得る。図２２（ａ）は、上記判定結果により作成されるグラフを示している。 In other words, in the rectangular area candidates REC1 to REC4, only the combination of the rectangular area candidates REC1 and REC2 and the combination of the rectangular area candidates REC2 and REC4 can be compatible. FIG. 22A shows a graph created based on the determination result.

続いて、取得部５は、上述のようにして作成したグラフからクリークを抽出する。すなわち、図２２（ａ）に示すグラフから極大完全部分グラフが抽出される。この結果、図２２（ｂ）に示す３つのクリークＣ１１〜Ｃ１３が抽出される。 Subsequently, the acquisition unit 5 extracts cliques from the graph created as described above. That is, maximal complete subgraphs are extracted from the graph shown in FIG. 22A. As a result, the three cliques C11 to C13 shown in FIG. 22B are extracted.

各クリークは、それぞれ１つの矩形領域候補の組合せを表す。例えば、クリークＣ１１は、２つの矩形領域候補ＲＥＣ１、ＲＥＣ２が存在する画像を表す。なお、この実施例では、要素が１つのみである部分グラフであっても、その要素が他のクリークに属していないときは、１つのクリークとして抽出される。例えば、クリークＣ１２の要素は、矩形領域候補ＲＥＣ３のみである。 Each clique represents one combination of rectangular area candidates. For example, the clique C11 represents an image in which the two rectangular area candidates REC1 and REC2 exist. In this embodiment, even a subgraph having only one element is extracted as a clique when that element does not belong to any other clique. For example, the only element of the clique C12 is the rectangular area candidate REC3.

ここで、例えば、矩形領域候補ＲＥＣ１は、クリークＣ１１に属する。このため、矩形領域候補ＲＥＣ１のみを要素として有する部分グラフは、極大グラフではない。よって、矩形領域候補ＲＥＣ１のみを要素として有する部分グラフは、クリークとして抽出されることはない。矩形領域候補ＲＥＣ２、ＲＥＣ４についても同様である。 Here, for example, the rectangular area candidate REC1 belongs to the clique C11. For this reason, a subgraph having only the rectangular area candidate REC1 as an element is not maximal. Therefore, a subgraph having only the rectangular area candidate REC1 as an element is not extracted as a clique. The same applies to the rectangular area candidates REC2 and REC4.
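The second-level grouping can be sketched with the candidate memberships given in the text. Two candidates are treated as compatible when they share no edge segment, and the maximal compatible sets are found by brute force over all subsets. Brute force is chosen here only because there are four candidates; it is not a method stated in the patent, and the variable names are illustrative.

```python
from itertools import combinations

# Edge segments belonging to each rectangular area candidate, as listed in the text.
CANDIDATES = {
    "REC1": {1, 2, 3, 8, 9},
    "REC2": {4, 5, 6},
    "REC3": {1, 2, 5, 6, 9},
    "REC4": {1, 2, 3, 7, 9},
}


def compatible(a, b):
    """Two candidates can coexist only if they share no edge segment."""
    return CANDIDATES[a].isdisjoint(CANDIDATES[b])


def is_clique(names):
    return all(compatible(a, b) for a, b in combinations(names, 2))


names = sorted(CANDIDATES)
cliques = [set(c)
           for k in range(1, len(names) + 1)
           for c in combinations(names, k)
           if is_clique(c)]

# A clique is maximal when it is not a proper subset of another clique.
maximal = [c for c in cliques if not any(c < other for other in cliques)]
combos = sorted(sorted(c) for c in maximal)
print(combos)
# [['REC1', 'REC2'], ['REC2', 'REC4'], ['REC3']]
```

The three results correspond to the cliques C11, C13, and C12 respectively; REC3 survives as a single-element clique because it is compatible with no other candidate.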

図２３（ａ）〜図２３（ｃ）は、それぞれ、クリークＣ１１〜Ｃ１３に相当する矩形領域候補の組合せを示している。図２３（ａ）は、エッジセグメントＬ１、Ｌ２、Ｌ３、Ｌ８、Ｌ９を要素として有する矩形領域候補ＲＥＣ１、及びエッジセグメントＬ４、Ｌ５、Ｌ６を要素として有する矩形領域候補ＲＥＣ２が存在する画像を示す。図２３（ｂ）は、エッジセグメントＬ１、Ｌ２、Ｌ５、Ｌ６、Ｌ９を要素として有する矩形領域候補ＲＥＣ３が存在する画像を示す。図２３（ｃ）は、エッジセグメントＬ４、Ｌ５、Ｌ６を要素として有する矩形領域候補ＲＥＣ２、及びエッジセグメントＬ１、Ｌ２、Ｌ３、Ｌ７、Ｌ９を要素として有する矩形領域候補ＲＥＣ４が存在する画像を示す。 FIGS. 23A to 23C show the combinations of rectangular area candidates corresponding to the cliques C11 to C13, respectively. FIG. 23A shows an image in which a rectangular area candidate REC1 having the edge segments L1, L2, L3, L8, and L9 as elements and a rectangular area candidate REC2 having the edge segments L4, L5, and L6 as elements exist. FIG. 23B shows an image in which a rectangular area candidate REC3 having the edge segments L1, L2, L5, L6, and L9 as elements exists. FIG. 23C shows an image in which a rectangular area candidate REC2 having the edge segments L4, L5, and L6 as elements and a rectangular area candidate REC4 having the edge segments L1, L2, L3, L7, and L9 as elements exist.

また、図２３（ａ）〜図２３（ｃ）においては、各矩形領域候補の形状を示している。例えば、矩形領域候補ＲＥＣ１のサイズは「８０×６０」である。この表記は、矩形領域候補ＲＥＣ１のＸ方向の長さが「８０」であり、Ｙ方向の長さが「６０」であることを表している。他の矩形領域候補ＲＥＣ２〜ＲＥＣ４についても同様である。 FIGS. 23A to 23C also show the shape of each rectangular area candidate. For example, the size of the rectangular area candidate REC1 is “80 × 60”. This notation indicates that the length of the rectangular area candidate REC1 is “80” in the X direction and “60” in the Y direction. The same applies to the other rectangular area candidates REC2 to REC4.

算出部６は、図２３（ａ）〜図２３（ｃ）に示す矩形領域候補の組合せのそれぞれについて、再現率Ｒおよび適合率Ｐを計算し、さらに再現率Ｒおよび適合率ＰからＦ値を計算する。 The calculation unit 6 calculates the recall R and the precision P for each of the combinations of rectangular area candidates shown in FIGS. 23A to 23C, and then calculates the F value from the recall R and the precision P.

再現率Ｒは、「矩形領域候補を構成するエッジセグメントの長さの和／矩形領域候補の周囲長の和」で算出される。また、適合率Ｐは、「矩形領域候補を構成するエッジセグメントの長さの和／抽出されている全てのエッジセグメントの長さの和」で算出される。そして、Ｆ値は、「２ＲＰ／（Ｒ＋Ｐ）」で算出される。なお、矩形領域候補ＲＥＣ１、ＲＥＣ２、ＲＥＣ３、ＲＥＣ４の周囲長は、図２３（ａ）〜図２３（ｃ）に示すように、それぞれ「２８０」「２８０」「４８０」「３２０」である。また、エッジ抽出部４によって抽出されているすべてのエッジセグメントＬ１〜Ｌ９の長さの和は、図１９に示すように、「４１１」である。 The recall R is calculated as “the sum of the lengths of the edge segments constituting the rectangular area candidates / the sum of the perimeters of the rectangular area candidates”. The precision P is calculated as “the sum of the lengths of the edge segments constituting the rectangular area candidates / the sum of the lengths of all the extracted edge segments”. The F value is calculated as “2RP / (R + P)”. Note that the perimeters of the rectangular area candidates REC1, REC2, REC3, and REC4 are “280”, “280”, “480”, and “320”, respectively, as shown in FIGS. 23A to 23C. Further, the sum of the lengths of all the edge segments L1 to L9 extracted by the edge extraction unit 4 is “411”, as shown in FIG. 19.
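The three definitions above can be written compactly as follows. The notation is introduced here for illustration and does not appear in the patent: C is the set of rectangular area candidates in one combination, S_c the edge segments used by candidate c, S the set of all extracted edge segments, ℓ(s) the length of segment s, and perim(c) the perimeter of candidate c.

```latex
R = \frac{\sum_{c \in C} \sum_{s \in S_c} \ell(s)}{\sum_{c \in C} \operatorname{perim}(c)},
\qquad
P = \frac{\sum_{c \in C} \sum_{s \in S_c} \ell(s)}{\sum_{s \in S} \ell(s)},
\qquad
F = \frac{2RP}{R + P}
```

F is the harmonic mean of R and P, so a combination scores well only when its candidates are both well covered by edge segments and account for most of the extracted edge length.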

図２３（ａ）に示す組合せについての再現率Ｒ、適合率Ｐ、Ｆ値は、以下の通り算出される。
再現率Ｒ＝{(35+25+20+55+55)+(60+58+78)}/(280+280)=0.689
適合率Ｐ＝{(35+25+20+55+55)+(60+58+78)}/411=0.939
Ｆ値＝2*0.689*0.939/(0.689+0.939)=0.795 The recall ratio R, precision ratio P, and F value for the combination shown in FIG. 23A are calculated as follows.
Recall rate R = {(35 + 25 + 20 + 55 + 55) + (60 + 58 + 78)} / (280 + 280) = 0.689
Precision P = {(35 + 25 + 20 + 55 + 55) + (60 + 58 + 78)} / 411 = 0.939
F value = 2 * 0.689 * 0.939 / (0.689 + 0.939) = 0.795

図２３（ｂ）に示す組合せについての再現率Ｒ、適合率Ｐ、Ｆ値は、以下の通り算出される。
再現率Ｒ＝(35+25+58+78+55)/480=0.523
適合率Ｐ＝(35+25+58+78+55)/411=0.611
Ｆ値＝2*0.523*0.611/(0.523+0.611)=0.564 The recall ratio R, precision ratio P, and F value for the combinations shown in FIG. 23B are calculated as follows.
Recall rate R = (35 + 25 + 58 + 78 + 55) / 480 = 0.523
Precision P = (35 + 25 + 58 + 78 + 55) / 411 = 0.611
F value = 2 * 0.523 * 0.611 / (0.523 + 0.611) = 0.564

図２３（ｃ）に示す組合せについての再現率Ｒ、適合率Ｐ、Ｆ値は、以下の通り算出される。
再現率Ｒ＝{(60+58+78)+(35+25+20+25+55)}/(280+320)=0.593
適合率Ｐ＝{(60+58+78)+(35+25+20+25+55)}/411=0.866
Ｆ値＝2*0.593*0.866/(0.593+0.866)=0.704 The recall rate R, the matching rate P, and the F value for the combinations shown in FIG. 23C are calculated as follows.
Recall rate R = {(60 + 58 + 78) + (35 + 25 + 20 + 25 + 55)} / (280 + 320) = 0.593
Precision P = {(60 + 58 + 78) + (35 + 25 + 20 + 25 + 55)} / 411 = 0.866
F value = 2 * 0.593 * 0.866 / (0.593 + 0.866) = 0.704

画像抽出部７は、図２３（ａ）〜図２３（ｃ）に示す矩形領域候補の組合せから、最もＦ値の高い組合せを特定する。この実施例では、図２３（ａ）に示す矩形領域候補の組合せについてのＦ値が最も高い。よって、画像抽出部７は、図２３（ａ）に示す矩形領域候補ＲＥＣ１、ＲＥＣ２に対応する画像を抽出して出力する。 The image extraction unit 7 identifies the combination having the highest F value among the combinations of rectangular area candidates shown in FIGS. 23A to 23C. In this embodiment, the F value for the combination of rectangular area candidates shown in FIG. 23A is the highest. Therefore, the image extraction unit 7 extracts and outputs the images corresponding to the rectangular area candidates REC1 and REC2 shown in FIG. 23A.
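The scoring and selection steps can be reproduced with a short script. The per-segment lengths below are an assumption: they are inferred from the order of the terms in the worked sums above and checked only against the stated total of 411, since FIG. 19 is not reproduced here. The perimeters are the values given in the text, and all names are illustrative.

```python
# Segment lengths inferred from the worked sums (assumption); they total 411.
LENGTH = {1: 35, 2: 25, 3: 20, 4: 60, 5: 58, 6: 78, 7: 25, 8: 55, 9: 55}

SEGMENTS = {
    "REC1": [1, 2, 3, 8, 9],
    "REC2": [4, 5, 6],
    "REC3": [1, 2, 5, 6, 9],
    "REC4": [1, 2, 3, 7, 9],
}
PERIMETER = {"REC1": 280, "REC2": 280, "REC3": 480, "REC4": 320}

# The three compatible combinations of FIGS. 23A to 23C.
COMBOS = [("REC1", "REC2"), ("REC3",), ("REC2", "REC4")]
TOTAL = sum(LENGTH.values())


def scores(combo):
    """Return (recall, precision, F value) for one combination of candidates."""
    used = sum(LENGTH[s] for name in combo for s in SEGMENTS[name])
    recall = used / sum(PERIMETER[name] for name in combo)
    precision = used / TOTAL
    f_value = 2 * recall * precision / (recall + precision)
    return recall, precision, f_value


for combo in COMBOS:
    r, p, f = scores(combo)
    print(combo, round(r, 3), round(p, 3), round(f, 3))

best = max(COMBOS, key=lambda combo: scores(combo)[2])
print("best:", best)
```

Because this sketch computes F from unrounded R and P, the value for FIG. 23B comes out near 0.563 rather than the 0.564 obtained in the text from rounded inputs; the ranking, and therefore the selected combination (REC1 and REC2), is unaffected.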

＜他の幾何学的図形の抽出＞
上述の実施形態では、画像認識装置１は、入力画像から矩形の画像領域を抽出する。ただし、画像認識装置１は、矩形の画像領域を抽出する構成に限定されるものではなく、他の幾何学的図形に対応する画像領域を抽出してもよい。以下では、入力画像から正三角形の画像領域を抽出する構成および方法を説明する。 <Extraction of other geometric figures>
In the above-described embodiment, the image recognition device 1 extracts a rectangular image region from the input image. However, the image recognition apparatus 1 is not limited to the configuration for extracting a rectangular image region, and may extract an image region corresponding to another geometric figure. Hereinafter, a configuration and a method for extracting an equilateral triangle image area from an input image will be described.

入力画像から正三角形の画像領域を抽出する方法は、図２に示すフローチャートの手順とほぼ同じである。ただし、正三角形の画像領域を抽出する場合、ステップＳ４およびステップＳ６の処理は、矩形領域を抽出する処理と異なる。 The method of extracting an equilateral triangle image region from the input image is almost the same as the procedure of the flowchart shown in FIG. 2. However, when an equilateral triangle image region is extracted, the processes in steps S4 and S6 differ from those used to extract a rectangular region.

正三角形の画像領域を抽出する場合、エッジ抽出部４は、図２４に示すように、２値化エッジ画像を２４方向dir0〜dir23に分解する。各分解方向に割り当てられる角度範囲は、それぞれ１５度である。 When extracting an equilateral triangle image region, the edge extraction unit 4 decomposes the binarized edge image into 24 directions dir0 to dir23, as shown in FIG. 24. The angle range assigned to each decomposition direction is 15 degrees.
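A direction in degrees can be mapped to one of the 24 decomposition directions as sketched below. The text fixes only the number of directions and the 15-degree width. The assumption made here, which FIG. 24 may or may not match, is that bin k is centered on 15k degrees, so that dir0, dir8, and dir16 fall on 0, 120, and 240 degrees. The function name is illustrative.

```python
import math

NUM_DIRS = 24
BIN_DEG = 360 / NUM_DIRS  # 15 degrees per decomposition direction


def direction_index(angle_deg: float) -> int:
    """Map an angle in degrees to dir0..dir23.

    Assumes bin k covers [15k - 7.5, 15k + 7.5) degrees; the centering is an
    assumption of this sketch, not something stated in the text.
    """
    shifted = (angle_deg % 360) + BIN_DEG / 2
    return int(math.floor(shifted / BIN_DEG)) % NUM_DIRS


print([direction_index(a) for a in (0, 120, 240)])  # [0, 8, 16]
```

Under this assumption, the three sides of an equilateral triangle whose base is horizontal fall into dir0, dir8, and dir16, which matches the three directions singled out in the pairwise conditions for triangles.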

取得部５は、エッジ抽出部４により得られるエッジセグメントを利用して構成される正三角形領域候補を抽出する。ここで、任意の２つのエッジセグメントが正三角形領域を構成できるか否かを判定する条件を説明する。以下の説明では、一方のエッジセグメントＬ１の方向がdir0であるものとする。 The acquiring unit 5 extracts equilateral triangle region candidates configured using the edge segments obtained by the edge extracting unit 4. Here, conditions for determining whether or not any two edge segments can form an equilateral triangle region will be described. In the following description, it is assumed that the direction of one edge segment L1 is dir0.

探索対象エッジセグメント（Ｌ２）が分解方向dir0から抽出された場合、下記の条件を満たすときは、取得部５は、エッジセグメントＬ１、Ｌ２が正三角形領域を構成する可能性があると判定する。この場合、エッジセグメントＬ１およびＬ２は、図２５に示すように、いずれも正三角形領域３１の下辺に対応する。
|L1.ave_y - L2.ave_y| < TH1 When the search target edge segment (L2) is extracted from the decomposition direction dir0, the acquisition unit 5 determines that the edge segments L1 and L2 may form an equilateral triangle region if the following condition is satisfied. In this case, the edge segments L1 and L2 both correspond to the lower side of the equilateral triangle region 31, as shown in FIG. 25.
|L1.ave_y - L2.ave_y| < TH1

探索対象エッジセグメント（Ｌ３）が分解方向dir8から抽出された場合、下記の条件を満たすときは、取得部５は、エッジセグメントＬ１、Ｌ３が正三角形領域を構成する可能性があると判定する。この場合、エッジセグメントＬ１およびＬ３は、図２６に示すように、それぞれ正三角形領域３１の下辺および右斜め上辺に対応する。「sqrt」は、平方根を表す。
L1.ave_x <= (L1.ave_y - L3.ave_y)/(sqrt(3)) + L3.ave_x When the search target edge segment (L3) is extracted from the decomposition direction dir8, the acquisition unit 5 determines that the edge segments L1 and L3 may form an equilateral triangle region if the following condition is satisfied. In this case, the edge segments L1 and L3 correspond to the lower side and the upper right side of the equilateral triangle region 31, respectively, as shown in FIG. 26. “sqrt” denotes the square root.
L1.ave_x <= (L1.ave_y - L3.ave_y) / sqrt(3) + L3.ave_x

探索対象エッジセグメント（Ｌ４）が分解方向dir16から抽出された場合、下記の条件を満たすときは、取得部５は、エッジセグメントＬ１、Ｌ４が正三角形領域を構成する可能性があると判定する。この場合、エッジセグメントＬ１およびＬ４は、図２７に示すように、それぞれ正三角形領域３１の下辺および左斜め上辺に対応する。
L1.min_x >= -(L1.ave_y - L4.ave_y)/(sqrt(3)) + L4.ave_x When the search target edge segment (L4) is extracted from the decomposition direction dir16, the acquisition unit 5 determines that the edge segments L1 and L4 may form an equilateral triangle region if the following condition is satisfied. In this case, the edge segments L1 and L4 correspond to the lower side and the upper left side of the equilateral triangle region 31, respectively, as shown in FIG. 27.
L1.min_x >= -(L1.ave_y - L4.ave_y) / sqrt(3) + L4.ave_x

探索対象エッジセグメントが分解方向dir0、dir8、dir16以外の分解方向から抽出された場合は、取得部５は、エッジセグメントＬ１およびその探索対象エッジセグメントが正三角形領域を構成する可能性が無いと判定する。なお、ここでは、図２５〜図２７を参照しながら、一方のエッジセグメントが正三角形領域の下辺である場合の判定条件を説明したが、一方のエッジセグメントが正三角形領域の右斜め上辺または左斜め上辺である場合の判定条件も、同様に得ることができる。 When the search target edge segment is extracted from a decomposition direction other than dir0, dir8, and dir16, the acquisition unit 5 determines that the edge segment L1 and the search target edge segment cannot form an equilateral triangle region. The determination conditions for the case where one edge segment is the lower side of the equilateral triangle region have been described here with reference to FIGS. 25 to 27; the conditions for the cases where one edge segment is the upper right side or the upper left side can be obtained in the same way.
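The three conditions for a base segment in dir0 can be collected into one predicate, sketched below. The `Seg` record, the function name, the value of `TH1`, and the sample coordinates are assumptions for illustration; the inequalities are transcribed from the text as written.

```python
import math
from dataclasses import dataclass

TH1 = 3.0  # hypothetical threshold; the text only requires a small value


@dataclass
class Seg:
    direction: int  # decomposition direction index, 0..23
    min_x: float
    ave_x: float    # X coordinate of the centroid
    ave_y: float    # Y coordinate of the centroid


def may_form_triangle(base: Seg, other: Seg) -> bool:
    """Pairwise test for a dir0 `base` segment, following the stated conditions."""
    dy = base.ave_y - other.ave_y
    if other.direction == 0:    # both segments on the lower side
        return abs(dy) < TH1
    if other.direction == 8:    # other segment on the upper right side
        return base.ave_x <= dy / math.sqrt(3) + other.ave_x
    if other.direction == 16:   # other segment on the upper left side
        return base.min_x >= -dy / math.sqrt(3) + other.ave_x
    return False                # any other direction cannot pair with the base


# Hypothetical fragments of a triangle with vertices (0, 100), (100, 100), (50, 13.4).
base = Seg(0, 20, 50, 100)
right = Seg(8, 50, 75, 56.7)
left = Seg(16, 0, 25, 56.7)
stray = Seg(5, 0, 40, 40)

print(may_form_triangle(base, right), may_form_triangle(base, left),
      may_form_triangle(base, stray))
```

The sample base is deliberately a fragment that stops short of the triangle's corners, so both slanted-side tests pass with a margin rather than sitting exactly on the boundary of the inequality.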

この後、画像認識装置１は、両立可能な正三角形領域候補の組合せを取得し、さらに各組み合わせについてＦ値を算出する。そして、画像認識装置１は、Ｆ値の最も高い組合せに属する１または複数の正三角形領域候補の画像を抽出する。 Thereafter, the image recognition apparatus 1 acquires compatible equilateral triangle region combinations, and calculates an F value for each combination. Then, the image recognition apparatus 1 extracts one or a plurality of equilateral triangle area candidate images belonging to the combination having the highest F value.

＜画像認識装置のハードウェア構成＞
図２８は、画像認識装置１を実現するためのコンピュータシステムのハードウェア構成を示す図である。コンピュータシステム１００は、図２８に示すように、ＣＰＵ１０１、メモリ１０２、記憶装置１０３、読み取り装置１０４、通信インタフェース１０６、および入出力装置１０７を備える。ＣＰＵ１０１、メモリ１０２、記憶装置１０３、読み取り装置１０４、通信インタフェース１０６、入出力装置１０７は、例えば、バス１０８を介して互いに接続されている。 <Hardware configuration of image recognition device>
FIG. 28 is a diagram illustrating a hardware configuration of a computer system for realizing the image recognition apparatus 1. As shown in FIG. 28, the computer system 100 includes a CPU 101, a memory 102, a storage device 103, a reading device 104, a communication interface 106, and an input / output device 107. The CPU 101, the memory 102, the storage device 103, the reading device 104, the communication interface 106, and the input / output device 107 are connected to each other via a bus 108, for example.

ＣＰＵ１０１は、メモリ１０２を利用して画像認識プログラムを実行することにより、エッジ抽出部４、取得部５、算出部６、画像抽出部７の一部または全部の機能を提供することができる。このとき、ＣＰＵ１０１は、図２に示すフローチャートの処理を記述したプログラムを実行することにより、エッジ抽出部４、取得部５、算出部６、画像抽出部７の機能を提供してもよい。 The CPU 101 can provide some or all of the functions of the edge extraction unit 4, the acquisition unit 5, the calculation unit 6, and the image extraction unit 7 by executing an image recognition program using the memory 102. In doing so, the CPU 101 may provide the functions of the edge extraction unit 4, the acquisition unit 5, the calculation unit 6, and the image extraction unit 7 by executing a program that describes the processing of the flowchart shown in FIG. 2.

メモリ１０２は、例えば半導体メモリであり、ＲＡＭ領域およびＲＯＭ領域を含んで構成される。記憶装置１０３は、例えばハードディスクであり、実施形態の画像認識に係わる画像認識プログラムを格納する。なお、記憶装置１０３は、フラッシュメモリ等の半導体メモリであってもよい。また、記憶装置１０３は、外部記録装置であってもよい。画像データ格納部２および抽出結果格納部８は、メモリ１０２および／または記憶装置１０３を利用して実現される。 The memory 102 is a semiconductor memory, for example, and includes a RAM area and a ROM area. The storage device 103 is, for example, a hard disk, and stores an image recognition program related to image recognition according to the embodiment. Note that the storage device 103 may be a semiconductor memory such as a flash memory. The storage device 103 may be an external recording device. The image data storage unit 2 and the extraction result storage unit 8 are realized using the memory 102 and / or the storage device 103.

読み取り装置１０４は、ＣＰＵ１０１の指示に従って着脱可能記録媒体１０５にアクセスする。着脱可能記録媒体１０５は、たとえば、半導体デバイス（ＵＳＢメモリ等）、磁気的作用により情報が入出力される媒体（磁気ディスク等）、光学的作用により情報が入出力される媒体（ＣＤ−ＲＯＭ、ＤＶＤ等）などにより実現される。通信インタフェース１０６は、ＣＰＵ１０１の指示に従ってネットワークを介してデータを送受信する。入出力装置１０７は、例えば、ユーザからの指示を受け付けるデバイス、デジタルカメラ等から画像データを受信するインタフェース、認識結果を出力するインタフェース等に相当する。 The reading device 104 accesses the removable recording medium 105 in accordance with instructions from the CPU 101. The removable recording medium 105 is realized by, for example, a semiconductor device (such as a USB memory), a medium to and from which information is input and output magnetically (such as a magnetic disk), or a medium to and from which information is input and output optically (such as a CD-ROM or DVD). The communication interface 106 transmits and receives data via a network in accordance with instructions from the CPU 101. The input/output device 107 corresponds to, for example, a device that receives instructions from a user, an interface that receives image data from a digital camera or the like, or an interface that outputs a recognition result.

実施形態の画像認識プログラムは、例えば、下記の形態でコンピュータシステム１００に提供される。
（１）記憶装置１０３に予めインストールされている。
（２）着脱可能記録媒体１０５により提供される。
（３）プログラムサーバ１１０から提供される。
なお、実施形態の画像認識方法は、複数のコンピュータを利用して上述の処理を提供してもよい。この場合、あるコンピュータが、上述の処理の一部を、ネットワークを介して他のコンピュータに依頼し、その処理結果を受け取るようにしてもよい。 The image recognition program of the embodiment is provided to the computer system 100 in the following form, for example.
(1) Installed in advance in the storage device 103.
(2) Provided by the removable recording medium 105.
(3) Provided from the program server 110.
Note that the image recognition method of the embodiment may provide the above-described processing using a plurality of computers. In this case, a certain computer may request a part of the above-described processing to another computer via a network and receive the processing result.

さらに、実施形態の画像認識装置の一部は、ハードウェアで実現してもよい。或いは、実施形態の画像認識装置は、ソフトウェアおよびハードウェアの組み合わせで実現してもよい。 Furthermore, a part of the image recognition apparatus according to the embodiment may be realized by hardware. Alternatively, the image recognition apparatus of the embodiment may be realized by a combination of software and hardware.

＜実施形態の効果＞
このように、実施形態の画像認識装置によれば、入力画像において抽出されるエッジセグメントを利用して、両立可能な、所定の幾何学的形状の対象物に対応する領域候補の組合せがすべて抽出される。よって、対象物が互いに重なり合っている場合、或いは、抽出されたエッジセグメントが途切れている場合であっても、正しい対象物（すなわち、実際の対象物に対応する領域）は、上述の領域候補の組合せの中に含まれている。よって、実施形態の画像認識装置によれば、入力画像を認識する際に、対象物が抽出されずに漏れてしまう可能性は低い。 <Effect of embodiment>
As described above, according to the image recognition apparatus of the embodiment, all compatible combinations of region candidates corresponding to objects of the predetermined geometric shape are extracted using the edge segments extracted from the input image. Therefore, even when objects overlap each other or an extracted edge segment is interrupted, the correct objects (that is, the regions corresponding to the actual objects) are included among these combinations of region candidates. Consequently, when the image recognition apparatus of the embodiment recognizes an input image, an object is unlikely to be missed.

また、実施形態の画像認識装置によれば、抽出すべき領域候補の組合せのそれぞれについて、エッジセグメントおよび領域候補に関する再現率および適合率に基づいて決まる評価値が算出される。そして、この評価値に従って抽出すべき領域が決定される。これにより、複数の領域候補の中から、正しい１または複数の抽出すべき領域を高い精度で特定できる。したがって、対象物が互いに重なり合っている場合、或いは、対象物と背景の色が類似している場合であっても、対象物の画像を精度よく抽出できる。 Further, according to the image recognition apparatus of the embodiment, for each combination of region candidates to be extracted, an evaluation value that is determined based on the recall rate and the matching rate regarding the edge segment and the region candidate is calculated. Then, an area to be extracted is determined according to the evaluation value. As a result, one or more correct areas to be extracted can be identified with high accuracy from among a plurality of area candidates. Therefore, even when the objects overlap each other or when the object and the background are similar in color, the image of the object can be accurately extracted.

以上記載した各実施例を含む実施形態に関し、さらに以下の付記を開示する。なお、本発明は、以下の付記に限定されるものではない。 The following additional notes are further disclosed with respect to the embodiments including the examples described above. Note that the present invention is not limited to the following supplementary notes.

（付記１）
画像からエッジセグメントを抽出するエッジ抽出部と、
前記エッジ抽出部により抽出されたエッジセグメントを利用して形成される予め決められた幾何学的な図形の候補の組合せを取得する取得部と、
前記取得部により取得された各組合せについて、前記図形の候補の外周が前記抽出されたエッジセグメントによってカバーされる程度を表す再現率、および、前記抽出されたエッジセグメントが前記図形の候補として利用される程度を表す適合率をそれぞれ算出する算出部と、
前記再現率および前記適合率に基づいて決まる評価値が最大となる組合せに含まれる図形の候補に対応する領域を抽出する画像抽出部と、
を有する画像認識装置。
（付記２）
前記算出部は、前記図形の候補に利用されるエッジセグメントの長さの和を、前記図形の候補の外周の長さの和で除算することで前記再現率を算出し、前記図形の候補に利用されるエッジセグメントの長さの和を、前記エッジ抽出部により抽出された全てのエッジセグメントの長さの和で除算することで前記適合率を算出する
ことを特徴とする付記１に記載の画像認識装置。
（付記３）
前記評価値は、前記再現率および前記適合率の調和平均である
ことを特徴とする付記１または２に記載の画像認識装置。
（付記４）
前記取得部は、前記エッジ抽出部により抽出されたエッジセグメントを利用して形成される幾何学的な図形の候補を抽出し、抽出した図形の候補どうしの組合せの中で、図形の候補が両立し得る組合せを取得する
ことを特徴とする付記１〜３のいずれか１つに記載の画像認識装置。
（付記５）
前記取得部は、抽出した図形の候補どうしの組合せの中で、包含関係にない図形の候補の組合せを取得する
ことを特徴とする付記４に記載の画像認識装置。
（付記６）
前記取得部は、抽出した図形の候補どうしの組合せの中で、前記エッジセグメントが複数の図形の候補により共有されることのない図形の候補の組合せを取得する
ことを特徴とする付記４に記載の画像認識装置。
（付記７）
画像からエッジセグメントを抽出するエッジ抽出部と、
前記エッジ抽出部により抽出されたエッジセグメントを利用して形成される予め決められた幾何学的な図形の候補の組合せを抽出し、前記組合せの中から、２以上の図形の候補が包含関係を有しておらず、且つ、２以上の図形の候補が同じエッジセグメント共有していない組合せを取得する取得部と、
前記取得部により取得された各組合せについて、前記抽出されたエッジセグメントに対する前記図形の候補の妥当性を表す評価値を算出する算出部と、
前記算出部により算出される評価値が最大となる組合せに含まれる図形の候補に対応する領域を抽出する画像抽出部と、
を有する画像認識装置。
（付記８）
コンピュータが、
画像からエッジセグメントを抽出し、
前記抽出されたエッジセグメントを利用して形成される予め決められた幾何学的な図形の候補の組合せを取得し、
前記各組合せについて、前記図形の候補の外周が前記抽出されたエッジセグメントによってカバーされる程度を表す再現率、および、前記抽出されたエッジセグメントが前記図形の候補として利用される程度を表す適合率をそれぞれ算出し、
前記再現率および前記適合率に基づいて決まる評価値が最大となる組合せに含まれる図形の候補に対応する領域を抽出する
ことを特徴とする画像認識方法。
（付記９）
画像からエッジセグメントを抽出し、
前記抽出されたエッジセグメントを利用して形成される予め決められた幾何学的な図形の候補の組合せを取得し、
前記各組合せについて、前記図形の候補の外周が前記抽出されたエッジセグメントによってカバーされる程度を表す再現率、および、前記抽出されたエッジセグメントが前記図形の候補として利用される程度を表す適合率をそれぞれ算出し、
前記再現率および前記適合率に基づいて決まる評価値が最大となる組合せに含まれる図形の候補に対応する領域を抽出する
処理をコンピュータに実行させるための画像認識プログラム。 (Appendix 1)
An image recognition apparatus comprising:
an edge extraction unit that extracts edge segments from an image;
an acquisition unit that acquires combinations of candidates for a predetermined geometric figure formed using the edge segments extracted by the edge extraction unit;
a calculation unit that calculates, for each combination acquired by the acquisition unit, a recall representing the extent to which the outer periphery of the figure candidates is covered by the extracted edge segments, and a precision representing the extent to which the extracted edge segments are used for the figure candidates; and
an image extraction unit that extracts a region corresponding to the figure candidates included in the combination for which an evaluation value determined based on the recall and the precision is maximized.
(Appendix 2)
The image recognition apparatus according to appendix 1, wherein the calculation unit calculates the recall by dividing the sum of the lengths of the edge segments used for the figure candidates by the sum of the perimeters of the figure candidates, and calculates the precision by dividing the sum of the lengths of the edge segments used for the figure candidates by the sum of the lengths of all the edge segments extracted by the edge extraction unit.
(Appendix 3)
The image recognition apparatus according to appendix 1 or 2, wherein the evaluation value is the harmonic mean of the recall and the precision.
(Appendix 4)
The image recognition apparatus according to any one of appendices 1 to 3, wherein the acquisition unit extracts geometric figure candidates formed using the edge segments extracted by the edge extraction unit, and acquires, from among the combinations of the extracted figure candidates, the combinations in which the figure candidates are compatible with one another.
(Appendix 5)
The image recognition apparatus according to appendix 4, wherein the acquisition unit acquires combinations of graphic candidates that are not in an inclusion relationship among the combinations of extracted graphic candidates.
(Appendix 6)
The image recognition apparatus according to appendix 4, wherein the acquisition unit acquires, from among the combinations of the extracted figure candidates, combinations of figure candidates in which no edge segment is shared by a plurality of figure candidates.
(Appendix 7)
An image recognition apparatus comprising:
an edge extraction unit that extracts edge segments from an image;
an acquisition unit that extracts combinations of candidates for a predetermined geometric figure formed using the edge segments extracted by the edge extraction unit, and acquires, from among those combinations, the combinations in which no two or more figure candidates have an inclusion relationship and no two or more figure candidates share the same edge segment;
a calculation unit that calculates, for each combination acquired by the acquisition unit, an evaluation value representing the validity of the figure candidates with respect to the extracted edge segments; and
an image extraction unit that extracts a region corresponding to the figure candidates included in the combination for which the evaluation value calculated by the calculation unit is maximized.
(Appendix 8)
An image recognition method in which a computer:
extracts edge segments from an image;
acquires combinations of candidates for a predetermined geometric figure formed using the extracted edge segments;
calculates, for each of the combinations, a recall representing the extent to which the outer periphery of the figure candidates is covered by the extracted edge segments, and a precision representing the extent to which the extracted edge segments are used for the figure candidates; and
extracts a region corresponding to the figure candidates included in the combination for which an evaluation value determined based on the recall and the precision is maximized.
(Appendix 9)
An image recognition program for causing a computer to execute a process comprising:
Extracting edge segments from an image;
Acquiring combinations of predetermined geometric graphic candidates formed using the extracted edge segments;
Calculating, for each of the combinations, a recall indicating the extent to which the outer periphery of the graphic candidates is covered by the extracted edge segments, and a precision indicating the extent to which the extracted edge segments are used for the graphic candidates; and
Extracting a region corresponding to the graphic candidates included in the combination having the maximum evaluation value determined based on the recall and the precision.
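The final selection step of Appendices 8 and 9 can be sketched as follows. The appendices state only that the evaluation value is determined based on the recall (reproduction rate) and the precision (matching rate); the harmonic mean (F-measure) used here is an assumption chosen for illustration, and the names `f_measure`, `best_combination`, and the sample scores are hypothetical.

```python
def f_measure(recall, precision):
    """Harmonic mean of recall and precision.

    Assumed evaluation value; any function of the two rates could be
    substituted here.
    """
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def best_combination(scored):
    """Return the (label, recall, precision) entry with the highest evaluation value."""
    return max(scored, key=lambda item: f_measure(item[1], item[2]))


# Hypothetical scores for three candidate combinations.
scored = [
    ("combination 1", 0.9, 0.4),   # covers the outlines well but leaves many edges unused
    ("combination 2", 0.75, 0.6),
    ("combination 3", 0.5, 0.8),   # uses most edges but covers the outlines poorly
]
best = best_combination(scored)
```

Under this assumed evaluation value, a combination that is strong on only one of the two rates scores lower than one that balances both, which is why "combination 2" is selected in the example.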

DESCRIPTION OF SYMBOLS
1 Image recognition apparatus
3 Processing unit
4 Edge extraction unit
5 Acquisition unit
6 Calculation unit
7 Image extraction unit

Claims

An image recognition apparatus comprising:
An edge extraction unit that extracts edge segments from an image;
An acquisition unit that acquires combinations of predetermined geometric graphic candidates formed using the edge segments extracted by the edge extraction unit;
A calculation unit that calculates, for each combination acquired by the acquisition unit, a recall representing the extent to which the outer periphery of the graphic candidates is covered by the extracted edge segments, and a precision representing the extent to which the extracted edge segments are used for the graphic candidates; and
An image extraction unit that extracts a region corresponding to the graphic candidates included in the combination having the maximum evaluation value determined based on the recall and the precision.

The image recognition apparatus, wherein the calculation unit calculates the recall by dividing the sum of the lengths of the edge segments used for the graphic candidates by the sum of the outer perimeters of the graphic candidates, and calculates the precision by dividing the sum of the lengths of the edge segments used for the graphic candidates by the sum of the lengths of all the edge segments extracted by the edge extraction unit.
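The two ratios defined in this claim can be written out as a short sketch. The function name `recall_precision`, its argument names, and the numeric values are hypothetical and serve only to illustrate the two divisions described above.

```python
def recall_precision(used_lengths, outer_perimeters, all_lengths):
    """Compute recall and precision for one combination of graphic candidates.

    used_lengths:     lengths of the edge segments used for the candidates
    outer_perimeters: outer perimeter of each candidate in the combination
    all_lengths:      lengths of all edge segments extracted from the image
    """
    used = sum(used_lengths)
    recall = used / sum(outer_perimeters)   # how much of the outlines is covered
    precision = used / sum(all_lengths)     # how much of the extracted edges is used
    return recall, precision


# Hypothetical example: one square candidate with perimeter 40, of which
# 30 units are covered by edge segments; 50 units of edges extracted in total.
recall, precision = recall_precision(
    used_lengths=[10, 10, 10],
    outer_perimeters=[40],
    all_lengths=[10, 10, 10, 15, 5],
)
```

Here the recall is 30 / 40 and the precision is 30 / 50: three quarters of the candidate's outline is supported by extracted edges, while 20 units of extracted edges are left unexplained by the combination.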

An image recognition apparatus comprising:
An edge extraction unit that extracts edge segments from an image;
An acquisition unit that extracts predetermined geometric graphic candidates formed using the edge segments extracted by the edge extraction unit, and acquires, from among the combinations of the extracted graphic candidates, combinations in which no two or more graphic candidates are in an inclusion relationship and no two or more graphic candidates share the same edge segment;
A calculation unit that calculates, for each combination acquired by the acquisition unit, an evaluation value indicating the validity of the graphic candidates with respect to the extracted edge segments; and
An image extraction unit that extracts a region corresponding to the graphic candidates included in the combination having the maximum evaluation value calculated by the calculation unit.

An image recognition method in which a computer:
Extracts edge segments from an image;
Acquires combinations of predetermined geometric graphic candidates formed using the extracted edge segments;
Calculates, for each of the combinations, a recall indicating the extent to which the outer periphery of the graphic candidates is covered by the extracted edge segments, and a precision indicating the extent to which the extracted edge segments are used for the graphic candidates; and
Extracts a region corresponding to the graphic candidates included in the combination having the maximum evaluation value determined based on the recall and the precision.

An image recognition program for causing a computer to execute a process comprising:
Extracting edge segments from an image;
Acquiring combinations of predetermined geometric graphic candidates formed using the extracted edge segments;
Calculating, for each of the combinations, a recall indicating the extent to which the outer periphery of the graphic candidates is covered by the extracted edge segments, and a precision indicating the extent to which the extracted edge segments are used for the graphic candidates; and
Extracting a region corresponding to the graphic candidates included in the combination having the maximum evaluation value determined based on the recall and the precision.