JP2016031548A - Image processing apparatus, imaging device, program, and recording medium

Info

Publication number: JP2016031548A
Application number: JP2014152100A
Authority: JP
Inventors: Daisaku Imaizumi (今泉 大作)
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2014-07-25
Filing date: 2014-07-25
Publication date: 2016-03-07
Anticipated expiration: 2034-07-25
Also published as: JP6326316B2

Abstract

PROBLEM TO BE SOLVED: To make it possible to reduce storage capacity and time and effort for a user. SOLUTION: An information processing part (300) of an imaging device (10) includes a specification part (307) and an extraction part (308). The specification part (307) specifies, in a photographed image (500), pixels determined not to be background color pixels and determined to be edge pixels. The extraction part (308) refers to the pixels specified by the specification part (307) to extract a photographic object area (510) from the photographed image (500). SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing apparatus that processes image data generated by imaging.

Patent Document 1 discloses a technique for extracting (identifying) an imaging target region (a region showing the imaging target) from a captured image when a sheet of paper placed on a pedestal such as a desk is photographed with the pedestal as the background.

Patent Document 1: JP 2007-201948 A
Patent Document 2: JP 2013-168119 A
Patent Document 3: JP 2013-4088 A

In the technique of Patent Document 1, the pedestal is first photographed with no imaging target (paper or the like) on it, and the imaging target is then placed on the pedestal and photographed. A difference image is then obtained between the captured image of the empty pedestal and the captured image of the imaging target placed on the pedestal, and contour information (the contour of the imaging target region) is extracted from the difference image.

The technique of Patent Document 1 therefore requires the user to photograph the empty pedestal before photographing the imaging target, which doubles the user's work. Moreover, when the imaging target is re-photographed from a different position or with a different layout, the empty pedestal must be photographed again, which is troublesome.

The technique of Patent Document 1 also requires storage capacity to hold two images: the captured image of the empty pedestal and the captured image of the imaging target placed on the pedestal.

An object of the present invention is to provide an image processing apparatus that requires less storage capacity and less user effort than the prior art.

An image processing apparatus according to one aspect of the present invention includes: a detection unit that detects, from a captured image, candidate vertices that are candidates for the vertices of an imaging target region; a setting unit that treats the area of the captured image other than the area formed by connecting the candidate vertices with lines as a background region, and sets a condition on gradation values for colors identical or similar to a representative color of the background region; a first determination unit that determines, for each pixel of the captured image, whether the pixel is a background color pixel satisfying the condition; a second determination unit that determines, for each pixel, whether the pixel is an edge pixel indicating an edge; a specifying unit that specifies those pixels of the captured image that are determined not to be background color pixels and determined to be edge pixels; and an extraction unit that extracts the imaging target region from the captured image by referring to the pixels specified by the specifying unit.

According to one aspect of the present invention, the imaging target portion can be extracted with high accuracy without a previously captured image of the background (a pedestal such as a desk) with no imaging target on it, unlike Patent Document 1. Photographing therefore need not be performed twice, and both the storage capacity and the user's effort are reduced compared with Patent Document 1.

FIG. 1 is a block diagram showing a schematic configuration of an imaging device according to an embodiment of the present invention.
FIG. 2 is a diagram schematically showing a captured image taken by the imaging device of FIG. 1.
FIG. 3 is a flowchart showing the processing of the four-corner detection unit.
FIG. 4 is a diagram showing an example of a moving average filter.
FIG. 5 is a diagram showing the eight sub-regions obtained by dividing the background region included in the captured image.
FIG. 6 is a table showing, for each sub-region of FIG. 5, the statistics of the R-channel gradation values in the reference range.
FIG. 7 is a table showing, for each sub-region of FIG. 5, the statistics of the R-channel gradation values, sorted so that the gradation averages are in ascending order.
FIG. 8 shows the background color images generated for each of the R, G, and B channels.
FIG. 9 is a diagram showing an example of a Gaussian filter.
FIG. 10 is a diagram showing an example of a Sobel filter.
FIG. 11 is a flowchart showing the processing (Canny method) performed by the edge detection unit.
FIG. 12 shows the edge images generated for each of the R, G, and B channels.
FIG. 13 is a diagram showing a non-background-color/edge image, which indicates the pixels determined not to be background color pixels and determined to be edge pixels.
FIG. 14 is a schematic diagram showing one of the pixel columns searched in each of the first to fourth search processes.
FIG. 15 is a diagram showing a mask image generated by the extraction unit.
FIG. 16 is a diagram showing an extracted image generated by the extraction unit.
FIG. 17 is a diagram showing four sub-regions of the background region of the captured image that are subject to statistical processing.
FIG. 18 is a diagram showing seven sub-regions of the background region of the captured image that are subject to statistical processing.

[Embodiment 1]
(Overall configuration of the imaging device)
The configuration of an imaging device according to an embodiment of the present invention is described in detail below. FIG. 1 is a block diagram showing a schematic configuration of the imaging device according to this embodiment.

The imaging device 10 of this embodiment is a portable terminal device with a camera function for capturing images, for example a tablet. As shown in FIG. 1, the imaging device 10 includes at least a touch panel unit 100, an imaging unit 200, an information processing unit 300, and a storage unit 400.

Although not shown, the imaging device 10 also includes the various hardware that portable terminal devices normally have, such as a communication unit, an Internet communication unit, a voice operation unit, an audio output unit, a television broadcast receiver, and GPS (Global Positioning System).

The touch panel unit 100 is a device in which a display unit that displays images and an input unit that generates input signals in response to the user's touch operations are integrated; a known touch panel can be used. That is, the display unit is an LCD (Liquid Crystal Display) or the like, and the input unit is a capacitance sensor or the like integrated with the display unit. Although the imaging device 10 of this embodiment includes the touch panel unit 100 as its image display means, the image display means is not limited to the touch panel unit 100. The display unit and the input unit of the imaging device 10 may be separate, in which case the display unit is an LCD or the like and the input unit is a keyboard or the like.

The imaging unit 200 is an image sensor (camera) such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal-Oxide Semiconductor) sensor. In response to an imaging instruction signal input by the user, the imaging unit 200 performs imaging and generates captured image data representing the captured image. The captured image data is color image data that gives, for each pixel, the gradation value (0 to 255 for 8 bits) of each of the R (red), G (green), and B (blue) channels.

The information processing unit (image processing apparatus) 300 is a block that controls the operation of each piece of hardware in the imaging device 10, and is implemented by, for example, a microcomputer or microprocessor including a CPU (Central Processing Unit). The information processing unit 300 reads the various information and control programs stored in the storage unit 400, performs information processing, and sends control signals and information to each piece of hardware in the imaging device 10 based on the results, thereby controlling the operation of that hardware.

The information processing unit 300 is also designed to switch to a document imaging mode in response to an input instruction from the user. The document imaging mode is a mode in which document image processing is applied to the captured image data obtained whenever the imaging unit 200 captures an image. Document image processing is image processing that presumes the imaging target is a document (a book, a manuscript, or the like) placed on a desk.

That is, when the information processing unit 300 is in the document imaging mode and the imaging unit 200 generates captured image data, the information processing unit 300 applies document image processing to the captured image data and then saves it in the storage unit 400. The document image processing performed by the information processing unit 300 is described in detail later.

The storage unit 400 is a storage area that holds the various application programs, OS program, control programs, and image processing programs executed by the information processing unit 300, the various data (setting values, tables, and so on) read when these programs run, and the captured image data described above. Various known storage means can be used as the storage unit 400, for example read-only memory (ROM), random access memory (RAM), flash memory, EPROM (Erasable Programmable ROM), EEPROM (registered trademark) (Electrically EPROM), or a hard disk drive (HDD). Data handled by the information processing unit 300 and data being processed are temporarily stored in the working memory of the storage unit 400.

(Information processing unit 300)
Next, each block included in the information processing unit 300 is described in detail. As shown in FIG. 1, the information processing unit 300 includes a four-corner detection unit 301, a smoothing processing unit 302, a statistical processing unit 303, a condition setting unit 304 (setting unit), a background color determination unit 305 (first determination unit), an edge detection unit 306 (second determination unit), a specifying unit 307, an extraction unit 308, a distortion correction unit 309, and a format processing unit 310. These units perform the document image processing described above.

That is, when the document imaging mode is set and the imaging unit 200 captures an image, the four-corner detection unit 301, smoothing processing unit 302, statistical processing unit 303, condition setting unit 304, background color determination unit 305, edge detection unit 306, specifying unit 307, extraction unit 308, distortion correction unit 309, and format processing unit 310 process the resulting captured image data in this order.

The processing from the four-corner detection unit 301 through the extraction unit 308 extracts (identifies) the imaging target region (the region showing the imaging target) included in the captured image, the distortion correction unit 309 applies distortion correction to that region, and the format processing unit 310 creates an image file showing that region. Each block included in the information processing unit 300 is described in turn below.

(Four-corner detection unit 301)
The captured image of this embodiment is obtained by photographing a two-page spread of a book placed on a desk as the imaging target, with the desk included as the background. That is, as shown in FIG. 2, the captured image 500 contains an imaging target region (book) 510, which is the image region showing the imaging target, and a background region (desk) 550.

The four-corner detection unit 301 is a block that regards the imaging target region 510 included in the captured image 500 as a substantially rectangular document and detects candidate vertices, that is, candidates for the vertices of the imaging target region 510. Because a substantially rectangular shape normally has four vertices, four candidate vertices are detected.

Various known methods exist for detecting the four candidate vertices of the substantially rectangular imaging target region 510, and the four-corner detection unit 301 may use any of them. This embodiment uses the method described in paragraphs [0056] to [0106] of Patent Document 2 (JP 2013-168119 A). An outline of that method follows.

FIG. 3 is a flowchart showing the processing of the four-corner detection unit 301. First, the four-corner detection unit 301 extracts the edge pixels that make up edges in the captured image 500 and generates an edge image (binary image) in which edge pixels are "1" and non-edge pixels are "0" (S1). Various known methods exist for extracting edge pixels, and the four-corner detection unit 301 may use any of them. In this embodiment, the captured image is converted to luminance data, and edges are extracted from the luminance data by the Canny method.

In this embodiment, the edge detection unit 306 also uses the Canny method, so the details of Canny edge extraction are given later, together with the description of the edge detection unit 306.

The four-corner detection unit 301 may also apply a morphological transformation, such as dilation and erosion, to the edge image generated by the Canny method.

Next, the four-corner detection unit 301 performs labeling: each group of edge pixels that are adjacent (connected) in four or eight directions is treated as a connected edge region, and each connected edge region is given a different label (S2).
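The labeling in S2 can be illustrated with a breadth-first flood fill. This is a minimal sketch, not the method of Patent Document 2: the function name `label_connected_edges`, the list-of-lists image representation, and the `connectivity` parameter are all assumptions made for illustration.

```python
from collections import deque

def label_connected_edges(edge, connectivity=8):
    """Label groups of connected edge pixels (value 1) in a binary image.

    Returns (labels, count): labels has 0 for non-edge pixels and 1..count
    for the connected edge regions. Hypothetical helper, for illustration.
    """
    if connectivity == 8:
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]
    else:  # 4-connectivity: up, down, left, right only
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    h, w = len(edge), len(edge[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if edge[y][x] != 1 or labels[y][x]:
                continue
            count += 1                      # start a new connected edge region
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and edge[ny][nx] == 1 and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

With 8-connectivity, diagonally touching edge pixels fall into the same region; with 4-connectivity they are labeled separately, which is why the choice between the two changes the number of connected edge regions.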

The four-corner detection unit 301 then extracts, from the labeled connected edge regions, the connected edge region that contains the boundary between the imaging target region 510 and the background region 550 as the feature region (S3). An imaging target is normally photographed so that its center lies near the center of the captured image and it occupies most of the image. The four-corner detection unit 301 therefore extracts a connected edge region that satisfies the following Condition A as the feature region.

Condition A: In the captured image, let the upper-left corner be the origin, the rightward (width) direction be the x axis, and the downward (height) direction be the y axis; let Xmax be the x coordinate of the right edge of the captured image and Ymax be the y coordinate of its bottom edge. The condition is satisfied when the length of the connected edge region in the width direction is at least 1/4 of the image width (Xmax), its length in the height direction is at least 1/4 of the image height (Ymax), the x coordinate of its center point is at least Xmax/4 and at most 3×Xmax/4, and the y coordinate of its center point is at least Ymax/4 and at most 3×Ymax/4.

One way to obtain the coordinates of the center point of a connected edge region is to take the average of the coordinate values of all pixels in the region.
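Condition A and the centroid rule can be expressed as a short check. The sketch below is illustrative only: the function name `satisfies_condition_a` is hypothetical, and measuring the region's width and height as the extent of its bounding box is an assumption, since the text does not say how the lengths are measured.

```python
def satisfies_condition_a(pixels, x_max, y_max):
    """Check Condition A for one connected edge region.

    pixels: list of (x, y) coordinates of the region's edge pixels.
    x_max, y_max: x coordinate of the right edge and y coordinate of the
    bottom edge of the captured image (origin at the upper-left corner).
    """
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    # Assumption: "length" means the bounding-box extent of the region.
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    # Center point = average of the coordinates of all pixels in the region.
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    return (width >= x_max / 4
            and height >= y_max / 4
            and x_max / 4 <= cx <= 3 * x_max / 4
            and y_max / 4 <= cy <= 3 * y_max / 4)
```

A rectangular outline that is large and roughly centered passes the check, while a small speck or a long edge hugging one side of the image does not.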

The four-corner detection unit 301 then identifies, within the feature region, the straight lines that form the top, left, right, and bottom sides of the substantially rectangular region defined by the four candidate vertices (straight-line extraction, S4).

The top side is likely to lie in the upper half of the captured image (y coordinates from 0 to Ymax/2) and to be parallel to the width direction of the image. The left side is likely to lie in the left half of the captured image (x coordinates from 0 to Xmax/2) and to be parallel to the height direction. The right side is likely to lie in the right half of the captured image (x coordinates from Xmax/2 to Xmax) and to be parallel to the height direction. The bottom side is likely to lie in the lower half of the captured image (y coordinates from Ymax/2 to Ymax) and to be parallel to the width direction.

Therefore, for each of the top, left, right, and bottom sides, the four-corner detection unit 301 searches the range where that side is likely to exist and extracts from the feature region the line-segment-like group of edge pixels that has the largest number of edge pixels running in the relevant direction and is at least a predetermined length. The equation of the approximate straight line through each extracted edge pixel group is determined by the least squares method. The resulting approximate straight lines are the top, left, right, and bottom sides.

When the straight-line extraction for the four sides in S4 is complete, the four-corner detection unit 301 obtains the intersection coordinates from the line equations obtained in S4 (S5). Specifically, it takes the intersection of the left and top lines as the upper-left vertex, the intersection of the top and right lines as the upper-right vertex, the intersection of the right and bottom lines as the lower-right vertex, and the intersection of the bottom and left lines as the lower-left vertex. The four-corner detection unit 301 then treats these four vertex coordinates as the coordinates of the candidate vertices and saves extraction result information indicating them in the storage unit 400.
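The least-squares fit of S4 and the intersection step of S5 can be sketched as follows. This is an assumption-laden illustration rather than the procedure of Patent Document 2: the function names are hypothetical, and the choice to model the near-horizontal sides as y = a·x + b and the near-vertical sides as x = c·y + d (so that a vertical side never needs an infinite slope) is a design choice made here, not something the text specifies.

```python
def fit_line(us, vs):
    """Least-squares fit of v = slope * u + intercept.

    For top/bottom sides pass (xs, ys); for left/right sides pass (ys, xs).
    Raises ZeroDivisionError if all u values are identical.
    """
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    s_uu = sum((u - mean_u) ** 2 for u in us)
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    slope = s_uv / s_uu
    return slope, mean_v - slope * mean_u

def intersect(horizontal, vertical):
    """Intersection of y = a*x + b (horizontal) and x = c*y + d (vertical)."""
    a, b = horizontal
    c, d = vertical
    # Substitute y into x = c*y + d:  x = c*(a*x + b) + d
    x = (c * b + d) / (1 - a * c)
    return x, a * x + b

# Example: a slightly tilted top side and left side (made-up sample points).
top = fit_line([0, 50, 100], [10, 15, 20])       # y = 0.1*x + 10
left = fit_line([0, 50, 100], [20, 22.5, 25])    # x = 0.05*y + 20
upper_left = intersect(top, left)
```

Calling `intersect` for each of the four (horizontal, vertical) pairs gives the upper-left, upper-right, lower-right, and lower-left candidate vertices.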

The above is an outline of each step of FIG. 3; the specific details of each step are as described in Patent Document 2. The four-corner detection unit 301 may also obtain the coordinates of the four candidate vertices by a known method other than that of Patent Document 2.

(Smoothing processing unit 302)
Next, the smoothing processing unit 302 is described. In this embodiment, the statistical processing unit 303 (described later) obtains statistics of the gradation values of the background region 550 included in the captured image 500. These statistics are the mean and the standard deviation. If noise that depends on the characteristics of the image sensor is present in the captured image data (input image data), the standard deviation can fluctuate widely because of that noise. Examples of such noise include dark noise, thermal current noise, and noise caused by output variations of the image sensor. Under the influence of such noise, different locations in the background region that should have nearly the same gradation value can show very different gradation values, and the error in the statistics obtained by the statistical processing unit 303 can become large.

To reduce or remove the influence of such noise, the smoothing processing unit 302, which is located upstream of the statistical processing unit 303, performs smoothing on each of the R, G, and B channels.

Spatial filters that perform a convolution, such as the moving average filter and the weighted average filter, are commonly used for smoothing. FIG. 4 shows an example of a moving average filter. Applying this filter averages the gradation value (pixel value) of the pixel of interest with those of its surrounding pixels, so that the variation in gradation values becomes gentler; that is, the image is smoothed.
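A moving average filter of the kind described above can be sketched in a few lines. The kernel of FIG. 4 is not reproduced in this text, so the 3×3 uniform kernel used below is an assumption, as are the function name `moving_average_3x3` and the border handling (coordinates outside the image are clamped to the nearest edge pixel).

```python
def moving_average_3x3(channel):
    """Smooth one channel (list of rows of gradation values) with a 3x3
    uniform kernel. Border pixels reuse the nearest edge value (assumption)."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)   # clamp to image bounds
                    xx = min(max(x + dx, 0), w - 1)
                    total += channel[yy][xx]
            out[y][x] = total / 9                     # equal weight for all 9 pixels
    return out

# An isolated noise spike of 90 in a flat region is reduced to 10.
noisy = [[0, 0, 0],
         [0, 90, 0],
         [0, 0, 0]]
smoothed = moving_average_3x3(noisy)
```

The same function would be applied separately to the R, G, and B channels, matching the per-channel smoothing described for the smoothing processing unit 302.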

A median filter may be used instead of the smoothing described above.

(Statistical processing unit 303)
The statistical processing unit 303 is a block that performs statistical processing on the gradation values of the background region 550 included in the captured image 500. Its processing is described in detail below.

The statistical processing unit 303 refers to the extraction result information stored in the storage unit 400 (the information indicating the result of the four-corner detection unit 301) and identifies the background region 550 in the captured image smoothed by the smoothing processing unit 302. The statistical processing unit 303 then divides the background region 550 into a plurality of sub-regions and, for each sub-region, sets a reference range of 64×64 pixels centered on the center pixel of that sub-region. For each sub-region, the statistical processing unit 303 obtains statistics of the gradation values in the reference range for each of the R, G, and B channels. The statistics are the average of the gradation values of the pixels in the reference range (the gradation average) and the standard deviation of those gradation values.

More specifically, the statistical processing unit 303 treats the area between the rectangular region formed by connecting the four candidate vertices detected by the four-corner detection unit 301 with straight lines (the broken lines in FIG. 5) and the sides of the captured image 500 as the background region 550. The statistical processing unit 303 further divides the background region 550 into eight sub-regions delimited by the extensions of the sides of the rectangular region (the broken lines in FIG. 5), and obtains the statistics of the reference range for each sub-region.

That is, as shown in FIG. 5, the background region 550 is divided into sub-regions A to H. For each sub-region, the gradation average and the standard deviation of the reference range are obtained for each of the R, G, and B channels. FIG. 6 shows the R-channel gradation average and standard deviation for each of sub-regions A to H of FIG. 5. Although not shown, the gradation average and standard deviation of each of sub-regions A to H are obtained in the same way for the G and B channels.
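The per-sub-region statistics can be sketched as a mean and standard deviation over a 64×64 window. The function name `reference_range_stats` is hypothetical, the caller is assumed to supply the center pixel of the sub-region, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not say which is used. The window is also clipped at the image border, which the text does not discuss.

```python
from statistics import mean, pstdev

def reference_range_stats(channel, center_x, center_y, size=64):
    """Gradation average and standard deviation of one channel inside a
    size x size reference range centered on (center_x, center_y)."""
    half = size // 2
    h, w = len(channel), len(channel[0])
    # Clip the window to the image (assumption: no padding is applied).
    y0, y1 = max(center_y - half, 0), min(center_y + half, h)
    x0, x1 = max(center_x - half, 0), min(center_x + half, w)
    values = [channel[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return mean(values), pstdev(values)

# Synthetic channel alternating between 50 and 70 (made-up test data).
checker = [[50 if (x + y) % 2 == 0 else 70 for x in range(100)]
           for y in range(100)]
avg, sigma = reference_range_stats(checker, 50, 50)
```

Running this once per sub-region and per channel yields a table of the kind shown in FIG. 6: one (gradation average, standard deviation) pair for each of sub-regions A to H.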

(Condition setting unit 304)
The condition setting unit 304 is a block that, based on the results of the statistical processing by the statistical processing unit 303, sets a numerical range (condition) of gradation values that are the same as or similar to the representative color of the background region 550. For each of the R, G, and B channels, this range is (mean of the gradation average values) ± (n × mean of the standard deviations), where n is a predetermined coefficient.

For example, sorting the R-channel gradation average values of FIG. 6 in ascending order gives FIG. 7. Excluding the sub-regions whose gradation average values are the maximum and the minimum, the gradation average values and standard deviations of the remaining six sub-regions are used to calculate the mean of the R-channel gradation average values (Rgra.avg.) and the mean of the standard deviations (Rσavg.).

That is,
Rgra.avg. = (56.15 + 57.71 + 59.14 + 61.27 + 68.53 + 76.34) / 6 = 63.19
and the mean of the R-channel standard deviations, Rσavg., is
Rσavg. = (1.41 + 1.09 + 1.19 + 0.98 + 1.22 + 1.21) / 6 = 1.18
The numerical range is set based on these values.

Specifically, for the R channel, the numerical range Rcand of gradation values that are the same as or similar to the representative color is
(Rgra.avg. − n × Rσavg.) ≤ Rcand ≤ (Rgra.avg. + n × Rσavg.)
and, when the predetermined coefficient n is set to 3,
59.6 ≤ Rcand ≤ 66.7

The bounds of Rcand may of course be exclusive rather than inclusive. That is,
(Rgra.avg. − n × Rσavg.) < Rcand < (Rgra.avg. + n × Rσavg.)
may be used in place of
(Rgra.avg. − n × Rσavg.) ≤ Rcand ≤ (Rgra.avg. + n × Rσavg.).

The same processing is performed for the G and B channels, so that Gcand and Bcand are obtained in addition to Rcand.
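The range calculation above can be reproduced with a short sketch. The function name `background_range` is hypothetical. The six middle (mean, standard deviation) pairs are the R-channel values quoted in the text, paired in the order listed, which is an assumption. The first and last pairs are made-up placeholders for the two excluded sub-regions, whose actual values are not given here.

```python
def background_range(stats, n=3.0):
    """Return (low, high) for one channel.

    stats: list of (gradation_average, standard_deviation), one pair per sub-region.
    """
    ordered = sorted(stats, key=lambda s: s[0])
    kept = ordered[1:-1]  # drop the sub-regions with the minimum and maximum average
    mean_avg = sum(m for m, _ in kept) / len(kept)
    std_avg = sum(s for _, s in kept) / len(kept)
    return mean_avg - n * std_avg, mean_avg + n * std_avg

# Middle six pairs: values quoted in the text. First and last pairs: hypothetical
# placeholders standing in for the excluded minimum and maximum sub-regions.
r_stats = [(40.00, 2.00), (56.15, 1.41), (57.71, 1.09), (59.14, 1.19),
           (61.27, 0.98), (68.53, 1.22), (76.34, 1.21), (120.00, 2.00)]
low, high = background_range(r_stats, n=3)
```

With n = 3 the bounds come out at about 59.6 and 66.7, in line with the range stated for Rcand. The same function would be called with the G and B statistics to obtain Gcand and Bcand.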

The average used in the statistical processing unit 303 and the condition setting unit 304 need not be a simple arithmetic mean; a weighted average or the like may be used instead. In the present embodiment the coefficient n is 3, but it is not limited to this value and may be a non-integer as well as an integer. The statistical processing unit 303 and the condition setting unit 304 may also use the median in place of the average.

(Background color determination unit 305)
The background color determination unit 305 determines, for each of the R, G, and B channels, whether each pixel of the captured image 500 is a background color pixel, that is, a pixel satisfying the numerical range set by the condition setting unit 304. In the R channel, a pixel whose gradation value falls within Rcand is determined to be a background color pixel, and any other pixel is determined not to be a background color pixel. The G and B channels are handled in the same way using Gcand and Bcand, respectively.

Further, the background color determination unit 305 generates, for each of the R, G, and B channels, a background color image indicating the result of this determination. The background color image is a binary image in which background color pixels are set to "1" and all other pixels to "0". FIG. 8 shows the background color images generated for the R, G, and B channels.
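A minimal sketch of the per-channel background color image, assuming a channel stored as a list of rows and inclusive bounds. The function name is hypothetical.

```python
def background_color_image(channel, low, high):
    """Binary image: 1 where the gradation value lies in [low, high], otherwise 0."""
    return [[1 if low <= v <= high else 0 for v in row] for row in channel]
```

It would be called three times, once per channel, with the Rcand, Gcand, and Bcand bounds.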

Within the background region 550 of the captured image 500, white noise or the like caused by the characteristics of the image sensor may appear as isolated points, and such isolated points may fail to be determined as background color pixels.

The background color determination unit 305 may therefore apply a morphological transformation such as dilation and erosion to the background color image to convert these isolated points into background color pixels. This yields a background color image from which the isolated points have been removed.
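One way to read "dilation and erosion" here is a morphological closing of the background color image. The sketch below assumes a 3 × 3 neighborhood and ignores out-of-image neighbors; neither choice is specified in the text, and the function names are hypothetical.

```python
def _neighbourhood(img, x, y):
    """Values in the 3 x 3 window around (x, y), clipped to the image."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]

def dilate(img):
    return [[max(_neighbourhood(img, x, y)) for x in range(len(img[0]))]
            for y in range(len(img))]

def erode(img):
    return [[min(_neighbourhood(img, x, y)) for x in range(len(img[0]))]
            for y in range(len(img))]

def close_isolated_points(img):
    """Dilation followed by erosion: fills isolated 0 pixels inside areas of 1."""
    return erode(dilate(img))
```

Larger areas of 0, such as the imaging target itself, are restored by the erosion step, so only small holes are filled.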

(Edge detection unit 306)
The edge detection unit 306 detects (extracts) edge pixels from the captured image 500 for each of the R, G, and B channels. Specifically, for each channel it determines whether each pixel of the captured image 500 is an edge pixel (a pixel forming part of an edge).

Any of various well-known edge detection (edge extraction) methods can be used for this processing; the present embodiment uses the Canny method (Canny filter). The Canny method here refers to a technique that detects thinned edges using a Gaussian filter (FIG. 9) and a Sobel filter (FIG. 10). The processing by the edge detection unit 306 (the Canny method) is described below.

FIG. 11 is a flowchart showing the processing of the edge detection unit 306, which executes the processing of FIG. 11 for each of the R, G, and B channels.

First, the edge detection unit 306 smooths the captured image with a Gaussian filter (S101), which suppresses noise in the captured image.

Next, the edge detection unit 306 uses a Sobel filter to obtain the edge strength and edge direction of each pixel (S201). Specifically, it obtains the edge strength ▽L and the edge direction θ of the pixel of interest (X, Y) from equations (1) and (2):
▽L = √(Lx² + Ly²)   Equation (1)
θ = tan⁻¹(Ly / Lx)   Equation (2)
where Lx is the horizontal derivative and Ly is the vertical derivative given by the Sobel filter.
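Equations (1) and (2) can be sketched for a single interior pixel as follows. The kernels below are the standard 3 × 3 Sobel kernels, assumed here because FIG. 10 is not reproduced in this text. `atan2` is used in place of tan⁻¹(Ly / Lx) so that Lx = 0 does not cause a division by zero. The function name is hypothetical.

```python
import math

# Standard 3 x 3 Sobel kernels (assumption: FIG. 10 is not available here).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def sobel_at(img, x, y):
    """Return (edge strength, edge direction in degrees) at interior pixel (x, y)."""
    lx = ly = 0
    for j in range(3):
        for i in range(3):
            v = img[y + j - 1][x + i - 1]
            lx += SOBEL_X[j][i] * v   # horizontal derivative Lx
            ly += SOBEL_Y[j][i] * v   # vertical derivative Ly
    strength = math.hypot(lx, ly)              # equation (1)
    theta = math.degrees(math.atan2(ly, lx))   # equation (2), computed with atan2
    return strength, theta
```

A vertical step edge gives a direction of 0° and a horizontal step edge gives 90°, which is the convention the thinning step relies on.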

Subsequently, the edge detection unit 306 quantizes the edge direction θ obtained in S201 (S301). Specifically, θ is quantized to one of 0°, 45°, 90°, and 135°.
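A possible quantization rule is to fold the angle into [0°, 180°) and pick the nearest of the four values. The bin boundaries at odd multiples of 22.5° are an assumption, and the function name is hypothetical.

```python
def quantize_direction(theta_deg):
    """Map an angle in degrees to the nearest of 0, 45, 90, 135 (modulo 180)."""
    folded = theta_deg % 180.0              # opposite directions share one axis
    index = int((folded + 22.5) // 45) % 4  # 157.5 and above wraps back to 0
    return (0, 45, 90, 135)[index]
```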

Next, the edge detection unit 306 performs thinning (S401) to narrow the edges, referring to the edge strength ▽L obtained in S201 and the θ quantized in S301. The thinning is executed as follows.
For a pixel of interest (X, Y) with θ = 0°, if ▽L(X, Y) > ▽L(X−1, Y) and ▽L(X, Y) > ▽L(X+1, Y), the value of ▽L(X, Y) is kept unchanged; otherwise it is set to 0.
For a pixel of interest (X, Y) with θ = 45°, if ▽L(X, Y) > ▽L(X−1, Y+1) and ▽L(X, Y) > ▽L(X+1, Y−1), the value of ▽L(X, Y) is kept unchanged; otherwise it is set to 0.
For a pixel of interest (X, Y) with θ = 90°, if ▽L(X, Y) > ▽L(X, Y−1) and ▽L(X, Y) > ▽L(X, Y+1), the value of ▽L(X, Y) is kept unchanged; otherwise it is set to 0.
For a pixel of interest (X, Y) with θ = 135°, if ▽L(X, Y) > ▽L(X−1, Y−1) and ▽L(X, Y) > ▽L(X+1, Y+1), the value of ▽L(X, Y) is kept unchanged; otherwise it is set to 0.
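The four thinning rules share one pattern: compare the pixel with its two neighbors along the quantized direction and keep it only if it is strictly larger than both. A minimal sketch follows, assuming that neighbors outside the image count as 0 (the text does not cover image borders) and that comparisons use the unthinned strengths. The names are hypothetical.

```python
# Neighbour offsets (dx, dy) for each quantized direction, as listed in the text.
NEIGHBOURS = {
    0: ((-1, 0), (1, 0)),
    45: ((-1, 1), (1, -1)),
    90: ((0, -1), (0, 1)),
    135: ((-1, -1), (1, 1)),
}

def thin_edges(strength, direction):
    """Keep a pixel's strength only if it exceeds both neighbours along its direction."""
    h, w = len(strength), len(strength[0])

    def at(x, y):
        # Out-of-image neighbours count as 0 (assumption).
        return strength[y][x] if 0 <= x < w and 0 <= y < h else 0

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            (dx1, dy1), (dx2, dy2) = NEIGHBOURS[direction[y][x]]
            v = strength[y][x]
            if v > at(x + dx1, y + dy1) and v > at(x + dx2, y + dy2):
                out[y][x] = v
    return out
```

Writing the result into a separate array keeps every comparison based on the original strengths, so the outcome does not depend on scan order.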

The thinning of S401 is thus carried out using ▽L and θ. Next, the edge detection unit 306 applies hysteresis thresholding to the edge strength of each pixel to determine whether the pixel is an edge (S501).

Specifically, if ▽L(X, Y) > high-level threshold, the pixel of interest (X, Y) is determined to be an edge pixel, and if ▽L(X, Y) < low-level threshold, it is determined to be a non-edge pixel. The high-level threshold is greater than the low-level threshold; for 8-bit image data, for example, the high-level threshold is set to 144 and the low-level threshold to 64. These values are not limiting: any values may be used as long as the high-level threshold exceeds the low-level threshold.

If low-level threshold ≤ ▽L(X, Y) ≤ high-level threshold, the pixel of interest (X, Y) is determined to be an edge pixel when it is connected (adjacent) to an edge pixel, and a non-edge pixel otherwise.
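The hysteresis step can be sketched as a breadth-first propagation from the strong pixels. Two points are assumptions rather than statements from the text: adjacency is taken as 8-connectivity, and a mid-range pixel that has itself become an edge pixel can in turn promote its mid-range neighbors. The function name is hypothetical; the default thresholds are the example values 64 and 144.

```python
from collections import deque

def hysteresis(strength, low=64, high=144):
    """Binary edge image from thinned edge strengths using two thresholds."""
    h, w = len(strength), len(strength[0])
    edge = [[0] * w for _ in range(h)]
    queue = deque()
    # Pixels above the high-level threshold are edge pixels outright.
    for y in range(h):
        for x in range(w):
            if strength[y][x] > high:
                edge[y][x] = 1
                queue.append((x, y))
    # Mid-range pixels become edge pixels when adjacent to an edge pixel.
    while queue:
        x, y = queue.popleft()
        for ny in range(max(0, y - 1), min(h, y + 2)):
            for nx in range(max(0, x - 1), min(w, x + 2)):
                if not edge[ny][nx] and low <= strength[ny][nx] <= high:
                    edge[ny][nx] = 1
                    queue.append((nx, ny))
    return edge
```

Pixels below the low-level threshold are never visited, so they stay 0, and mid-range pixels with no path to a strong pixel also stay 0.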

Subsequently, the edge detection unit 306 generates an edge image indicating the result of the edge-pixel determination (S601). The edge image is a binary image in which edge pixels are "1" and non-edge pixels are "0". The edge detection unit 306 may also apply a morphological transformation such as dilation and erosion to the edge image.

The edge detection unit 306 executes the processing of FIG. 11 described above for each of the R, G, and B channels, so that an edge image is generated for each channel. FIG. 12 shows the edge images generated for the R, G, and B channels.

The Canny method is also applied to the edge extraction by the four-corner detection unit 301 (S1 in FIG. 3), so the same processing as in FIG. 11 is performed there. However, whereas the edge detection unit 306 executes the Canny method of FIG. 11 for each of the R, G, and B channels, the four-corner detection unit 301 executes it on luminance data.

(Specifying unit 307)
The specifying unit 307 refers to the background color images generated by the background color determination unit 305 and the edge images generated by the edge detection unit 306, and identifies the pixels that, in all of the R, G, and B channels, have been determined not to be background color pixels and have been determined to be edge pixels.

In other words, the specifying unit 307 identifies the pixels that, in all of the R, G, and B channels, are not background color pixels in the background color images of FIG. 8 and are edge pixels in the edge images of FIG. 12. In the background color images of FIG. 8, pixels flagged 0 (black) are the pixels that are not background color pixels; in the edge images of FIG. 12, pixels flagged 1 (white) are the edge pixels.

Subsequently, the specifying unit 307 generates a non-background-color/edge image that identifies these pixels. FIG. 13 shows the non-background-color/edge image 900, a binary image in which the pixels determined in all of the R, G, and B channels not to be background color pixels and to be edge pixels are flagged 1 (white) and all other pixels are flagged 0 (black).
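The combination performed by the specifying unit 307 amounts to a per-pixel logical AND over the three channels. A minimal sketch, assuming each image is a list of rows of 0/1 flags; the function name is hypothetical.

```python
def non_background_edge_image(bg_images, edge_images):
    """1 where every channel is non-background (0) and an edge (1); otherwise 0.

    bg_images and edge_images each hold three binary images (R, G, B) of equal size.
    """
    h, w = len(bg_images[0]), len(bg_images[0][0])
    return [[1 if all(bg[y][x] == 0 and ed[y][x] == 1
                      for bg, ed in zip(bg_images, edge_images)) else 0
             for x in range(w)]
            for y in range(h)]
```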

(Extraction unit 308)
The extraction unit 308 extracts the imaging target region 510 from the captured image 500 (FIG. 2) input from the imaging unit 200. Its processing is described below.

The captured image 500 (FIG. 2) of the present embodiment includes a background region 550 showing a desk or the like and an imaging target region 510 showing a document or the like. In the processing by the edge detection unit 306, the outline of the imaging target region 510 is therefore detected as an edge; however, if the desk has scratches or the like, these may also be detected as edges within the background region 550.

Although such edges in the background region 550 (scratches on the desk and the like) may be detected as edges, they still exhibit the background color and are therefore likely to be determined as background color pixels. Consequently, the edges of the background region 550 are excluded from the specific pixels identified in the non-background-color/edge image of FIG. 13 (the pixels determined not to be background color pixels and determined to be edge pixels).

It follows that, in the non-background-color/edge image 900 of FIG. 13, the following pixels together can be regarded as the outline of the imaging target region 510 of the captured image 500: the specific pixel nearest the top side when searching from the top side in the direction orthogonal to it; the specific pixel nearest the bottom side when searching from the bottom side in the direction orthogonal to it; the specific pixel nearest the left side when searching from the left side in the direction orthogonal to it; and the specific pixel nearest the right side when searching from the right side in the direction orthogonal to it.

The extraction unit 308 therefore refers to the non-background-color/edge image 900 of FIG. 13 and extracts the contour pixels forming the outline of the imaging target region 510. Specifically, the extraction unit 308 executes (a) a first search process that, for each pixel on the top side of the captured image 500, searches the line of pixels running from that pixel in the direction orthogonal to the top side for the specific pixel nearest that pixel; (b) a second search process that does the same for each pixel on the bottom side, searching in the direction orthogonal to the bottom side; (c) a third search process that does the same for each pixel on the left side, searching in the direction orthogonal to the left side; and (d) a fourth search process that does the same for each pixel on the right side, searching in the direction orthogonal to the right side. FIG. 14 is a schematic diagram showing, for each of the first to fourth search processes, one of the pixel lines searched. The pixels found by the first to fourth search processes are taken as the contour pixels.
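The four search processes can be sketched as taking, for every column, the first and last specific pixel and, for every row, the first and last specific pixel. The function name and the set-of-coordinates return value are illustrative choices, not taken from the text.

```python
def contour_pixels(specific):
    """Return the set of (x, y) contour pixels found by the four searches.

    specific is the non-background-color/edge image as a list of rows of 0/1 flags.
    """
    h, w = len(specific), len(specific[0])
    found = set()
    for x in range(w):
        column = [y for y in range(h) if specific[y][x]]
        if column:
            found.add((x, column[0]))    # first search: nearest to the top side
            found.add((x, column[-1]))   # second search: nearest to the bottom side
    for y in range(h):
        row = [x for x in range(w) if specific[y][x]]
        if row:
            found.add((row[0], y))       # third search: nearest to the left side
            found.add((row[-1], y))      # fourth search: nearest to the right side
    return found
```

Specific pixels that lie inside the target, such as text edges on a page, are never the first hit from any side and so are left out of the contour.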

Further, the extraction unit 308 generates, as a mask image, a binary image in which the contour pixels and the pixels enclosed by them are flagged 0 (black) and all other pixels are flagged 1 (white). FIG. 15 shows the mask image 700 generated by the extraction unit 308.

Subsequently, the extraction unit 308 masks the captured image 500 with the generated mask image 700 to extract the imaging target region 510 from the captured image 500. Specifically, it generates an extracted image 800 in which the contour pixels and the pixels enclosed by them (the pixels flagged 0 (black) in the mask image 700 of FIG. 15) are given the gradation values of the captured image 500 input from the imaging unit 200, and the pixels outside the contour pixels (the pixels flagged 1 (white)) are given a predetermined gradation value (for example, "0" for 8-bit image data). FIG. 16 shows the extracted image 800. It shows the imaging target region 510 extracted from the captured image 500 of FIG. 2, surrounded by a solid image area uniformly filled with black. In other words, the extracted image 800 of FIG. 16 corresponds to the captured image 500 of FIG. 2 with everything other than the imaging target region 510 masked in solid black.
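The text does not say how the pixels "enclosed by" the contour pixels are found. The sketch below assumes one possible approach: flood-fill the outside from the image border with 4-connectivity, stopping at contour pixels, and treat everything not reached as inside. The function names and this fill strategy are assumptions, and a contour with gaps would let the fill leak inward.

```python
from collections import deque

def build_mask(contour, width, height):
    """0 for contour pixels and the pixels they enclose, 1 for everything outside."""
    mask = [[0] * width for _ in range(height)]
    # Seed the fill with every border pixel that is not a contour pixel.
    queue = deque((x, y) for y in range(height) for x in range(width)
                  if (x in (0, width - 1) or y in (0, height - 1))
                  and (x, y) not in contour)
    for x, y in queue:
        mask[y][x] = 1
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and mask[ny][nx] == 0 and (nx, ny) not in contour):
                mask[ny][nx] = 1
                queue.append((nx, ny))
    return mask

def apply_mask(image, mask, fill=0):
    """Keep image values where the mask is 0; write fill where the mask is 1."""
    return [[v if m == 0 else fill for v, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]
```

Using 4-connectivity for the outside fill means a contour that is only diagonally connected still blocks the fill.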

(Distortion correction unit 309)
The distortion correction unit 309 is a block that, for a rectangular imaging target region such as a document or a slightly curved, roughly rectangular imaging target region such as a double-page spread of a book, corrects the distortion that arises when the imaging target is photographed from a direction inclined to the normal of its front surface, and also corrects the tilt of the imaging target region. Various well-known techniques can be used for this correction; one example is the technique disclosed in Patent Document 3 (JP 2013-4088 A).

The distortion correction unit 309 performs distortion correction on the imaging target region shown in the extracted image 800 (FIG. 16) generated by the extraction unit 308.

(Format processing unit 310)
The format processing unit 310 converts the extracted image 800, after distortion correction by the distortion correction unit 309, into an image file of a predetermined format such as JPEG or TIFF, and stores the generated image file in the storage unit 400.

In the information processing unit 300 of the present embodiment, as described above, the processing of the four-corner detection unit 301 through the extraction unit 308 generates the extracted image 800 showing the imaging target region 510 extracted from the captured image 500; the distortion correction unit 309 applies distortion correction to the extracted image 800; and the corrected extracted image 800 is converted into an image file of a predetermined format and stored.

In the present embodiment, the imaging target region 510 is extracted by combining the edge detection result of the edge detection unit 306 with the background color determination result of the background color determination unit 305. This synergistically improves the extraction accuracy of the imaging target region 510, as explained in detail below.

When a book or the like is photographed, a shadow region 520 may appear in the background region 550 of the captured image 500, as shown in FIG. 2 (where the shadow region 520 lies near the top edge of the book). The shadow region 520 shows a shadow caused by the illumination or an unintended shadow of the user cast onto the background.

The shadow region 520 is basically gray. Therefore, when the background color determination unit 305 determines for each pixel of the captured image 500 of FIG. 2 whether it is a background color pixel, the shadow region 520 within the background region 550 may fail to be determined as background color pixels.

In the present embodiment, however, the edge detection result of the edge detection unit 306 is used as well, which suppresses extraction of the shadow region 520 as part of the imaging target region 510. This is because the shadow region 520 itself is not detected as an edge, and the boundary between the shadow region 520 and the portion determined to be background color pixels has a lower edge strength than the boundary between the imaging target region 510 and the shadow region 520, so it is rarely determined to be an edge. In other words, the weakness of the approach based on the background color determination is compensated for by the approach based on edge detection.

Conversely, as described above, the edge detection unit 306 detects the outline of the imaging target region 510 as an edge, but scratches or the like on the desk may also be detected as edges in the background region 550.

However, although scratches and the like on the desk in the background region 550 may be detected as edges, they have a color similar to the background color and are therefore likely to be determined as background color pixels. As a result, the extraction unit 308 is kept from extracting edges of the background region 550 (scratches on the desk and the like) as part of the imaging target region 510. That is, the weakness of the approach based on edge detection is compensated for by the approach based on the background color determination.

Thus, in the present embodiment, combining the edge detection result of the edge detection unit 306 with the background color determination result of the background color determination unit 305 lets each compensate for the other's weakness, which synergistically improves the extraction accuracy of the imaging target region 510.

FIG. 16 shows the extracted image 800 that is finally generated. In it, the shadow region 520 of FIG. 2 has been filled in as part of the solid background area and has not been extracted together with the imaging target region 510.

In the embodiment above, as shown in FIG. 5, the background region 550 is divided into sub-regions A to H, a reference region of a predetermined size containing the center of the sub-region is set for each sub-region, and the gradation values of the reference region are processed statistically for each sub-region. The statistical processing can thus be performed on areas that are less affected by the edges and shadows of the imaging target (a book or the like), which improves the determination accuracy of the background color determination unit 305 and, in turn, the extraction accuracy of the imaging target region 510.

Also, in the present embodiment, as shown in FIG. 7, the condition setting unit 304 excludes the sub-regions whose gradation average values are the maximum and the minimum (A and E) before obtaining the numerical range of colors that are the same as or similar to the representative color of the background. The reason is as follows. In part of the background region 550, light reflection (such as glare from the illumination) may happen to make the gradation values far higher than in other areas, or a shadow (such as the photographer's) may happen to make them far lower or to enlarge the standard deviation. Excluding the sub-regions with the maximum and minimum gradation average values therefore improves the accuracy with which the numerical range of gradation values that are the same as or similar to the representative color of the background region 550 is calculated.

The information processing unit 300 only needs to be able to generate the extracted image 800 (FIG. 16) in which the imaging target region 510 has been extracted from the captured image 500. The distortion correction unit 309 and the format processing unit 310, which are not directly related to generating the extracted image 800, may be omitted. The smoothing processing unit 302 may also be omitted.

The edge detection unit 306 detects edges by the Canny method, but a method other than the Canny method may be used. For example, for a pixel of interest, the difference from a horizontally or vertically adjacent pixel may be obtained, and threshold processing may be applied to the difference to determine whether the pixel is an edge pixel. When the image data is 8-bit, the threshold is set to 48, for example.
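A minimal sketch of this neighbor-difference test is shown below. It assumes a single-channel 8-bit image stored as a list of rows, compares each pixel with its right and lower neighbors, and treats a difference equal to the threshold as an edge. The text does not fix any of these details, and the function name is hypothetical.

```python
def edge_pixels_by_difference(img, threshold=48):
    """Return a boolean mask marking pixels whose difference from the
    right or lower neighbour reaches the threshold.

    img is a list of rows of 8-bit grey values (an assumed layout).
    """
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Horizontal neighbour (right) and vertical neighbour (below).
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w:
                    if abs(img[y][x] - img[ny][nx]) >= threshold:
                        mask[y][x] = True
    return mask
```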

Alternatively, the edge strength may be obtained with the Sobel filter shown in FIG. 2, and threshold processing may be applied to the edge strength to determine whether the pixel is an edge pixel. When the image data is 8-bit, the threshold is set to 96, for example.
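The Sobel variant can be sketched in the same way. The coefficients of FIG. 2 are not reproduced in this passage, so the sketch assumes the common 3 × 3 Sobel kernels, takes the Euclidean magnitude of the two responses as the edge strength, and leaves the one-pixel image border unmarked. All three choices, and the names used, are assumptions.

```python
# Common 3x3 Sobel kernels (assumed; FIG. 2 is not reproduced here).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_edge_mask(img, threshold=96):
    """Return a boolean mask of pixels whose Sobel edge strength reaches
    the threshold. img is a list of rows of 8-bit grey values."""
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            # Edge strength as gradient magnitude (an assumed definition).
            strength = (gx * gx + gy * gy) ** 0.5
            mask[y][x] = strength >= threshold
    return mask
```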

[Embodiment 2]
In Embodiment 1, as shown in FIG. 5, the background region 550 is divided into eight sub-regions A to H and statistical processing is performed on the sub-regions A to H, but it is not necessary to perform statistical processing on all eight sub-regions. For example, as shown in FIG. 17, the statistical processing unit 303 may detect, among the eight sub-regions, the four sub-regions イ to ニ that adjoin the top, bottom, left, and right of the rectangular region formed by connecting the four candidate vertices with straight lines, and may set, in each of the sub-regions イ to ニ, a reference range of 64 × 64 pixels centered on the center pixel of that sub-region. The statistical processing unit 303 then obtains, for each of the sub-regions イ to ニ, the statistical values of the gradation values in the reference range for each channel (each of the R, G, and B channels). The example of FIG. 17 has the advantage of requiring less computation than the example of FIG. 5, whereas the example of FIG. 5 has the advantage of higher accuracy because it uses more samples than the example of FIG. 17.

Alternatively, as shown in FIG. 18, among the blocks adjoining the upper, lower, left, and right sides of the rectangular region formed by connecting the four candidate vertices with straight lines, the block adjoining the upper side may be divided at equal intervals into seven sub-regions (labeled あ to ま in FIG. 18), and the statistical values of the gradation values of all pixels in the sub-region may be obtained for each sub-region.
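The 64 × 64 reference range of the FIG. 17 example can be sketched as below. The sketch assumes that the image is a list of rows of (R, G, B) tuples, that a sub-region is given as (left, top, right, bottom) with exclusive right and bottom bounds, and that the statistics are the mean and the population standard deviation. Clipping the window to the sub-region is also an assumption, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def reference_stats(img, region, size=64):
    """Per-channel (mean, standard deviation) over a size x size window
    centred on the sub-region's centre pixel.

    img: list of rows of (R, G, B) tuples (assumed layout).
    region: (left, top, right, bottom), right/bottom exclusive.
    """
    left, top, right, bottom = region
    cx, cy = (left + right) // 2, (top + bottom) // 2
    half = size // 2
    # Clip the reference window so that it never leaves the sub-region.
    x0, x1 = max(left, cx - half), min(right, cx + half)
    y0, y1 = max(top, cy - half), min(bottom, cy + half)
    stats = {}
    for channel, name in enumerate("RGB"):
        values = [img[y][x][channel]
                  for y in range(y0, y1) for x in range(x0, x1)]
        stats[name] = (mean(values), pstdev(values))
    return stats
```

Restricting the statistics to a window around the center keeps the samples away from the sub-region borders, where the edge or shadow of the shooting target is most likely to appear.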

[Embodiment 3]
In Embodiment 1, a tablet was given as an example of the imaging device 10, but the imaging device 10 may be any of various portable terminals having the imaging unit 200 (a smartphone, a tablet, a notebook computer, a mobile phone, or the like). The imaging device 10 may also be a digital still camera, a general-purpose personal computer, or the like.

The information processing unit 300 of the present embodiment does not have to be provided in the imaging device 10 having the imaging unit 200, and may be provided in a device that has no imaging unit 200. In short, the information processing unit 300 can be provided in any device that can receive a captured image from a camera-equipped portable terminal (for example, a general-purpose personal computer or a server device without a camera). A multifunction peripheral or a large display device (an information display, an electronic blackboard, or the like) can also receive a captured image, so the information processing unit 300 may be provided in a multifunction peripheral or a large display device. The captured image may be communicated between the portable terminal and the personal computer, server device, multifunction peripheral, large display device, or the like by wire or wirelessly, and the communication may pass through a network such as the Internet.

When a captured image is transmitted from the portable terminal to a device including the information processing unit 300, the portable terminal may let the user select between a first mode, in which the extraction of the shooting target region and the geometric correction are performed, and a second mode, in which neither the extraction of the shooting target region nor the geometric correction is performed. Mode information indicating which mode has been selected is transmitted together with the captured image from the portable terminal to the device including the information processing unit 300. The information processing unit 300 executes the first mode when the mode information indicates the first mode, and executes the second mode when the mode information indicates the second mode.

[Embodiment 4]
As described above, the information processing unit 300 may be realized in software using a CPU, or may be realized by a logic circuit formed in an integrated circuit or the like. When it is realized in software, the information processing unit 300 includes a recording medium, such as a ROM or a storage device, on which the program constituting the software is recorded so as to be readable by a computer (or CPU). The recording medium may be a "non-transitory tangible medium" such as a card, a disk, a semiconductor memory, or a programmable logic circuit. The program may be supplied to the computer via any transmission medium (a communication network, a broadcast wave, or the like). One aspect of the present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.

[Summary]
The information processing unit 300 (image processing apparatus) according to aspect 1 of the present invention includes: a four-corner detection unit 301 that detects, from the captured image 500, candidate vertices that are candidates for the vertices of the shooting target region 510; a condition setting unit 304 that takes as the background region 550 the part of the captured image 500 other than the region formed by connecting the plurality of candidate vertices with lines, and sets a numerical range (condition) of gradation values of colors that are the same as or similar to the representative color of the background region 550; a background color determination unit 305 (first determination unit) that determines, for each pixel of the captured image 500, whether the pixel is a background color pixel that satisfies the numerical range; an edge detection unit 306 (second determination unit) that determines, for each pixel, whether the pixel is an edge pixel indicating an edge; a specifying unit 307 that specifies the pixels of the captured image 500 that are determined not to be background color pixels and determined to be edge pixels; and an extraction unit 308 that extracts the shooting target region 510 from the captured image 500 with reference to the pixels specified by the specifying unit 307.

According to aspect 1 of the present invention, the shooting target region 510 is extracted from the captured image by using the edge detection result of the edge detection unit 306 together with the determination result of the background color determination unit 305, which has the effect of synergistically increasing the extraction accuracy of the shooting target region 510.

That is, (a) a configuration that extracts the shooting target region 510 using the determination result of the background color determination unit 305 without the edge detection result of the edge detection unit 306 has the drawback that the shadow region 520 contained in the background region 550 is extracted as the shooting target region 510, and (b) a configuration that extracts the shooting target region 510 using the edge detection result of the edge detection unit 306 without the determination result of the background color determination unit 305 has the drawback that, when the background region 550 contains a scratch image (for example, a scratch on a desk), the scratch image is extracted as the shooting target region 510. In the present invention, however, using the edge detection result of the edge detection unit 306 suppresses the drawback of using only the determination result of the background color determination unit 305 (the shadow region 520 being extracted as the shooting target region 510), and using the determination result of the background color determination unit 305 suppresses the drawback of using only the edge detection result of the edge detection unit 306 (the scratch image being extracted as the shooting target region 510). In other words, by using edge detection and background color determination together, the present invention lets each compensate for the other's drawback and synergistically obtains the effect of increasing the extraction accuracy of the shooting target.
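The combination performed by the specifying unit 307 amounts to a per-pixel logical AND of "edge pixel" and "not a background color pixel". A minimal sketch follows. It assumes that a pixel counts as a background color pixel only when all of its channels fall inside the respective numerical ranges, and that the edge mask comes from any edge detector; the function names and data layout are hypothetical.

```python
def is_background(pixel, ranges):
    """True when every channel of the pixel lies inside its (low, high)
    range. Requiring all channels to match is an assumption."""
    return all(low <= value <= high
               for value, (low, high) in zip(pixel, ranges))


def specify_pixels(img, edge_mask, ranges):
    """Mark pixels that are edge pixels AND not background color pixels.

    img: list of rows of (R, G, B) tuples.
    edge_mask: boolean mask of the same size.
    ranges: one (low, high) pair per channel.
    """
    return [[edge_mask[y][x] and not is_background(img[y][x], ranges)
             for x in range(len(img[0]))]
            for y in range(len(img))]
```

In this sketch a scratch on the desk is rejected because it is an edge in the background color, and the interior of a shadow is rejected because it is not an edge.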

Furthermore, the information processing unit 300 according to aspect 1 of the present invention can extract the shooting target portion with high accuracy without requiring, as Patent Document 1 does, an image captured in advance of the background (a pedestal such as a desk) with no shooting target present. Because there is no need to shoot twice, the storage capacity and the effort required of the user can be reduced compared with Patent Document 1.

The information processing unit 300 according to aspect 2 of the present invention includes, in addition to the configuration of aspect 1, a statistical processing unit 303 that takes a range contained in the background region 550 as a statistical target and performs statistical processing of the gradation values of the statistical target, and the condition setting unit 304 sets the numerical range based on the result of the statistical processing.

This makes it possible to set, with high accuracy, the numerical range of gradation values indicating colors that are the same as or similar to the representative color of the background region 550 in the captured image 500, and therefore to determine accurately whether a pixel is a background color pixel.

The information processing unit 300 according to aspect 3 of the present invention, in addition to the configuration of aspect 2, divides the range of the statistical target into a plurality of sub-regions, sets for each sub-region a reference region of a predetermined size that includes the center position of the sub-region, and performs statistical processing of the gradation values of the reference region for each sub-region.

According to aspect 3 of the present invention, the statistical processing can be performed on regions that are not easily affected by the edges or shadow of the shooting target (a book or the like), so the determination accuracy of the background color determination unit 305 can be improved, which in turn improves the extraction accuracy of the shooting target region 510.

In the information processing unit 300 according to aspect 4 of the present invention, in addition to the configuration of aspect 3, the condition setting unit 304 sets the condition with reference to the statistics of the sub-regions that remain after excluding, among the statistics obtained by the statistical processing, the sub-region whose mean or median is the maximum and the sub-region whose mean or median is the minimum.

According to aspect 4 of the present invention, even when the background region 550 contains a local portion in which the values vary widely because of reflected illumination or the photographer's shadow, the influence of that local portion can be suppressed.

The information processing unit 300 according to aspect 5 of the present invention includes, in addition to any one of aspects 2 to 4, a smoothing processing unit 302 that performs smoothing processing on the captured image 500, and the statistical processing unit 303 performs the statistical processing using the captured image 500 that has undergone the smoothing processing.

According to aspect 5 of the present invention, noise in the captured image 500 can be kept from affecting the statistical processing, so the accuracy of the statistical processing can be improved.

In the information processing unit 300 according to aspect 6 of the present invention, in addition to any one of aspects 1 to 5, where a pixel specified by the specifying unit 307 is called a specific pixel, the extraction unit 308 executes: (a) a first search process that, for each pixel constituting the upper side of the captured image 500, searches the pixel row extending from that pixel in the direction orthogonal to the upper side for the specific pixel closest to that pixel; (b) a second search process that, for each pixel constituting the lower side of the captured image 500, searches the pixel row extending from that pixel in the direction orthogonal to the lower side for the specific pixel closest to that pixel; (c) a third search process that, for each pixel constituting the left side of the captured image 500, searches the pixel row extending from that pixel in the direction orthogonal to the left side for the specific pixel closest to that pixel; (d) a fourth search process that, for each pixel constituting the right side of the captured image 500, searches the pixel row extending from that pixel in the direction orthogonal to the right side for the specific pixel closest to that pixel; and (e) an extraction process that takes the specific pixels found by the first to fourth search processes as a contour and extracts the contour and the region enclosed by the contour as the shooting target region 510.
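The four search processes and the extraction process can be sketched as below. The searches follow the description directly: for every column the nearest specific pixel is found from the top and from the bottom, and for every row from the left and from the right. How the enclosed region is filled is not detailed in this passage, so the sketch assumes a simple rule: a pixel belongs to the region when it lies between the top and bottom hits of its column and between the left and right hits of its row. That fill rule and the function name are assumptions.

```python
def extract_region(specific):
    """Return a boolean mask of the extracted region.

    specific: boolean mask of specific pixels (list of rows).
    """
    h, w = len(specific), len(specific[0])
    # First / second search: nearest specific pixel seen from the upper
    # and lower sides, one result per column (None when nothing is found).
    top = [next((y for y in range(h) if specific[y][x]), None)
           for x in range(w)]
    bottom = [next((y for y in range(h - 1, -1, -1) if specific[y][x]), None)
              for x in range(w)]
    # Third / fourth search: the same from the left and right sides, per row.
    left = [next((x for x in range(w) if specific[y][x]), None)
            for y in range(h)]
    right = [next((x for x in range(w - 1, -1, -1) if specific[y][x]), None)
             for y in range(h)]
    region = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if top[x] is None or left[y] is None:
                continue  # no contour pixel in this column or row
            # Assumed fill rule: inside the contour hits in both directions.
            region[y][x] = (top[x] <= y <= bottom[x]
                            and left[y] <= x <= right[y])
    return region
```

For a roughly convex target such as a page or a book this yields the contour plus its interior; a more general fill would be needed for strongly concave contours.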

According to aspect 6 of the present invention, the contour of the shooting target region 510 can be identified from the specific pixels, which are the pixels determined not to be background color pixels and determined to be edge pixels, and the shooting target region 510 can be identified from this contour.

The imaging device 10 according to aspect 7 of the present invention includes the imaging unit 200 and the information processing unit 300 according to any one of aspects 1 to 6.

The information processing unit 300 according to aspects 1 to 6 of the present invention may be realized by a computer. In that case, a program that realizes the information processing unit 300 on the computer by causing the computer to operate as each unit of the information processing unit 300, and a computer-readable recording medium on which the program is recorded, also fall within the scope of the present invention.

One aspect of the present invention is not limited to the embodiments described above. Various modifications are possible within the scope of the claims, and embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of one aspect of the present invention.

The present invention is applicable to devices on which image processing software for processing captured images can be installed, and to devices in which an image processing circuit for processing captured images can be mounted. Examples of such devices include general-purpose computers, server devices, multifunction peripherals, digital still cameras, and portable terminals (smartphones, tablets, notebook computers, mobile phones).

DESCRIPTION OF SYMBOLS
10 Imaging device
200 Imaging unit
300 Information processing unit (image processing apparatus)
301 Four-corner detection unit
302 Smoothing processing unit
303 Statistical processing unit
304 Condition setting unit (setting unit)
305 Background color determination unit (first determination unit)
306 Edge detection unit (second determination unit)
307 Specifying unit
308 Extraction unit
309 Distortion correction unit
310 Format processing unit
400 Storage unit
500 Captured image
510 Shooting target region
520 Shadow region
550 Background region
700 Mask image
800 Extracted image

Claims

An image processing apparatus comprising:
a detection unit that detects, from a captured image, candidate vertices that are candidates for vertices of a shooting target region;
a setting unit that takes as a background region the part of the captured image other than the region formed by connecting the plurality of candidate vertices with lines, and sets a condition on gradation values of colors that are the same as or similar to a representative color of the background region;
a first determination unit that determines, for each pixel of the captured image, whether the pixel is a background color pixel that satisfies the condition;
a second determination unit that determines, for each pixel, whether the pixel is an edge pixel indicating an edge;
a specifying unit that specifies the pixels of the captured image that are determined not to be background color pixels and determined to be edge pixels; and
an extraction unit that extracts the shooting target region from the captured image with reference to the pixels specified by the specifying unit.

The image processing apparatus according to claim 1, further comprising a statistical processing unit that performs statistical processing of gradation values using a range included in the background region as a statistical target,
wherein the setting unit sets the condition based on a result of the statistical processing.

The image processing apparatus according to claim 2, wherein the statistical processing unit divides the range of the statistical target into a plurality of sub-regions, sets for each sub-region a reference range of a predetermined size including the center position of the sub-region, and performs the statistical processing of the gradation values of the reference range for each sub-region.

The image processing apparatus according to claim 3, wherein the setting unit sets the condition with reference to the statistics of the sub-regions that remain after excluding, among the statistics obtained by the statistical processing, the sub-region whose mean or median is the maximum and the sub-region whose mean or median is the minimum.

The image processing apparatus according to claim 2, further comprising a smoothing processing unit that performs smoothing processing on the captured image,
wherein the statistical processing unit performs the statistical processing using the captured image that has undergone the smoothing processing.

The image processing apparatus according to claim 5, wherein, where a pixel specified by the specifying unit is called a specific pixel, the extraction unit executes: (a) a first search process that, for each pixel constituting the upper side of the captured image, searches the pixel row extending from that pixel in the direction orthogonal to the upper side for the specific pixel closest to that pixel; (b) a second search process that, for each pixel constituting the lower side of the captured image, searches the pixel row extending from that pixel in the direction orthogonal to the lower side for the specific pixel closest to that pixel; (c) a third search process that, for each pixel constituting the left side of the captured image, searches the pixel row extending from that pixel in the direction orthogonal to the left side for the specific pixel closest to that pixel; (d) a fourth search process that, for each pixel constituting the right side of the captured image, searches the pixel row extending from that pixel in the direction orthogonal to the right side for the specific pixel closest to that pixel; and (e) an extraction process that takes the specific pixels found by the first to fourth search processes as a contour and extracts the contour and the region enclosed by the contour as the shooting target region.

An imaging apparatus comprising: an imaging unit; and the image processing apparatus according to claim 1.

A program that causes a computer to function as each unit of the image processing apparatus according to any one of claims 1 to 6.

A computer-readable recording medium on which the program according to claim 8 is recorded.