JP2013182330A - Image processor and image processing method

Info

Publication number: JP2013182330A
Application number: JP2012044304A
Authority: JP
Inventors: Xiao Yan Dai; 暁艶戴
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2012-02-29
Filing date: 2012-02-29
Publication date: 2013-09-12
Anticipated expiration: 2032-02-29
Also published as: JP5914046B2

Abstract

PROBLEM TO BE SOLVED: To more suitably determine and extract a subject area from an input image. SOLUTION: The image processor includes: distance information input means for inputting distance information from a point of view with respect to each pixel of an image; pixel selection means for selecting pixels whose distance information is within a predetermined distance range from among pixels included in the image; attention degree calculation means for calculating the degree of attention as a value showing the degree of change of the image at each pixel position with a first area including all the pixels selected by the pixel selection means as an object; and determination means for determining a second area including all the pixels whose degrees of attention calculated by the attention degree calculation means are equal to or more than a predetermined threshold as an area including a subject area.

Description

The present invention relates to an image processing technique, and more particularly to a technique for more suitably extracting a subject area from an input image.

Techniques for extracting a predetermined area (the target area of a subject) from an image, also called segmentation techniques, have long been studied and are applied to purposes such as image compositing in video editing and refocusing on a specific area. For target area extraction, the background subtraction method and the chroma key method are well-known approaches based on the color information of an image. In the background subtraction method, a background-only image that does not contain the target area is captured in advance; an image containing the target area is then compared with the background-only image, and the target area is extracted by computing their difference.

The chroma key method is the standard technique in the film industry: the background area is set to a single uniform color, and the target area is extracted on the assumption that the colors of the target area do not include the background color. However, the background subtraction method and the chroma key method can be used only in environments where the background is easy to control. As methods that do not require a specific background, graph cut and grab cut have been proposed; these separate the target area from an image with an arbitrary background on the basis of graph theory (Non-Patent Document 1, Non-Patent Document 2).

In graph cut, the user first designates pixels in the target area and pixels in the background area by clicking them with a mouse, or designates curves over part of the target area and part of the background area by dragging the mouse. The designated pixels or curves are treated as ground-truth information, from which the parameters of an energy function defined on a graph are generated; the target area and the background area are then separated by solving the minimization problem of that energy function. To improve the extraction accuracy further, additional user designations can be applied to the extraction result and the above processing repeated.

Grab cut is a technique that makes graph cut simpler to use: the user only has to designate a rectangular area that contains the target area. Grab cut then performs color clustering separately on the inside and the outside of the target area contained in the rectangular area, calculates the graph parameters from the color information of each pixel and each cluster, and extracts the target area by globally minimizing the energy function of the graph.

Separately, techniques that extract an attention area in which an object exists, based on human visual characteristics, have been studied for object tracking and object recognition. These techniques basically compute, for each feature such as color, luminance, and texture, an index called the degree of attention, which indicates how strongly a pixel attracts human attention; the per-feature degrees of attention are combined by weighting to obtain the degree of attention of the pixel, and the attention area containing the object is extracted (Patent Document 1, Non-Patent Document 3). Furthermore, in recent years, mounting a range image sensor on a monocular camera has made it possible to estimate distance information (depth information) for each pixel of an image, and a technique that uses distance information for target area extraction, as useful information other than color information, has been proposed (Non-Patent Document 4).

Patent Document 1: JP 2010-257423 A

Non-Patent Document 1: Boykov, Y.Y. and Jolly, M.-P., Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images, In Proc. IEEE Int. Conf. on Computer Vision 2001, vol. 1, pp. 105-112
Non-Patent Document 2: Rother et al., Grabcut: Interactive Foreground Extraction Using Iterated Graph Cuts, ACM Trans. Graph., vol. 23, no. 3, 2004, pp. 309-314
Non-Patent Document 3: Itti, L., Koch, C. and Niebur, E., A model of saliency-based visual attention for rapid scene analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, issue 11, pp. 1254-1259, 1998
Non-Patent Document 4: Ouerhani, N. and Hugli, H., Computing visual attention from scene depth, In Proc. 15th International Conference on Pattern Recognition 2000, vol. 1, pp. 375-378

However, the graph cut and grab cut methods described above require the user to designate the target area and the background area, which is burdensome for the user. FIG. 1 illustrates examples of manual designation by the user in area extraction. Image 100a shows an example in which foreground pixels in the target area and background pixels in the background area are designated pixel by pixel. Here, pixels designated as foreground pixels are marked with white crosses and pixels designated as background pixels are marked with black crosses. Image 100b shows an example in which foreground pixels and background pixels are designated with freehand curves. Here, pixels designated as foreground pixels are drawn as white curves and pixels designated as background pixels as black curves. Because the foreground and background pixels designated by the user in this way are treated as ground truth, an erroneous designation by the user degrades the extraction accuracy of the target area.

A further problem is that erroneous determination occurs when the background of the input image contains a conspicuous area with large variation. FIG. 2 illustrates an erroneous determination by a technique for extracting the attention area containing a subject. In image 200a, the target area has uniform color features, while the background area contains a portion with a large variation in color difference. When target area extraction is performed on such an image, a rectangular area surrounding the portion with the large color-difference variation ends up being determined as the attention area.

The present invention has been made in view of the problems described above, and its object is to provide a technique that makes it possible to more suitably determine and extract a subject area from an input image.

To solve one or more of the problems described above, the image processing apparatus of the present invention has the following configuration. That is, the apparatus comprises: distance information input means for inputting distance information from a viewpoint for each pixel of an image; pixel selection means for selecting, from among the pixels included in the image, pixels whose distance information is within a predetermined distance range; attention degree calculation means for calculating, for a first area including all the pixels selected by the pixel selection means, a degree of attention, which is a value indicating the degree of change of the image at each pixel position; and determination means for determining a second area, which includes all the pixels whose degree of attention calculated by the attention degree calculation means is equal to or greater than a predetermined threshold, as an area including the subject area.

According to the present invention, it is possible to provide a technique that makes it possible to more suitably determine and extract a subject area from an input image.

FIG. 1 is a diagram showing an example of manual designation by the user in area extraction.
FIG. 2 is a diagram showing an example of erroneous determination of an attention area.
FIG. 3 is an overall block diagram of a photographing apparatus including the image processing apparatus according to the first embodiment.
FIG. 4 is a functional block diagram of the image processing unit according to the first embodiment.
FIG. 5 is an overall flowchart of the image processing according to the first embodiment.
FIG. 6 is a diagram showing an example of a color image and a distance image.
FIG. 7 is a flowchart showing the selection processing for candidate pixels of the subject area.
FIG. 8 is a flowchart showing the attention degree calculation processing.
FIG. 9 is a flowchart showing the subject area extraction processing.
FIG. 10 is a diagram showing examples of the result images of each process.
FIG. 11 is a functional block diagram of the image processing apparatus according to the second embodiment.
FIG. 12 is an overall flowchart of the image processing according to the second embodiment.
FIG. 13 is a flowchart showing the weight setting processing.
FIG. 14 is a flowchart showing the attention degree calculation processing.
FIG. 15 is a functional block diagram of the image processing apparatus according to the third embodiment.
FIG. 16 is an overall flowchart of the image processing according to the third embodiment.
FIG. 17 is a diagram showing an example of dividing an input image into a plurality of distance layers.

Preferred embodiments of the present invention are described in detail below with reference to the drawings. The following embodiments are merely examples and are not intended to limit the scope of the present invention.

(First embodiment)
A first embodiment of the image processing apparatus according to the present invention is described below, taking a photographing apparatus as an example. In particular, the description covers an example in which candidate pixels for the subject area are selected on the basis of distance information for each pixel in the image, and a degree of attention is calculated for those candidate pixels and used to determine the subject area.

<Device configuration>
FIG. 3 is a block diagram showing the main configuration of a photographing apparatus that embodies the image processing apparatus according to the first embodiment. A central processing unit (CPU) 1101 performs overall control of the units described below.

The photographing unit 1102 acquires color information and distance information of an image. Here it is assumed that the color information is input as an ordinary color image and the distance information is input as a distance image in which a distance value from the viewpoint is assigned to each pixel.

The display unit 1104 displays captured images and text; a liquid crystal display is used, for example. The display unit 1104 may have a touch screen function. The display control unit 1105 controls the display of captured images and text on the display unit 1104. The operation unit 1106 receives user instructions and includes buttons, a shooting mode dial, and the like. Settings made through these operations control predetermined processing via the CPU. For example, the manual designation of an area described later can be readily implemented with the display control unit 1105 and the operation unit 1106.

The photographing control unit 1107 controls the imaging system, for example focusing, opening and closing the shutter, and adjusting the aperture, in accordance with instructions from the CPU 1101.

The signal processing unit 1108 performs various kinds of processing, such as white balance processing, gamma processing, and noise reduction processing, on digital data received via the bus 1103. The image processing unit 1109 performs image processing on an image acquired by the photographing unit 1102 or on a digital image output from the signal processing unit 1108, or performs image processing in accordance with a user designation from the operation unit 1106. The compression/decompression unit 1110 converts digital data or image processing results into a file format such as JPEG or MPEG or into vectorized form, or performs encoding control.

The bus 1103 is a transfer path for various data. For example, digital data acquired by the photographing unit 1102 is sent to a predetermined processing unit via the bus 1103. The internal memory 1111 functions as the main memory, work area, and so on of the CPU 1101, and also stores control programs and the like executed by the CPU 1101. The external memory control unit 1112 is an interface for connecting to a PC or other media (for example, a hard disk, memory card, CF card, SD card, or USB memory).

FIG. 4 is a functional block diagram showing the internal configuration of the image processing unit of the photographing apparatus according to the first embodiment. The image processing unit 1109 processes image data acquired by the photographing unit 1102 or image data stored in the internal memory 1111 or an external memory, and extracts the subject area under the control of the CPU 1101. An image processed by the image processing unit 1109 is, for example, sent to the compression/decompression unit 1110 to be compression-encoded, or stored in the internal memory 1111 or an external memory.

The image processing unit 1109 comprises a pixel selection unit 110, an attention degree calculation unit 120, an attention area determination unit 130, and a target area extraction unit 140, and extracts the subject area contained in an image on the basis of the color information and distance information of the input image. Here, the color information of an image is, for example, the pixel values of the three primary colors (RGB) of each pixel, and the distance information is information corresponding to the distance (depth) of each pixel from the viewpoint.

FIG. 6 shows examples of the input color image and distance image. Image 600a is an example of an image to be processed. In this example, a toy is placed on a stand, and the background area contains a portion with distinctive color features. Image 600b is an example of a distance image. Distance is represented by brightness; here, higher brightness indicates a shorter distance to the photographing device. To make the characteristics of the distance information easy to follow, image 600b represents the subject area, the boundary between the subject area and the background, and the background each with a uniform distance value, and represents the stand, which is part of the background, with a gradation of distance values. The areas shown in black illustrate areas where distance could not be estimated and distance information is missing.

The distance information may be measured by a sensor or the like, or may be estimated from images captured from a plurality of viewpoints. It may also be generated by the user specifying a distance for each area of the image data, or by performing the estimation described in the background art. A distance image obtained by estimation generally may also contain noise (that is, pixels whose distance values differ from those of their surroundings).

The pixel selection unit 110 selects, as pixels to be processed, the pixels of the color image that correspond to positions where the distance value given by the input distance image is within a predetermined distance range. To simplify the description, pixels whose distance information is smaller than a predetermined distance threshold are selected here as candidate pixels for the subject. Pixels whose distance information is equal to or greater than the predetermined distance threshold are determined to be background pixels.

The attention degree calculation unit 120 obtains, for the pixels of the color image, a degree of attention, which is a value indicating the degree of change of the image at each pixel position. In the first embodiment in particular, it takes the candidate pixels selected by the pixel selection unit 110 as its target, applies two or more Gaussian filters, and obtains the degree of attention of each pixel from the difference between the filter results.

The attention area determination unit 130 determines, as the attention area, an area in the color image that encloses those pixels, among the pixels whose degrees of attention were calculated by the attention degree calculation unit 120, that lie within a predetermined distance range. For example, a rectangular area enclosing all the pixels within the predetermined distance range is determined as the attention area.

The target area extraction unit 140 sets the parameters of an energy function for the attention area determined by the attention area determination unit 130 and solves the minimization problem of that energy function. This separates the subject area from the background area within the attention area so that, for example, only the subject area is extracted. In other words, the area outside the attention area is fixed as the background area and excluded from the extraction processing in advance.

In the first embodiment, the image processing apparatus of the present invention (image processing unit 1109) is described as a component of the photographing apparatus, but the image processing apparatus may instead be configured as a device separate from the photographing apparatus. For example, it can be realized by a personal computer (PC) that executes an image processing software program and thereby functions as each of the units described above.

The image processing apparatus of FIG. 4 takes as its processing target captured image data from the photographing unit 1102 of the photographing apparatus of FIG. 3, or image data stored in the internal memory 1111 or an external memory; under the control of the CPU 1101, it determines the area in which the subject exists and extracts the subject area. The image processing result of the image processing apparatus of FIG. 4 is encoded by the compression/decompression unit 1110 of the photographing apparatus of FIG. 3, stored in the internal memory 1111 or an external memory, or used for other image processing.

<Operation of the device>
FIG. 5 is an overall flowchart of the image processing according to the first embodiment.

In step S110, the image processing apparatus inputs the color information of the image to be processed and the distance information for each pixel of that image, as color image data and distance image data respectively (distance information input means).

In step S120, the pixel selection unit 110 (pixel selection means) selects candidate pixels for the subject area, namely pixels within a predetermined distance range, on the basis of the input color image data and distance image data. It then determines the area formed by the set of selected pixels (first area). Pixels that were not selected are fixed as pixels of the background area. This processing excludes at least some of the background pixels in advance, with the aim of reducing the load of the subsequent processing and improving the determination accuracy for the attention area described later. Details of step S120 are described later with reference to FIG. 7.

In step S130, the attention degree calculation unit 120 calculates a degree of attention for each of the selected candidate pixels. At this point the set of candidate pixels still contains, in addition to the pixels of the subject area, background pixels that are at about the same distance as the subject. A feature amount is therefore calculated for each candidate pixel in order to distinguish the background pixels contained among the candidate pixels. Details of this processing are described later with reference to FIG. 8.

In step S140, the attention area determination unit 130 determines, as the attention area (second area), an area that encloses those candidate pixels whose degree of attention is equal to or greater than a predetermined threshold. This processing excludes at least some of the background pixels contained among the candidate pixels and thereby reduces the load of the extraction processing described later.
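The determination in step S140 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the patent: the function name `attention_area`, the list-of-lists image layout, and the choice of an axis-aligned bounding rectangle (suggested by the rectangular-area example given for the attention area determination unit 130) are all assumptions.

```python
from typing import List, Optional, Tuple

# Hypothetical rectangle type: (top, left, bottom, right), all inclusive.
Rect = Tuple[int, int, int, int]


def attention_area(attention: List[List[float]],
                   candidate: List[List[bool]],
                   threshold: float) -> Optional[Rect]:
    """Return the smallest rectangle enclosing every candidate pixel whose
    degree of attention is at or above `threshold`, or None if none exists."""
    hits = [(y, x)
            for y, row in enumerate(attention)
            for x, value in enumerate(row)
            if candidate[y][x] and value >= threshold]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return min(ys), min(xs), max(ys), max(xs)
```

Pixels that are not candidates are ignored even if their stored value is high, which mirrors the restriction of step S140 to the candidate pixels.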

In step S150, the target area extraction unit 140 takes the inside of the area determined as the attention area in step S140 as its target, determines whether each pixel belongs to the subject area or to the background area, and extracts the subject area. Details of this processing are described later with reference to FIG. 9.

<Candidate Pixel Selection Processing Based on Distance Information (Step S120)>
FIG. 7 is a flowchart showing the selection processing for candidate pixels of the subject area in the first embodiment.

In step S1201, a distance value T for the pixel selection processing is set. In step S1202, it is determined whether the distance value of the pixel of interest is less than or equal to the set value T. If it is, the process proceeds to step S1203; if the distance value is greater than T, the process proceeds to step S1204.

That is, because a subject is generally something the user photographs intentionally, it is close to the photographing apparatus, whereas the background is far from it. Therefore, by setting the distance range to be processed to a predetermined range nearer than the background, the pixels that fall within that range can be selected as candidate pixels.

In step S1203, the pixel of interest is selected as a candidate pixel. In step S1204, the pixel of interest is determined to be a background pixel.

In step S1205, it is determined whether all the pixels included in the input color image have been judged. If unprocessed pixels remain, a new pixel is taken as the pixel of interest and steps S1202 through S1204 are performed on it. If no unprocessed pixels remain, the selection processing ends.
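The loop of steps S1202 through S1205 amounts to a per-pixel threshold test on the distance image. A minimal sketch follows; the function name `select_candidates` and the use of `None` for pixels whose distance is missing are assumptions made for illustration, and treating missing distances as background is likewise an assumption rather than something stated in this section.

```python
from typing import List, Optional


def select_candidates(distance: List[List[Optional[float]]],
                      t: float) -> List[List[bool]]:
    """Return a mask that is True for candidate pixels (distance <= t, as in
    step S1202) and False for background pixels.

    Assumption: a pixel with no distance estimate (None) is treated as
    background.
    """
    return [[d is not None and d <= t for d in row] for row in distance]
```

For example, with `t = 2.0` a pixel at distance 2.0 becomes a candidate while a pixel at distance 2.5 is fixed as background.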

<Candidate Pixel Attention Calculation Processing (Step S130)>
FIG. 8 is a flowchart showing the attention degree calculation processing in the first embodiment. In particular, an example is described here in which the center-surround difference method shown in Non-Patent Document 3 is executed more efficiently.

In step S1301, the scales of two or more Gaussian filters are determined (scale determination means). Specifically, the number of pixels selected as candidate pixels is counted, and the standard deviations (σ) of two or more Gaussian filters are determined from that count. As an example, two standard deviations are determined here: σ_1 and σ_2, where σ_2 differs from σ_1. Determining the scales from the number of selected candidate pixels means that only a small number of scales need to be determined.

In step S1302, a Gaussian filter based on each of the determined scales is applied to the color image in the area corresponding to the candidate pixels. Here, the Gaussian filters with the standard deviations σ_1 and σ_2 determined in step S1301 are applied.

ステップＳ１３０３では、ステップＳ１３０２で得られた２以上のガウシアンフィルタの処理結果に基づいて、中心−周囲差分法により画素の注目度を計算する。ここでは、σ_１のガウシアンフィルタによる処理結果画像と、σ_２のガウシアンフィルタによる処理結果画像との差分画像を導出する。そして当該差分画像の各画素の値を注目度として導出する。 In step S1303, based on the processing results of the two or more Gaussian filters obtained in step S1302, the attention level of each pixel is calculated by the center-surround difference method. Here, a difference image is derived between the image filtered with the σ₁ Gaussian filter and the image filtered with the σ₂ Gaussian filter. The value of each pixel of the difference image is then taken as the attention level.
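
Steps S1302 and S1303 can be illustrated with a minimal sketch, under several assumptions not stated in the source: the input is a single-channel image stored as a list of rows, borders are handled by clamping, each kernel is truncated at 3σ, and the attention level is the absolute value of the difference. The function names are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at 3 sigma (assumed cutoff)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(values, kernel):
    """Convolve one row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter the rows first, then the columns."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def attention_map(image, sigma1, sigma2):
    """Attention level as the absolute difference of two Gaussian-filtered images."""
    a = gaussian_blur(image, sigma1)
    b = gaussian_blur(image, sigma2)
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Usage: a single bright pixel on a dark background draws attention to itself.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0
attention = attention_map(image, sigma1=1.0, sigma2=2.0)
```

In the first embodiment only the candidate-pixel region would be filtered; the sketch filters the whole array for brevity.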

このように、第１実施形態においては、注目度計算処理の効率を向上するため、候補画素に対してのみ、かつ、少数のガウシアンフィルタに限定することにより処理負荷を低減する。また、ここでは、中心−周囲差分法を利用して注目度を計算する例について説明したが、本質的には、後述する画像１０００ｂに示されるような注目度を導出するものであれば他の方法を利用可能である。例えば、各種のエッジ検出フィルタが利用可能である。 As described above, in the first embodiment, to improve the efficiency of the attention level calculation, the processing load is reduced by processing only the candidate pixels and by using only a small number of Gaussian filters. Although an example of calculating the attention level using the center-surround difference method has been described here, any other method can be used as long as it derives an attention level like that shown in image 1000b, described later. For example, various edge detection filters can be used.

＜被写体領域の抽出処理（ステップＳ１５０）＞
図９は、第１実施形態における被写体領域抽出処理を示すフローチャートである。なお、前景背景分離については、背景技術で説明したグラフカット及びグラブカット及びその変形手法が利用可能である。 <Subject Area Extraction Processing (Step S150)>
FIG. 9 is a flowchart showing subject area extraction processing in the first embodiment. For the foreground/background separation, the graph cut and grab cut methods described in the background art, and variants thereof, can be used.

ステップＳ１５０１では、被写体領域抽出処理の初期設定を行う。ここで、注目領域枠外の領域を背景領域として設定し、注目領域枠内の領域を未確定領域に設定する。そして、以下では、当該未確定領域に対して、背景領域の画素か被写体領域の画素かの判定が行われる。なお、上述の画素の注目度に基づき、被写体領域抽出に使用する初期値を制御するよう構成しても良い。 In step S1501, the subject area extraction processing is initialized. The area outside the attention area frame is set as the background area, and the area inside the attention area frame is set as the undetermined area. In the subsequent steps, each pixel of the undetermined area is judged to be either a background area pixel or a subject area pixel. Note that the initial values used for subject area extraction may be controlled based on the pixel attention levels described above.

ステップＳ１５０２では、背景領域と未確定領域の色情報を解析する。この処理では、クラスタリング処理により、背景領域を複数の色特性の違う複数の背景クラスタに分け、同様に、未確定領域も複数の色特性の違う複数の被写体候補クラスタに分ける。このクラスタリング処理は、公知の混合ガウス分布の推定手法を利用することが可能である。 In step S1502, the color information of the background area and the undetermined area is analyzed. In this process, the background region is divided into a plurality of background clusters having different color characteristics by clustering processing, and similarly, the undetermined region is also divided into a plurality of subject candidate clusters having different color characteristics. This clustering process can use a known method of estimating a mixed Gaussian distribution.

ステップＳ１５０３では、色画像の処理結果によりエネルギー関数のパラメータを計算する。具体的には、ある注目画素と該注目画素の近傍画素とに関するそれぞれの色情報及び距離情報に基づいて、当該注目画素に対するパラメータを導出する。具体的には、エネルギー関数のパラメータを計算する。例えば、パラメータとして、
類似度：注目画素の近傍画素との類似度
背景尤度：注目画素がどれぐらい背景に近いかを示す度合い
前景尤度：注目画素がどれぐらい前景に近いかを示す度合い
が導出される。 In step S1503, the parameters of the energy function are calculated from the processing result of the color image. Specifically, the parameters for a given pixel of interest are derived based on the color information and distance information of that pixel and of its neighboring pixels. For example, the following parameters are derived:
Similarity: the similarity between the pixel of interest and its neighboring pixels
Background likelihood: a measure of how close the pixel of interest is to the background
Foreground likelihood: a measure of how close the pixel of interest is to the foreground

ステップＳ１５０４では、ステップＳ１５０３により計算されたパラメータに基づき、エネルギー関数の最小化問題を解き、前景領域と背景領域をカットする。このエネルギー関数の最小化は、グラフ理論でのネットワークフロー問題で、公知のフォード・ファルカーソンのアルゴリズムなどを利用することが可能である。 In step S1504, the energy function minimization problem is solved based on the parameters calculated in step S1503, and the foreground region and the background region are cut apart. This energy function minimization is a network flow problem in graph theory, and a well-known method such as the Ford-Fulkerson algorithm can be used.
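
As an illustration of the network flow formulation in step S1504, the following is a minimal sketch of the Ford-Fulkerson method with breadth-first augmenting paths (the Edmonds-Karp variant). The graph layout, the terminal names "fg" and "bg", and the capacities are hypothetical; the source does not specify how the similarity and likelihood parameters map onto edge capacities.

```python
from collections import deque

def min_cut(capacity, source, sink):
    """Max-flow / min-cut via Ford-Fulkerson with BFS augmenting paths.

    capacity: dict mapping node -> {neighbor: capacity}.
    Returns (max_flow, set of nodes on the source side of the minimum cut).
    """
    # Build the residual graph, adding zero-capacity reverse edges.
    residual = {}
    for u, nbrs in capacity.items():
        residual.setdefault(u, {})
        for v, c in nbrs.items():
            residual[u][v] = residual[u].get(v, 0) + c
            residual.setdefault(v, {}).setdefault(u, 0)
    residual.setdefault(source, {})
    residual.setdefault(sink, {})

    def reachable():
        """BFS over edges with remaining capacity; returns a parent map."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable()
        if sink not in parent:
            # No augmenting path remains: reachable nodes form the source side.
            return flow, set(parent)
        # Find the bottleneck capacity along the path, then push flow along it.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Usage: two pixels "p" and "q" linked to each other and to both terminals.
graph = {
    "fg": {"p": 9, "q": 2},   # links from the foreground terminal
    "p": {"bg": 1, "q": 3},   # link to the background terminal, neighbor link
    "q": {"bg": 8, "p": 3},
}
flow, foreground_side = min_cut(graph, "fg", "bg")
```

In this toy graph the cut leaves "p" on the foreground side and "q" on the background side.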

ステップＳ１５０５では、エネルギー関数の流量が小さくなるか、あるいは、指定する反復回数になるかを判断する。上記の条件の何れかを満たす場合、ステップＳ１５０６に進み、前景背景分離結果を出力する。一方、上記の条件の何れも満たさない場合、ステップＳ１５０７に進み、カットされた前景領域、背景領域をそれぞれ新たな未確定領域、確定背景領域として設定し、ステップＳ１５０２に戻りステップＳ１５０４までの処理を繰り返す。 In step S1505, it is determined whether the flow of the energy function has become small or the specified number of iterations has been reached. If either condition is satisfied, the process advances to step S1506, and the foreground/background separation result is output. If neither condition is satisfied, the process proceeds to step S1507, where the cut foreground area and background area are set as the new undetermined area and the confirmed background area, respectively; the process then returns to step S1502, and the processing up to step S1504 is repeated.

＜処理結果例＞
図１０は、第１実施形態の画像処理における各処理段階での結果の例を示す図である。 <Example of processing results>
FIG. 10 is a diagram illustrating an example of a result at each processing stage in the image processing according to the first embodiment.

画像１０００ａは、色画像である画像６００ａ及び距離画像である画像６００ｂに対し、上述の選定処理（ステップＳ１２０）により得られた候補画素群の一例を示している。この例では、除外された背景候補画素は黒色で表され、選定された候補画素は色画像における色で表されている。ただし、画像１０００ａに示されるように、背景である”置き台”も被写体と同程度の距離に位置していることから、当該置き台も候補画素として選定されている。 The image 1000a shows an example of a candidate pixel group obtained by the above selection process (step S120) for the image 600a that is a color image and the image 600b that is a distance image. In this example, the excluded background candidate pixels are represented in black, and the selected candidate pixels are represented in colors in the color image. However, as shown in the image 1000a, the “table” that is the background is located at the same distance as the subject, so the table is also selected as a candidate pixel.

画像１０００ｂは、注目度の計算結果例である。画像１０００ｂは、各画素の注目度を明度で表したものであり、明度が高い程、視覚的に注目されやすいことを示している。なお、この例では、注目度の傾向を分かりやすく説明するために、被写体と背景の境界部分、置き台の区切り部分、それ以外の背景部分をそれぞれ均一な注目値で表している。 The image 1000b is an example of the calculation result of the attention level. The image 1000b represents the degree of attention of each pixel in terms of lightness, and indicates that the higher the lightness, the more visually noticeable. In this example, in order to explain the tendency of the attention level in an easy-to-understand manner, the boundary portion between the subject and the background, the separation portion of the pedestal, and the other background portions are represented by uniform attention values.

画像１０００ｃは、所定値より大きい注目度を有する画素を包含する矩形領域として注目領域を決定した例を示している。ここでは、画像１０００ａで含まれていた”置き台”の大部分が背景として除外されていることが分かる。 The image 1000c shows an example in which the attention area is determined as a rectangular area including pixels having a degree of attention greater than a predetermined value. Here, it can be seen that most of the “table” included in the image 1000a is excluded as a background.
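
The rectangle determination illustrated by image 1000c can be sketched as follows. This is a minimal example assuming the attention levels are held as a list of rows and that the comparison is strictly greater than the threshold; the function name and return convention are hypothetical.

```python
def attention_bounding_box(attention, threshold):
    """Return (top, left, bottom, right), inclusive, of the smallest rectangle
    enclosing every pixel whose attention level exceeds the threshold.
    Returns None if no pixel qualifies."""
    if not attention or not attention[0]:
        return None
    rows = [r for r, row in enumerate(attention)
            if any(v > threshold for v in row)]
    if not rows:
        return None
    cols = [c for c in range(len(attention[0]))
            if any(row[c] > threshold for row in attention)]
    return rows[0], cols[0], rows[-1], cols[-1]

# Usage: two high-attention pixels define a 2 x 2 attention area.
attention = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.2],
    [0.0, 0.0, 0.0, 0.0],
]
box = attention_bounding_box(attention, threshold=0.5)
```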

画像１０００ｄは、画像１０００ｃに示される注目領域を対象として被写体領域の抽出処理を行うことにより、被写体が好適に抽出されている例を示している。すなわち、画像６００ａに示される背景の特徴部分を、被写体として誤って抽出することを防止することに成功している。 An image 1000d shows an example in which a subject is suitably extracted by performing subject region extraction processing on the target region shown in the image 1000c. That is, the background feature portion shown in the image 600a has been successfully prevented from being erroneously extracted as a subject.

以上説明したように、第１実施形態によれば、入力画像の各画素に対応する距離情報を利用することにより、背景であると確定できる画像領域を予め抽出対象から除外する。これにより、図２の画像２００ｂに示されるような誤判定を未然に防ぐことができる。その結果、複雑な画像であっても、被写体領域の抽出をより正確に実行することを可能としている。また、第１実施形態においては、処理負荷が比較的大きいエネルギー関数の最小化問題を解く処理（ステップＳ１５０４）は、入力画像の一部領域（注目領域）に対してのみ実行されるため、処理負荷を大幅に低減することが可能になる。 As described above, according to the first embodiment, by using distance information corresponding to each pixel of the input image, an image region that can be determined to be the background is excluded in advance from the extraction target. Thereby, an erroneous determination as shown in the image 200b of FIG. 2 can be prevented in advance. As a result, the subject area can be extracted more accurately even for a complex image. In the first embodiment, the process for solving the energy function minimization problem (step S1504) with a relatively large processing load is executed only for a partial area (attention area) of the input image. The load can be greatly reduced.

また、上述したように、第１実施形態では、距離情報により選定された候補画素の個数に基づいてガウシアンフィルタのスケール（標準偏差）を決定した。これにより、従来より少ない個数のガウシアンフィルタにより効率的に注目度画像を生成することが可能となっている。なお、上述の説明では、注目度の計算に色情報のみを使用した。しかし、色情報の他、輝度、テクスチャなどをさらに使用しても良い。その場合、各特徴から計算される注目度を重み付けることになる。 As described above, in the first embodiment, the scale (standard deviation) of the Gaussian filter is determined based on the number of candidate pixels selected based on the distance information. This makes it possible to efficiently generate the attention level image with a smaller number of Gaussian filters than in the past. In the above description, only the color information is used for calculating the attention level. However, brightness, texture, etc. may be further used in addition to the color information. In that case, the degree of attention calculated from each feature is weighted.

（第２実施形態）
第２実施形態では、距離情報に基づいて候補画素に対し注目度の重みを設定し、重み付けされた注目度により注目領域を決定する形態について説明する。 (Second Embodiment)
In the second embodiment, a mode in which a weight of attention is set for a candidate pixel based on distance information and a region of interest is determined based on the weighted attention will be described.

＜装置構成＞
図１１は、第２実施形態に係る画像処理装置の機能ブロック図である。画像処理装置は、重み設定部２１０、注目度計算部２２０、注目領域判定部２３０、対象領域抽出部２４０を備えている。そして、第１実施形態と同様、入力された画像の色情報及び距離情報に基づき、画像に含まれる被写体領域を抽出する。つまり、画素選定部１１０の代わりに重み設定部２１０を備えている部分が第１実施形態と異なる。他の機能部は第１実施形態の対応する機能部と同様であるため説明は省略する。 <Device configuration>
FIG. 11 is a functional block diagram of the image processing apparatus according to the second embodiment. The image processing apparatus includes a weight setting unit 210, an attention level calculation unit 220, an attention region determination unit 230, and a target region extraction unit 240. As in the first embodiment, a subject area included in the image is extracted based on the color information and distance information of the input image. The difference from the first embodiment is that the weight setting unit 210 is provided in place of the pixel selection unit 110. The other functional units are the same as the corresponding functional units in the first embodiment, and their description is omitted.

重み設定部２１０では、各画素の距離値が所定距離範囲内であるか否かに基づいて、注目度計算部２２０で算出する注目度に対する重みを設定する。ここでは、距離値が所定距離範囲内である画素に対しては重みを高く設定し、所定距離範囲外の画素に対しては重みを低く設定する。 The weight setting unit 210 sets a weight for the attention level calculated by the attention level calculation unit 220 based on whether or not the distance value of each pixel is within a predetermined distance range. Here, a high weight is set for pixels whose distance value is within the predetermined distance range, and a low weight is set for pixels outside the predetermined distance range.

＜装置の動作＞
図１２は、第２実施形態に係る画像処理の全体フローチャートである。ステップＳ２２０において、注目度に対する重みを設定し、ステップＳ２３０において、設定された重みに基づく注目度を算出する点が第１実施形態と異なる。なお、この処理の詳細については図１３及び図１４を参照して後述する。 <Operation of the device>
FIG. 12 is an overall flowchart of image processing according to the second embodiment. It differs from the first embodiment in that a weight for the attention level is set in step S220 and an attention level based on the set weight is calculated in step S230. Details of this processing will be described later with reference to FIGS. 13 and 14.

＜距離情報による注目度重みの設定処理（ステップＳ２２０）＞
図１３は、重み設定処理を示すフローチャートである。 <Attention level weight setting process based on distance information (step S220)>
FIG. 13 is a flowchart showing the weight setting process.

ステップＳ２２０１では、画素選定処理のための距離値Ｔを設定し、ステップＳ２２０２では、画素の距離値が設定値Ｔ以下であるかどうかを判断する。そして、画素の距離値が設定値Ｔ以下であればステップＳ２２０３に進み、画素の距離値が設定値Ｔより大きい場合はステップＳ２２０４に進む。 In step S2201, a distance value T for pixel selection processing is set. In step S2202, it is determined whether or not the pixel distance value is equal to or smaller than the set value T. If the pixel distance value is less than or equal to the set value T, the process proceeds to step S2203. If the pixel distance value is greater than the set value T, the process proceeds to step S2204.

ステップＳ２２０３では、着目画素に対する重みを高い値（例えば１より大きい値）に設定し、ステップＳ２２０４では、着目画素に対する重みを低い値（例えば１より小さい値）に設定する。 In step S2203, the weight for the pixel of interest is set to a high value (for example, a value greater than 1), and in step S2204, the weight for the pixel of interest is set to a low value (for example, a value less than 1).

ステップＳ２２０５では、入力された色画像に含まれる全ての画素が判定済みであるかどうかを判断する。未処理の画素が残っていれば、新しい画素を着目画素としてステップＳ２２０２からステップＳ２２０４までの処理を行う。一方、未処理の画素が残っていなければ重み設定処理を終える。 In step S2205, it is determined whether all the pixels included in the input color image have been determined. If unprocessed pixels remain, the process from step S2202 to step S2204 is performed with the new pixel as the target pixel. On the other hand, if no unprocessed pixels remain, the weight setting process is terminated.
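
The weight setting loop of steps S2202 to S2205 can be sketched as follows. The concrete weight values 2.0 and 0.5 are assumptions chosen only to satisfy "greater than 1" and "less than 1", and the function name is hypothetical.

```python
def set_weights(distance, t, high=2.0, low=0.5):
    """Assign the high weight to pixels whose distance value is at most t,
    and the low weight to all other pixels (steps S2202 to S2204)."""
    return [[high if d <= t else low for d in row] for row in distance]

# Usage: pixels at distance 2.0 or less are emphasized.
distance = [
    [0.5, 3.0],
    [2.0, 2.1],
]
weights = set_weights(distance, t=2.0)
```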

＜画素の注目度計算処理（ステップＳ２３０）＞
図１４は、第２実施形態における注目度計算処理を示すフローチャートである。

図８は、第１実施形態における注目度計算処理を示すフローチャートである。第１実施形態と同様、非特許文献３に示される中心−周囲差分（center-surround）法をベースとした例について述べる。 <Pixel Attention Level Calculation Processing (Step S230)>
FIG. 14 is a flowchart showing attention level calculation processing in the second embodiment.

FIG. 8 is a flowchart showing attention level calculation processing in the first embodiment. Similar to the first embodiment, an example based on the center-surround method shown in Non-Patent Document 3 will be described.

ステップＳ２３０１では、入力された色画像全体を対象として互いに異なるスケール（標準偏差σ）が設定された複数個のガウシアンフィルタを適用する。 In step S2301, a plurality of Gaussian filters set with different scales (standard deviation σ) for the entire input color image are applied.

ステップＳ２３０２では、ステップＳ２３０１で得られた複数のガウシアンフィルタの処理結果に基づいて、中心−周囲差分法により画素の注目度を計算する。 In step S2302, the attention level of the pixel is calculated by the center-surround difference method based on the processing results of the plurality of Gaussian filters obtained in step S2301.

ステップＳ２３０３では、ステップ２３０２により算出された各画素の注目度に対し、重みの設定処理（ステップＳ２２０）で設定した重みにより注目度の重み付けを行う。 In step S2303, the attention level of each pixel calculated in step S2302 is weighted by the weight set in the weight setting process (step S220).
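
A minimal sketch of step S2303 follows, assuming that "weighting" means multiplying each attention level by the weight of the same pixel; the source does not state the exact operation, and the function name is hypothetical.

```python
def apply_weights(attention, weights):
    """Multiply each pixel's attention level by its distance-based weight."""
    return [[a * w for a, w in zip(a_row, w_row)]
            for a_row, w_row in zip(attention, weights)]

# Usage: a distant pixel with high raw attention is brought down to the
# level of a nearby pixel with low raw attention.
attention = [[0.2, 0.8]]
weights = [[2.0, 0.5]]   # near pixel weighted up, far pixel weighted down
weighted = apply_weights(attention, weights)
```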

このように、第２実施形態においては、距離情報に基づいて候補画素に対し注目度の重み付けを行うことにより、第１実施形態と同様、好適に背景領域を除外した注目領域を設定することが可能となる。なお、上述の説明では、第１実施形態における、画素選定部１１０は利用しない場合について説明したが、画素選定部１１０を併せて利用するよう構成することも可能である。 As described above, in the second embodiment, weighting the attention levels of the candidate pixels based on the distance information makes it possible, as in the first embodiment, to set an attention area from which the background area is suitably excluded. The above description covers the case where the pixel selection unit 110 of the first embodiment is not used, but the apparatus may also be configured to use the pixel selection unit 110 in combination.

（第３実施形態）
第３実施形態では、入力画像内に抽出対象となる被写体領域が複数存在する場合に、より好適に当該複数の被写体領域を自動抽出可能とする形態について説明する。 (Third embodiment)
In the third embodiment, a description will be given of a mode in which a plurality of subject areas can be automatically extracted more suitably when there are a plurality of subject areas to be extracted in the input image.

＜装置構成＞
図１５は、第３実施形態に係る画像処理装置の機能ブロック図である。画像処理装置は、距離レイヤ化部３１０、注目領域判定部３２０、対象領域抽出部３３０を備えている。そして、第１実施形態や第２実施形態と同様、入力された画像の色情報及び距離情報に基づき、画像に含まれる被写体領域を抽出する。 <Device configuration>
FIG. 15 is a functional block diagram of an image processing apparatus according to the third embodiment. The image processing apparatus includes a distance layering unit 310, a region of interest determination unit 320, and a target region extraction unit 330. Then, similarly to the first embodiment and the second embodiment, a subject region included in the image is extracted based on the color information and distance information of the input image.

距離レイヤ化部３１０では、入力距離画像により距離範囲を２以上の範囲に分割し、各距離範囲に含まれる画素をそれぞれ選定する。注目領域判定部３２０では、各距離レイヤに含まれる画素を処理対象とし、注目度を算出し、注目度の高い領域を判定する。対象領域抽出部３３０では、各距離レイヤにおける注目領域から、各レイヤにおける被写体領域を抽出する。 The distance layering unit 310 divides the distance range into two or more ranges based on the input distance image, and selects the pixels included in each distance range. The attention area determination unit 320 takes the pixels included in each distance layer as processing targets, calculates their attention levels, and determines the regions with high attention levels. The target area extraction unit 330 extracts the subject area in each layer from the attention area in that distance layer.

＜装置の動作＞
図１６は、第３実施形態に係る画像処理の全体フローチャートである。 <Operation of the device>
FIG. 16 is an overall flowchart of image processing according to the third embodiment.

ステップＳ３０１では、画像処理装置が、処理対象の画像の色情報及び、当該画像に含まれる各画素に対する距離情報を、それぞれ、色画像データ及び距離画像データとして入力する。 In step S301, the image processing apparatus inputs color information of an image to be processed and distance information for each pixel included in the image as color image data and distance image data, respectively.

ステップＳ３０２では、距離レイヤ化部３１０が、入力された距離画像に基づいて、色画像をレイヤ化する。具体的には、入力された距離画像から導出される各画素の距離情報に基づいて、色画像に含まれる画素のクラスタリング処理を行い、色画像に含まれる画素を複数の画素グループに分離する。すなわち、各画素グループは、互いに類似の距離値を持つ複数の画素の集合として設定され、以下では距離レイヤとも呼ぶ。 In step S302, the distance layering unit 310 layers the color image based on the input distance image. Specifically, clustering processing of pixels included in the color image is performed based on distance information of each pixel derived from the input distance image, and the pixels included in the color image are separated into a plurality of pixel groups. That is, each pixel group is set as a set of a plurality of pixels having distance values similar to each other, and is hereinafter also referred to as a distance layer.
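
The clustering in step S302 is not specified further in the source. As one possible illustration, the sketch below groups distance values into layers with a simple one-dimensional k-means; the choice of k-means, the evenly spaced initial centers, the fixed iteration count, and the function name are all assumptions.

```python
def layer_by_distance(distances, k, iterations=20):
    """Cluster a flat list of distance values into k distance layers.

    Returns (labels, centers): a layer index for each input value and the
    mean distance of each layer.
    """
    lo, hi = min(distances), max(distances)
    # Spread the initial centers evenly over the observed distance range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(distances)
    for _ in range(iterations):
        # Assign every value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(d - centers[c]))
                  for d in distances]
        # Move each center to the mean of its members (if it has any).
        for c in range(k):
            members = [d for d, lab in zip(distances, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers

# Usage: three groups of pixels at clearly separated distances.
distances = [1.0, 1.2, 0.9, 5.0, 5.1, 9.8, 10.0]
labels, centers = layer_by_distance(distances, k=3)
```

The minimum and maximum distance of the members of each layer could then serve as the per-layer distance range set in step S303.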

ステップＳ３０３では、距離レイヤ化部３１０が、距離のクラスタリングの結果に基づいて、各距離レイヤにおいて後述する処理対象画素を選定するための距離範囲を設定する。以降、ステップＳ３０４からステップＳ３０８までは、各距離レイヤについて、注目領域判定部３２０による注目領域の判定、及び、対象領域抽出部３３０による被写体領域の抽出がそれぞれ個別に処理される。 In step S303, the distance layering unit 310 sets a distance range for selecting a processing target pixel to be described later in each distance layer based on the result of distance clustering. Thereafter, from step S304 to step S308, for each distance layer, determination of the attention area by the attention area determination section 320 and extraction of the subject area by the target area extraction section 330 are individually processed.

ステップＳ３０４では、注目領域判定部３２０が、着目距離レイヤにおいて、画素の距離値が、当該着目距離レイヤに対して設定された距離範囲内であるか否かを判定する。画素の距離値が距離範囲の設定値以内であればステップＳ３０５に進み、当該画素を処理対象画素とする。一方、画素の距離値は距離範囲外であればステップＳ３０６に進み、当該画素を非処理対象画素とする。この処理は着目距離レイヤに含まれる全ての画素に対して実行される。 In step S304, the attention area determination unit 320 determines whether or not the pixel distance value is within the distance range set for the attention distance layer in the attention distance layer. If the distance value of the pixel is within the set value of the distance range, the process proceeds to step S305, and the pixel is set as a processing target pixel. On the other hand, if the distance value of the pixel is out of the distance range, the process proceeds to step S306, and the pixel is set as a non-processing target pixel. This process is executed for all pixels included in the target distance layer.

ステップＳ３０７では、注目領域判定部３２０が、処理対象画素を対象として注目領域を判定する。なお、注目領域の判定処理は、上述の第１実施形態及び第２実施形態における注目度計算処理及び判定処理（ステップＳ１３０及びＳ１４０）と同様であるため説明は省略する。 In step S307, the attention area determination unit 320 determines the attention area for the processing target pixel. Note that the attention area determination processing is the same as the attention calculation processing and determination processing (steps S130 and S140) in the first embodiment and the second embodiment described above, and a description thereof will be omitted.

ステップＳ３０８では、対象領域抽出部３３０が、注目領域と判定された領域の内部を対象として、各画素が被写体領域の画素か否かを判定し、被写体領域を抽出する。なお、抽出処理は、上述の第１実施形態及び第２実施形態における抽出処理（ステップＳ１５０）と同様であるため説明は省略する。 In step S308, the target area extraction unit 330 determines, for the inside of the area determined as the attention area, whether each pixel is a pixel in the subject area, and extracts the subject area. The extraction process is the same as the extraction process (step S150) in the first embodiment and the second embodiment described above, and a description thereof will be omitted.

ステップＳ３０９では、画像処理装置が、すべての距離レイヤに対し処理済みであるかどうかを判断する。未処理の距離レイヤが残っていれば、新しい距離レイヤを着目距離レイヤとしてステップＳ３０４からステップＳ３０８までの処理を行う。一方、未処理の距離レイヤが残っていなければ処理を終える。 In step S309, the image processing apparatus determines whether all distance layers have been processed. If an unprocessed distance layer remains, the process from step S304 to step S308 is performed with the new distance layer as the target distance layer. On the other hand, if there is no unprocessed distance layer remaining, the process ends.

＜距離レイヤの画像例＞
図１７は、入力画像を複数の距離レイヤに分ける例を示す図である。画像１７００ａでは、”花”及び、”蝶”という２つの被写体（対象領域）が存在する。更に、対象領域である花の近くに”背景の特徴部”がある。そして、各領域に対する距離値が、画像１７００ｂにより示されている。図６の画像６００ｂと同様、距離の遠近は明度で表されており、ここでは、明度が高い程、撮影機器までの距離が近いことを示している。この場合、画像１７００ａは、被写体である”花”、被写体である”蝶”及び”背景の特徴部”の３つの距離レイヤとして設定される。 <Example of distance layer image>
FIG. 17 is a diagram illustrating an example in which an input image is divided into a plurality of distance layers. In the image 1700a, there are two subjects (target regions), a "flower" and a "butterfly". In addition, there is a "background feature" near the flower, which is a target region. The distance value for each region is shown in the image 1700b. As in the image 600b in FIG. 6, distance is expressed by brightness; here, the higher the brightness, the shorter the distance to the photographing device. In this case, three distance layers are set for the image 1700a: the subject "flower", the subject "butterfly", and the "background feature".

このように、距離範囲に応じてレイヤ化し、それぞれのレイヤで対象領域抽出処理を行うことにより、被写体（ここでは、”花”及び”蝶”）の抽出漏れを防ぎつつ、背景に含まれる特徴部も併せて抽出することが可能となる。 In this way, by layering the image according to distance range and performing target area extraction processing on each layer, it becomes possible to extract the feature portion included in the background as well, while preventing the subjects (here, the "flower" and the "butterfly") from being missed.

（その他の実施例）
また、本発明は、以下の処理を実行することによっても実現される。即ち、上述した実施形態の機能を実現するソフトウェア（プログラム）を、ネットワーク又は各種記憶媒体を介してシステム或いは装置に供給し、そのシステム或いは装置のコンピュータ（またはＣＰＵやＭＰＵ等）がプログラムを読み出して実行する処理である。 (Other examples)
The present invention can also be realized by executing the following processing. That is, software (program) that realizes the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or CPU, MPU, or the like) of the system or apparatus reads the program. It is a process to be executed.

Claims

Distance information input means for inputting distance information from the viewpoint for each pixel of the image;
Pixel selection means for selecting a pixel in which the distance information is within a predetermined distance range among the pixels included in the image;
Attention level calculation means for calculating a degree of attention, which is a value indicating the degree of change of the image at each pixel position, for a first region including all of the pixels selected by the pixel selection means;
Determining means for determining, as an area including the subject area, a second area including all pixels having an attention level calculated by the attention level calculating means equal to or greater than a predetermined threshold;
An image processing apparatus comprising:

The image processing apparatus according to claim 1, wherein the attention level calculation means applies two or more mutually different Gaussian filters to the first region, and calculates the attention level as a difference between the filter results of the two or more Gaussian filters.

The image processing apparatus according to claim 2, wherein the attention level calculation means includes scale determination means for determining a scale of each of the two or more Gaussian filters based on the number of pixels included in the first region.

The image processing apparatus according to claim 1, wherein the determining means is configured to determine the second region as a rectangular region, and the apparatus further comprises extraction means for extracting the subject area from the second region.

The image processing apparatus according to any one of claims 1 to 4, further comprising weighting means for weighting the attention level of each pixel included in the first region, as calculated by the attention level calculation means, based on the distance information corresponding to that pixel.

The image processing apparatus according to claim 1, further comprising separating means for separating the image into two or more distance layers based on the distance information corresponding to each pixel included in the image, wherein the processing by the pixel selection means, the attention level calculation means, and the determining means is performed individually for each of the two or more distance layers.

A distance information input step in which the distance information input means inputs distance information from the viewpoint for each pixel of the image;
A pixel selection step in which the pixel selection means selects a pixel in which the distance information is within a predetermined distance range among the pixels included in the image;
An attention level calculation step in which attention level calculation means calculates an attention level, which is a value indicating the degree of change of the image at each pixel position, for a first region including all of the pixels selected in the pixel selection step;
A determining step for determining, as a region including the subject region, a second region including all of the pixels having an attention level calculated by the attention level calculation step equal to or greater than a predetermined threshold;
An image processing method comprising:

A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 6.