JP2012252648A

JP2012252648A - Image processing device, image processing method, program, imaging device, and television receiver

Info

Publication number: JP2012252648A
Application number: JP2011126626A
Authority: JP
Inventors: Daisuke Murayama; 大輔村山; Kenichi Iwauchi; 謙一岩内; Tomoya Shimura; 智哉紫村; Shinichi Arita; 真一有田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2011-06-06
Filing date: 2011-06-06
Publication date: 2012-12-20
Anticipated expiration: 2031-06-06
Also published as: JP5694060B2

Abstract

PROBLEM TO BE SOLVED: To provide an image processing device capable of extracting only a specific subject, without advance preparation, even when a plurality of subjects at the same depth are present in an image. SOLUTION: A television receiver 1 acquires an original image having distance information, related to distance in the image depth direction, at each of a plurality of positions on the image, together with a replacement image to replace a part of the original image, and composites an extracted image, taken from a part of the acquired original image, with the replacement image. The television receiver 1 comprises a threshold value setting part that sets a threshold value for the distance information on the basis of each position on the original image, and an extracted image generation part that extracts a part of the original image on the basis of the set threshold value and the distance information and outputs that part as an extracted image.

Description

The present invention relates to an image processing apparatus, an image processing method, a program, an imaging apparatus, and a television receiver that extract a part of an original image and composite the extracted image with another image.

When a still image or a moving image is acquired with an imaging apparatus, a landscape shot or a commemorative photograph does not normally require processing such as extracting only the subject and replacing the background.
On the other hand, there is also demand for extracting a specific subject and replacing the rest of the image, such as the background, rather than using the acquired image as it is. For example, in a videophone call or a video conference it is sufficient that each party's face is displayed to the other, and the background image is basically unnecessary. In particular, when the area forming the background is untidy, it is desirable to remove the background image.
Also, a machine such as a print sticker machine, which photographs the user and prints the image, decorated with letters and patterns, as a sticker, has a curtain attached to the machine itself to block out the background. This keeps an obtrusive background out of the picture, and the need to separate the person from the background is common.
Against this backdrop, techniques for extracting a specific subject from an image and separating it from the background have been proposed.

Examples include the chroma key technique, in which a subject such as a person is photographed against a background of a single color (blue) and only the portions whose color differs from the background color are extracted, so that only the subject is obtained, and a technique in which an image of the background alone is acquired in advance and the difference between an image of the background with the subject and the background-only image is taken to obtain an image of only the subject.

There is also a technique that employs a device for acquiring the distance from the camera to the object and uses depth values derived from the acquired distance. A depth value is a value indicating the distance in the depth direction of each of the image parts that make up an image. The depth value is calculated using, for example, the distance between the subject and the imaging device used for imaging. Methods for measuring the distance between the imaging device and each subject include the TOF (Time Of Flight) method and the stereo method.

The TOF method measures distance by emitting invisible light such as infrared light from a light source such as an LED (Light Emitting Diode) and measuring the time of flight taken for the light to strike the subject, reflect, and return. By performing this measurement for each of many finely divided areas, the distance to various parts of the subject can be measured, not just the distance to a single point.
The time of flight can be measured, for example, by emitting pulsed laser light and timing the interval from the emission of a pulse until the reflected light returns, or by modulating the emitted infrared light and calculating the time from the phase difference between the emitted light and the reflected light.
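The two time-of-flight measurements described above can be illustrated with a minimal sketch. The function names and the continuous-wave modulation model are assumptions made for illustration and are not taken from this document; the formulas are the standard round-trip relations, in which the light covers the camera-to-subject distance twice.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def distance_from_pulse(round_trip_s: float) -> float:
    """Pulse method: the light travels to the subject and back, so halve the path."""
    return C * round_trip_s / 2.0


def distance_from_phase(phase_shift_rad: float, mod_freq_hz: float) -> float:
    """Phase method (assumed sinusoidal modulation): a phase shift of 2*pi
    corresponds to a round trip of one modulation wavelength, c / f."""
    return C * phase_shift_rad / (4.0 * math.pi * mod_freq_hz)


if __name__ == "__main__":
    print(distance_from_pulse(20e-9))               # 20 ns round trip
    print(distance_from_phase(math.pi / 2, 10e6))   # quarter-cycle shift at 10 MHz
```

Note that the phase method is only unambiguous up to a round trip of one modulation wavelength; larger distances wrap around.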

In the stereo method, the same region is imaged by two imaging devices arranged substantially in parallel, the parallax between corresponding pixels in the two resulting images is obtained, and the distance is calculated from the parallax. Another measurement method derives the distance from the intensity of the emitted infrared light and the intensity of the reflected light.

In the stereo method, finding the corresponding pixels in the two images is called stereo matching. For example, the following processing is performed. For a given pixel in one captured image, pixel matching is performed by scanning the other captured image in the horizontal direction. Pixel matching is carried out in units of blocks centered on the pixel of interest: the SAD (Sum of Absolute Differences), the sum of the absolute differences between the pixels of the two blocks, is calculated, and the block that minimizes the SAD is selected. This yields the pixel in the other captured image that corresponds to the pixel of interest in the first captured image. Besides SAD, there are other calculation techniques such as SSD (Sum of Squared Intensity Difference), graph cuts, and DP (Dynamic Programming) matching. Once the corresponding pixel is found, the parallax value of that pixel can be calculated.
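The SAD block matching described above might be sketched as follows. This is an illustrative sketch, not the implementation of any particular device: the function names, the block half-width `half`, and the search range `max_disp` are hypothetical, the images are assumed to be rectified grayscale images stored as lists of rows, and the left image is assumed to be the reference so that a point appears further to the left in the right image.

```python
def sad(left, right, y, x_left, x_right, half):
    """Sum of absolute differences between two (2*half+1)-square blocks,
    one centered at (y, x_left) in `left`, one at (y, x_right) in `right`."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x_left + dx] - right[y + dy][x_right + dx])
    return total


def disparity_at(left, right, y, x, half=1, max_disp=4):
    """Scan the right image horizontally and return the disparity whose block
    has the minimum SAD. The caller must keep the block around (y, x) inside
    the image; only the left edge of the search range is checked here."""
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        xr = x - d  # assumption: the match lies d pixels to the left in the right image
        if xr - half < 0:
            break
        cost = sad(left, right, y, x, xr, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Replacing the absolute difference with a squared difference turns the same loop into SSD matching.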

By doing this for every pixel of one captured image, the parallax values of all pixels are calculated, and the distance from the imaging devices to the subject can then be calculated by triangulation using the parallax values and the positional relationship between the two imaging devices. The parallax values can also be calculated when the two imaging units are arranged one above the other rather than side by side; in that case the captured image is scanned in the vertical direction instead of the horizontal direction.
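For the common case of two parallel, rectified cameras, the triangulation step reduces to a single formula. The sketch below assumes a pinhole camera model with the focal length expressed in pixels and the baseline (the spacing between the two cameras) in meters; these parameters and the function name are illustrative assumptions, not values given in this document.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Distance Z = f * B / d for parallel cameras (assumed pinhole model).

    A disparity of zero means the point is effectively at infinity.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical setup: 800 px focal length, cameras 10 cm apart.
    for d in (5, 10, 20, 40):
        print(d, depth_from_disparity(d, focal_px=800.0, baseline_m=0.1))
```

Because distance is inversely proportional to disparity, nearby subjects are resolved much more finely than distant ones.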

FIGS. 28 and 29 are explanatory diagrams illustrating an example of image extraction according to the prior art. Assume that the subject F33 is to be extracted from an image containing the subjects F31, F32, F33, and F34, as shown in FIG. 28. FIG. 29 shows the positional relationship among the subjects F31 to F34; as can be seen from FIG. 29, F33 is positioned nearest the front.

FIG. 30 is an explanatory diagram conceptually showing distance information between the imaging device and the subjects. It is a conceptual rendering of what is called a depth map. Normally the depth value associated with each pixel is represented as a 256-level grayscale image, but hatching is used here for clarity.
In the example of FIG. 30, the depth values are denoted by the numbers 1, 2, ..., 5 in ascending order, starting from the shortest distance. As is clear from the correspondence between the hatching and the depth values, the figure expresses the positional relationship shown in FIG. 29, in which the subject F33 is nearest the front and the subject F34 is farthest back.

For example, to extract only the subject F33 from the image shown in FIG. 28, the depth indicated by the dotted line B0 in FIG. 29 is used as a threshold. Portions whose depth value is smaller than the threshold (the near side) are treated as the subject to be extracted, and portions whose depth value is larger than the threshold (the far side) are treated as background. In this way the subject F33 can be separated from the other subjects F31, F32, and F34 and from the background.
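The single-threshold extraction described above can be sketched in a few lines. The function name and the use of `None` to mark background pixels are assumptions for illustration; the image and the depth map are assumed to be lists of rows of the same size.

```python
def extract_foreground(image, depth, threshold):
    """Keep pixels whose depth value is smaller than the threshold (near side);
    mark everything else as background with None."""
    return [
        [pix if d < threshold else None for pix, d in zip(img_row, depth_row)]
        for img_row, depth_row in zip(image, depth)
    ]


if __name__ == "__main__":
    image = [["a", "b"], ["c", "d"]]
    depth = [[1, 3], [2, 5]]          # smaller value = nearer to the camera
    print(extract_foreground(image, depth, threshold=2.5))
```

Because one threshold applies to the whole image, every pixel on the near side is kept regardless of where it lies in the frame.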

Patent Document 1 discloses a technique that obtains a captured image by imaging an imaging target within an imaging range, simultaneously measures the distance to the imaging target at a plurality of points within the imaging range to obtain distance information expressing a distance distribution corresponding to the captured image, separates the person from the background within the imaging range on the basis of the distance information, and extracts only the image of the person from the captured image.

Patent Document 1: JP 2001-167276 A

However, the chroma key technique and the background-difference technique restrict the shooting environment or require advance preparation. The technique of Patent Document 1 can separate and extract a subject located nearer than a certain depth from the background located beyond it by examining the distance information (depth values), but it cannot single out one of several subjects located at the same depth and separate it from the background. When a plurality of persons are at the same depth from the camera, an attempt to extract one person also extracts the others at that depth. Likewise, even a completely unrelated subject in the image is extracted together with the target person if its depth has the same value as that of the person.

The present invention has been made in view of the circumstances described above. Its object is to provide an image processing apparatus, an imaging apparatus, or a television receiver that can extract only a specific subject, without advance preparation, even when a plurality of subjects at the same depth are present in an image, as well as an image processing method for the image processing apparatus and a program for causing a computer to operate as the image processing apparatus.

An image processing apparatus according to the present invention acquires an original image having, at each of a plurality of positions on the image, distance information related to the distance in the depth direction of the image, and a replacement image to replace a part of the original image, and composites an extracted image, obtained by extracting a part of the acquired original image, with the replacement image. The apparatus comprises a threshold setting unit that sets a threshold for the distance information on the basis of each position on the original image, and an extracted image generation unit that generates the extracted image on the basis of the set threshold and the distance information.

The image processing apparatus according to the present invention is configured to replace the image portion that was not extracted by the extraction unit with the replacement image.

The image processing apparatus according to the present invention further comprises a region dividing unit that divides the original image into a plurality of regions, wherein the threshold setting unit sets a threshold for each divided region and the extracted image generation unit generates an extracted image for each divided region.

The image processing apparatus according to the present invention further comprises a region dividing unit that divides the original image into a plurality of regions, wherein the threshold setting unit sets a threshold for each divided region, the extracted image generation unit generates an extracted image for each divided region, and the apparatus acquires a plurality of replacement images and uses a different replacement image for each divided region.

An image processing method according to the present invention acquires an original image having, at each of a plurality of positions on the image, distance information related to the distance in the depth direction of the image, and a replacement image to replace a part of the original image, and composites an extracted image, obtained by extracting a part of the acquired original image, with the replacement image. The method comprises a step of setting a threshold for the distance information on the basis of each position on the original image, and a step of generating the extracted image on the basis of the set threshold and the distance information.

A program according to the present invention causes a computer to acquire an original image having, at each of a plurality of positions on the image, distance information related to the distance in the depth direction of the image, and a replacement image to replace a part of the original image, and to composite an extracted image, obtained by extracting a part of the acquired original image, with the replacement image. The program causes the computer to execute a step of setting a threshold for the distance information on the basis of each position on the original image, and a step of generating the extracted image on the basis of the set threshold and the distance information.

An imaging apparatus according to the present invention comprises an imaging unit that images a subject and a replacement image storage unit that stores a replacement image, and composites an extracted image, obtained by extracting a part of the captured image, with the replacement image that replaces a part of the captured image. The imaging apparatus further comprises a distance measuring unit that measures the distance to the subject at a plurality of positions; a distance information interpolation unit that interpolates the distance values measured at the plurality of positions by the distance measuring unit to calculate, for each of a plurality of positions on the captured image, distance information related to the distance in the depth direction of the image; a threshold setting unit that sets a threshold for the distance information on the basis of each position on the captured image; and an extracted image generation unit that generates the extracted image on the basis of the set threshold and the distance information.

An imaging apparatus according to the present invention comprises a plurality of imaging units that image a subject and a replacement image storage unit that stores a replacement image, and composites an extracted image, obtained by extracting a part of the captured image, with the replacement image that replaces a part of the captured image. The imaging apparatus further comprises a distance information calculation unit that uses the plurality of images captured by the respective imaging units to calculate, for each of a plurality of positions on the captured image, distance information related to the distance in the depth direction of the image; a threshold setting unit that sets a threshold for the distance information on the basis of each position on the captured image; and an extracted image generation unit that generates the extracted image on the basis of the set threshold and the distance information.

A television receiver according to the present invention comprises the image processing apparatus according to any one of the above, a tuner unit that receives a television broadcast, and a display unit that displays an image of the television broadcast received by the tuner unit, wherein the display unit is configured to display an image acquired, extracted, generated, or composited by the image processing apparatus.

A television receiver according to the present invention comprises the imaging apparatus according to any one of the above, a tuner unit that receives a television broadcast, and a display unit that displays an image of the television broadcast received by the tuner unit, wherein the display unit is configured to display an image captured, extracted, generated, or composited by the imaging apparatus.

In the present invention, a threshold for the distance information is set on the basis of each position on the original image, and the extracted image is generated on the basis of the set threshold and the distance information. Even when there are a plurality of subjects with the same depth value, the threshold at the image positions where the specific subject is located is set to a value different from the threshold at the image positions where the other subjects are located, so an image of only the specific subject can be extracted.
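The effect of a position-dependent threshold can be illustrated with a small sketch. The representation of the thresholds as a per-pixel map, the function name, and the sample values are assumptions for illustration; how the threshold at each position is actually chosen is not specified in this passage.

```python
def extraction_mask(depth, threshold_map):
    """Return True where the depth value is nearer than the threshold
    assigned to that same position, False elsewhere."""
    return [
        [d < t for d, t in zip(depth_row, thr_row)]
        for depth_row, thr_row in zip(depth, threshold_map)
    ]


if __name__ == "__main__":
    # Two subjects at the same depth (2), left and right, in front of a background (5).
    depth = [[2, 2, 5, 2, 2]]
    # Hypothetical threshold map: 3 around the left subject, 0 around the right one.
    threshold_map = [[3, 3, 3, 0, 0]]
    print(extraction_mask(depth, threshold_map))
```

With a single global threshold of 3 both subjects would be selected; lowering the threshold only at the positions of the right-hand subject excludes it even though its depth value is identical.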

In the present invention, the image portion that was not extracted is replaced with the replacement image. It is therefore possible to generate a new image in which the background of the original image has been replaced with the replacement image.
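The replacement of the non-extracted portion can be sketched as a per-pixel selection. The function name and the boolean mask representation (True for extracted pixels) are assumptions for illustration; the original image, the mask, and the replacement image are assumed to have the same dimensions.

```python
def composite(original, mask, replacement):
    """Keep the original pixel where the mask is True (extracted subject);
    take the replacement image's pixel everywhere else."""
    return [
        [o if m else r for o, m, r in zip(o_row, m_row, r_row)]
        for o_row, m_row, r_row in zip(original, mask, replacement)
    ]


if __name__ == "__main__":
    original = [["A", "A", "x", "B", "B"]]
    mask = [[True, True, False, False, False]]
    replacement = [["s", "s", "s", "s", "s"]]
    print(composite(original, mask, replacement))
```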

In the present invention, the original image is divided into a plurality of regions, a threshold is set for each divided region, and an extracted image is generated for each divided region. When a plurality of subjects are to be extracted, dividing the image into one region per subject makes it easy to extract them.

In the present invention, the original image is divided into a plurality of regions, a threshold is set for each divided region, an extracted image is generated for each divided region, a plurality of replacement images are acquired, and image compositing is performed using a different replacement image for each divided region. It is therefore possible to generate an image in which a different background image is composited for each subject.
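Region-wise thresholds with a different replacement image per region might be sketched as follows. Dividing the image into vertical strips given as column ranges, the tuple layout of `regions`, and the function name are all assumptions for illustration; the passage does not prescribe how regions are shaped.

```python
def composite_by_region(original, depth, regions):
    """regions: list of (x_start, x_end, threshold, replacement_image),
    where columns x_start..x_end-1 form one region (assumed vertical strips).

    Within each region, pixels at or beyond that region's threshold are
    replaced by the corresponding pixel of that region's replacement image.
    """
    out = [row[:] for row in original]  # copy so the original stays untouched
    for x0, x1, threshold, replacement in regions:
        for y, depth_row in enumerate(depth):
            for x in range(x0, x1):
                if depth_row[x] >= threshold:
                    out[y][x] = replacement[y][x]
    return out


if __name__ == "__main__":
    original = [["A", "o", "o", "o", "B", "o"]]
    depth = [[2, 5, 5, 5, 4, 5]]
    sea = [["s"] * 6]
    sky = [["k"] * 6]
    regions = [(0, 3, 3, sea), (3, 6, 4.5, sky)]
    print(composite_by_region(original, depth, regions))
```

Each subject keeps its own pixels while the background behind it comes from the replacement image assigned to its region.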

In the present invention, the apparatus comprises a threshold setting unit that sets a threshold for the distance information on the basis of each position on the captured image, and an extracted image generation unit that extracts a part of the captured image on the basis of the set threshold and the distance information and generates an extracted image. Because the threshold is set on the basis of each position on the captured image, even when there are a plurality of subjects with the same depth value, the threshold at the image positions where the specific subject is located is set to a value different from the threshold at the image positions where the other subjects are located, so an image of only the specific subject can be extracted.

In the present invention, the apparatus comprises a threshold setting unit that sets a threshold for the distance information on the basis of each position on the captured image, and an extracted image generation unit that extracts a part of the captured image on the basis of the set threshold and the distance information and outputs it as an extracted image. Because the threshold is set on the basis of each position on the captured image, even when there are a plurality of subjects with the same depth value, the threshold at the image positions where the specific subject is located is set to a value different from the threshold at the image positions where the other subjects are located, so an image of only the specific subject can be extracted.

In the present invention, the television receiver includes the image processing apparatus according to any one of the above or the imaging apparatus according to any one of the above. Therefore, even when the original image received from an external device or acquired by the imaging unit contains a plurality of subjects with the same depth value, the threshold at the image positions where the specific subject is located is set to a value different from the threshold at the image positions where the other subjects are located, so an image of only the specific subject can be extracted.

According to the present invention, even when a plurality of subjects are located at the same depth in the original image or the captured image, only the image of a specific subject can be extracted from the original image or the captured image.

FIG. 1 is a block diagram showing the hardware configuration of the image processing apparatus according to Embodiment 1 of the present invention.
FIG. 2 is an explanatory diagram showing an example of an original image.
FIG. 3 is an explanatory diagram showing the positional relationship of the subjects in the original image.
FIG. 4 is an explanatory diagram showing the depth map corresponding to the original image.
FIG. 5 is an explanatory diagram conceptually showing the data structure of the depth map.
FIG. 6 is an explanatory diagram conceptually showing the extraction range.
FIG. 7 is an explanatory diagram conceptually showing the setting of the threshold.
FIG. 8 is an explanatory diagram showing an example of an extracted image.
FIG. 9 is an explanatory diagram showing an example of a displayed composite image.
FIG. 10 is an explanatory diagram of the case where a subject is recognized by face recognition.
FIG. 11 is a flowchart showing the flow of the image processing method performed by the image processing apparatus.
FIG. 12 is a flowchart showing the flow of the threshold setting processing performed by the image processing apparatus.
FIG. 13 is an explanatory diagram conceptually showing the setting of the threshold according to Modification 1.
FIG. 14 is an explanatory diagram conceptually showing the setting of the threshold according to Modification 2.
FIG. 15 is a block diagram showing a configuration example of the image processing apparatus according to Embodiment 2 of the present invention.
FIG. 16 is an explanatory diagram showing an example of region division of an image in Embodiment 3 of the present invention.
FIG. 17 is a flowchart showing the flow of the processing performed by the image processing apparatus according to Embodiment 3 of the present invention.
FIG. 18 is a flowchart showing the flow of the processing performed on each divided region in Embodiment 3 of the present invention.
FIG. 19 is an explanatory diagram showing an example of threshold setting in Embodiment 3 of the present invention.
FIG. 20 is an explanatory diagram showing an example of an output image in Embodiment 3 of the present invention.
FIG. 21 is an explanatory diagram showing an example of the user operation screen in Modification 3.
FIG. 22 is an explanatory diagram showing an example of the user operation screen in Modification 3.
FIG. 23 is an explanatory diagram showing an example of the user operation screen in Modification 3.
FIG. 24 is a block diagram showing a configuration example of the imaging apparatus according to Embodiment 4 of the present invention.
FIG. 25 is a block diagram showing a configuration example of the imaging apparatus according to Embodiment 5 of the present invention.
FIG. 26 is a block diagram showing a configuration example of the television receiver according to Embodiment 6 of the present invention.
FIG. 27 is a block diagram showing a configuration example of the television receiver according to Embodiment 7 of the present invention.
FIG. 28 is an explanatory diagram showing an example of image extraction according to the prior art.
FIG. 29 is an explanatory diagram showing an example of image extraction according to the prior art.
FIG. 30 is an explanatory diagram conceptually showing distance information between the imaging device and the subjects.

(Embodiment 1)
Embodiment 1 of the present invention will be described below with reference to the drawings.
FIG. 1 is a block diagram showing the hardware configuration of the image processing apparatus according to Embodiment 1 of the present invention. The image processing apparatus 1 according to Embodiment 1 includes a control unit 11, an external interface unit 12, a display interface unit 13, and an operation interface unit 14.

The control unit 11 includes a CPU (Central Processing Unit) 11a that performs computation, a ROM (Read Only Memory) 11b that stores, among other things, a control program describing the control procedure for the computation performed by the CPU 11a, and a RAM (Random Access Memory) 11c that stores temporary information arising from the computation performed by the CPU 11a.
The CPU 11a reads the control program stored in advance in the ROM 11b into the RAM 11c and executes it, whereby the control unit 11 functions as the threshold setting unit and the extracted image generation unit of the present invention.

The external interface unit 12 includes a USB (Universal Serial Bus) connector, an HDMI (High-Definition Multimedia Interface) connector, or an IEEE 1394 connector, and is connected to, for example, an imaging device 2 with a distance measuring function, composed of a digital camera and a laser range finder. The control unit 11 receives an original image and the distance information corresponding to that image from the imaging device 2 via the external interface unit 12, and stores the received original image and distance information in the RAM 11c.

The display interface unit 13 is connected to a display device 3 such as a television receiver, and causes the display device 3 to display a user operation screen, the original image to be processed, the image after image processing, and the like.
The operation interface unit 14 is connected to an operation device 4 such as a keyboard or a mouse, and receives operation input from the user via the operation device 4.

Next, the original image with distance information handled in the embodiments of the present invention will be described. In the following description, the original image is assumed to be a bitmap image and the distance information a depth map.
FIG. 2 is an explanatory diagram illustrating an example of an original image. As shown in FIG. 2, the original image contains four subjects, F1, F2, F3, and F4. Subjects F1 to F3 are people, and subject F4 is a building. FIG. 3 is an explanatory diagram showing the positional relationship of the subjects in the original image. As shown in FIG. 3, the subjects are at different distances from the camera: subjects F2 and F3 are nearest, subject F1 is behind them, and subject F4 is farther back still. Subjects F2 and F3 stand side by side at the same depth.

FIG. 4 is an explanatory diagram showing the depth map corresponding to the original image. Each hatching pattern corresponds to a depth value, and together they represent the front-to-back positional relationship of the subjects described above.

FIG. 5 is an explanatory diagram conceptually showing the data structure of the depth map. Part A of FIG. 5 shows the depth map value, that is, the depth value, of each part of the original image shown in FIG. 4; the values are D1, D2, D3, and D4. D1 corresponds to whatever lies behind subject F4, D2 to subject F4, D3 to subject F1, and D4 to subjects F2 and F3.
The upper row of part B of FIG. 5 is a portion extracted from part A. The lower row of part B shows the data structure of the depth map for the topmost line of that portion, giving the depth value of each pixel in order from left to right. The leftmost value belongs to the leftmost pixel and is therefore D3; each subsequent value belongs to the next pixel to the right. Moving right, the value changes from D3 to D1 and then from D1 to D2, and the rightmost pixel takes the value D2.

The operation of the image processing apparatus 1 according to Embodiment 1 will now be described.
The control unit 11 (threshold setting unit) receives the original image and its depth map from the imaging device 2 via the external interface unit 12, and stores them in the RAM 11c. The control unit 11 then sets a threshold for extracting the target subject. In the present invention the threshold is not a single value; it is set so that it varies with the horizontal position in the original image. The following description uses a three-dimensional coordinate system. The X and Y coordinates are the coordinate system of the original image: the origin is the upper left corner of the image, the X axis points right from the origin, and the Y axis points down from the origin. The remaining axis is the depth axis; the larger its value, the farther back the position.

FIG. 6 is an explanatory diagram conceptually showing the extraction range. In the following description, only subject F2, a person, is to be extracted. As shown in FIG. 6, subject F2 occupies the range of X coordinates from X3 to X4 in the original image (X3 < X4). The method for obtaining X3 and X4 is described later.
FIG. 7 is an explanatory diagram conceptually showing the threshold setting. Like FIG. 3, it shows the positional relationship of the subjects in the original image, with the information needed for the explanation added.

The control unit 11 (threshold setting unit) acquires from the depth map the depth values of all pixels whose X coordinate lies between X3 and X4. It then obtains the average d0 of the acquired values, and obtains d1 and d2 by the following equations (1) and (2).
d1 = d0 - c1 (c1 > 0) ... (1)
d2 = d0 + c2 (c2 > 0) ... (2)
Further, X1 and X2 are obtained from X3 and X4 by the following equations (3) and (4).
X1 = X3 - c3 (c3 ≧ 0) ... (3)
X2 = X4 + c4 (c4 ≧ 0) ... (4)
It follows that d1 < d0 < d2 and that X1 ≦ X3 < X4 ≦ X2.
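
The calculation in equations (1) to (4) can be sketched as follows. This is only a minimal illustration in Python, assuming the depth map is held as a list of rows of numeric depth values; the function name threshold_parameters and its argument names are hypothetical and do not appear in the specification.

```python
def threshold_parameters(depth_map, x3, x4, c1, c2, c3, c4):
    """Return (d0, d1, d2, x1, x2) following equations (1) to (4).

    Assumptions for this sketch (not from the specification):
    depth_map is a list of rows, each a list of depth values where a
    larger value means farther away; x3 and x4 are inclusive column
    indices with x3 < x4.
    """
    if not (c1 > 0 and c2 > 0 and c3 >= 0 and c4 >= 0):
        raise ValueError("need c1, c2 > 0 and c3, c4 >= 0")
    # Depth values of every pixel whose X coordinate lies in [x3, x4],
    # regardless of its Y coordinate.
    values = [row[x] for row in depth_map for x in range(x3, x4 + 1)]
    d0 = sum(values) / len(values)  # average depth over those columns
    d1 = d0 - c1                    # equation (1)
    d2 = d0 + c2                    # equation (2)
    x1 = x3 - c3                    # equation (3)
    x2 = x4 + c4                    # equation (4)
    return d0, d1, d2, x1, x2
```

With positive c1 and c2 and non-negative c3 and c4, the returned values satisfy d1 < d0 < d2 and x1 <= x3 < x4 <= x2, matching the relationships stated above.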

Using the values obtained above, the threshold is set according to the following equations (5) and (6).
Threshold = d1 (when 0 ≦ X < X1 or X ≧ X2) ... (5)
Threshold = d2 (when X1 ≦ X < X2) ... (6)
The threshold depends only on the X coordinate and is set as above regardless of the Y coordinate.
The threshold set in this way is drawn as B1 in FIG. 7. The control unit 11 (threshold setting unit) stores this threshold setting information in the RAM 11c.
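
Equations (5) and (6) describe a step-shaped threshold along the X axis. A minimal Python sketch is given below; the function names threshold_at and threshold_row are hypothetical, and storing one threshold per column is an assumption of this illustration rather than something the specification prescribes.

```python
def threshold_at(x, x1, x2, d1, d2):
    """Threshold for column x following equations (5) and (6)."""
    if x1 <= x < x2:
        return d2  # equation (6): inside the range around the subject
    return d1      # equation (5): 0 <= x < x1 or x >= x2


def threshold_row(width, x1, x2, d1, d2):
    """One threshold per column; the same values apply to every Y."""
    return [threshold_at(x, x1, x2, d1, d2) for x in range(width)]
```

Because the threshold does not depend on Y, a single row of values is enough to describe the whole image in this sketch.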

The control unit 11 (extracted image generation unit) reads the original image, the threshold setting information, and the depth map from the RAM 11c.
The control unit 11 generates a binarized image from the threshold setting information and the depth map. For each pixel, it compares the depth value with the threshold set for that pixel's position, and sets 1 if the depth value is less than or equal to the threshold and 0 if the depth value is greater than the threshold. Taking the logical product of this binarized image and the original image extracts an image of only the target subject (subject F2 in this embodiment). That is, the original image and the binarized image are compared pixel by pixel: where the binarized pixel value is 1, the pixel value of the original image is adopted, and where it is 0, the pixel is given a value corresponding to black or white. As a result, the part showing subject F2 is output as is, and everything else is output as a solid fill (black or white). The control unit 11 (extracted image generation unit) stores the extracted image in the RAM 11c. FIG. 8 is an explanatory diagram illustrating an example of the extracted image. As described above, only subject F2 is extracted.
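
The binarization and the logical product with the original image might look like the following Python sketch. It assumes images and depth maps are lists of rows and that thresholds holds one threshold per column; the names binarize and extract, and the default white fill value, are hypothetical choices made for this illustration.

```python
def binarize(depth_map, thresholds):
    """Return a mask: 1 where depth <= threshold of that column, else 0.

    thresholds is assumed to hold one value per X coordinate.
    """
    return [[1 if depth <= thresholds[x] else 0
             for x, depth in enumerate(row)]
            for row in depth_map]


def extract(original, mask, fill=(255, 255, 255)):
    """Keep original pixels where the mask is 1; use a solid fill elsewhere."""
    return [[pixel if m == 1 else fill
             for pixel, m in zip(orig_row, mask_row)]
            for orig_row, mask_row in zip(original, mask)]
```

Pixels are treated as opaque values here, so the same code works whether a pixel is an RGB tuple or a single gray level.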

The control unit 11 reads from the RAM 11c the extracted image containing only the target subject. It also reads the replacement image to be combined with the extracted image from a storage device (not shown). The replacement image is stored in the storage device in advance and serves as the background, for example a landscape photograph or a photograph of a World Heritage site or tourist attraction.

The control unit 11 combines the extracted image and the replacement image; that is, it fits the replacement image into the solid-filled part of the extracted image. The control unit 11 displays the combined image on the display device 3 via the display interface unit 13. FIG. 9 is an explanatory diagram illustrating an example of the displayed composite image. As shown in FIG. 9, the result is an image in which the extracted subject F2 appears to be standing in a scene different from that of the original image (FIG. 2).

Next, a method that uses a known face recognition technique to obtain X3 and X4, the coordinate values indicating the range in which the subject to be extracted is located in the original image, will be described. The facial feature values of subject F2, the subject to be extracted, are stored in advance in a storage unit (not shown) of the image processing apparatus 1. Using these feature values, the control unit 11 identifies the face of subject F2 among the subjects F1 to F4 in the original image and obtains the position of that face in the original image.
FIG. 10 is an explanatory diagram of recognizing a subject by face recognition. The control unit 11 recognizes that the face of subject F2 lies within a rectangle C whose vertices are the four points (X5, Y5), (X6, Y5), (X6, Y6), and (X5, Y6) (X5 < X6, Y5 < Y6). In this case, X3 and X4 are obtained by the following equations (7) and (8).
X3 = X5 - m1 (m1 > 0) ... (7)
X4 = X6 + m2 (m2 > 0) ... (8)
The control unit 11 stores the obtained values of X3 and X4 in the RAM 11c. The stored values are used in the threshold setting process described above.
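
Equations (7) and (8) widen the detected face rectangle to cover the body. A minimal Python sketch follows; the function name subject_x_range is hypothetical, and clamping the result to the image width is an assumption added here for safety that the specification does not mention.

```python
def subject_x_range(x5, x6, m1, m2, width):
    """Return (x3, x4) from the face rectangle's left and right edges.

    x5 < x6 are the X coordinates of the face rectangle; m1, m2 > 0 are
    margins for the body width. Clamping to [0, width - 1] is an
    assumption of this sketch, not part of equations (7) and (8).
    """
    if not (m1 > 0 and m2 > 0):
        raise ValueError("need m1, m2 > 0")
    x3 = max(0, x5 - m1)          # equation (7), clamped at the left edge
    x4 = min(width - 1, x6 + m2)  # equation (8), clamped at the right edge
    return x3, x4
```

The face detector that supplies x5 and x6 is outside the scope of this sketch; any detector that returns a bounding rectangle would do.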

Next, the image processing method performed by the image processing apparatus 1 will be described. FIG. 11 is a flowchart showing the flow of this method. The control unit 11 acquires the original image from the imaging device 2 via the external interface unit 12 and stores it in the RAM 11c (S11). The control unit 11 acquires the depth map from the imaging device 2 via the external interface unit 12 and stores it in the RAM 11c (S12).

The control unit 11 sets the threshold (S13). FIG. 12 is a flowchart showing the flow of the threshold setting process performed by the image processing apparatus 1. Using face recognition, the control unit 11 obtains X3 and X4, the values indicating the range in which subject F2, the subject to be extracted, is located in the original image (S21). Using X3 and X4, the control unit 11 extracts from the depth map stored in the RAM 11c the depth values of the pixels whose X coordinate is at least X3 and at most X4 (S22). The control unit 11 obtains the average d0 of the extracted depth values (S23). From d0, the control unit 11 obtains the upper limit (d2) and lower limit (d1) of the threshold (S24).

From the X3 and X4 obtained in S21, the control unit 11 obtains X1 and X2, the X coordinate values at which the threshold changes (S25). The control unit 11 stores X1 and X2, together with the d1 and d2 obtained in S24, in the RAM 11c as the threshold setting information (S26).

Returning to FIG. 11, the control unit 11 generates, from the depth map and the threshold setting information stored in the RAM 11c, a binarized image for extracting the image of subject F2 from the original image, and takes the logical product of the binarized image and the original image stored in the RAM 11c to extract the image of subject F2 (S14). The control unit 11 stores the extracted image in the RAM 11c (S15).

The control unit 11 acquires the replacement image from a storage device (not shown) (S16). The control unit 11 combines the extracted image and the replacement image, displays the combined image on the display device 3 via the display interface unit 13 (S17), and ends the process.

The values c1 and c2 used when setting the threshold should be chosen with the thickness of the human body in the depth direction of the original image in mind. They may be set manually in advance, or the control unit 11 may set appropriate values according to the original image: either according to the values of X3 and X4, or by using known image recognition to recognize how the person who is the subject is standing (for example, facing the camera or at an angle) and the person's posture (for example, upright or bent) and choosing values accordingly.

Similarly, c3 and c4 should each be chosen with processing error in mind. They may be set manually in advance, or the control unit 11 may set appropriate values according to the values of X3 and X4.

Although the replacement image is described as being stored in the storage device in advance, it may instead be read from a recording medium such as a CD-ROM or DVD via an external storage device, or acquired over a communication network such as the Internet.

The values m1 and m2 used when recognizing the position of the subject take the width of the human body into account. They may be set manually in advance, or the control unit 11 may determine appropriate values according to the original image: either according to the values of X5, X6, Y5, and Y6, or by recognizing the orientation of the face during face recognition and choosing values according to that orientation.

Face recognition is used here to recognize the position of the target subject, but the invention is not limited to it. Any other technique may be used as long as it can recognize where in the original image subject F2, the subject to be extracted, is located.

Because subject F2, the subject to be extracted, stands side by side with the other subjects F1 and F3 in the original image, the threshold is made to depend only on the X coordinate. It may instead depend only on the Y coordinate, or vary with both the X and Y coordinates. A threshold that depends only on the Y coordinate could be used, for example, to extract a particular person from an image of one person riding on another's shoulders or of a totem pole formation in group gymnastics. A threshold that varies with both coordinates could be used to extract only a particular part of a particular person, such as the upper body.

Although F2 is the subject extracted here, the invention is not limited to this, and another subject (F1 or F3) may be extracted. In that case, either a rule for deciding which subject to extract must be defined in advance (for example, always extract the person in the middle when subjects stand side by side), or a process is needed by which the user designates the subject to extract from among the subjects.

Although a binarized image is used above to extract the image of subject F2, the extraction can also be performed without one, using only the original image, the depth map, and the threshold setting information. The following processing is performed for each pixel. A pixel is taken from the original image, and the depth value corresponding to it is taken from the depth map. That depth value is compared with the threshold. If the depth value is less than or equal to the threshold, the pixel taken from the original image becomes the pixel of the output image. If the depth value is greater than the threshold, a pixel with the value representing white becomes the pixel of the output image. Applying this to every pixel outputs only the pixels that make up subject F2 and turns every other position white, so that the image of subject F2 is extracted.

Extraction of the image of subject F2 and composition with the replacement image can also be performed together by similar processing, using the original image, the depth map, the threshold setting information, and the replacement image. The following processing is performed for each pixel. A pixel is taken from the original image, and the depth value corresponding to it is taken from the depth map. That depth value is compared with the threshold. If the depth value is less than or equal to the threshold, the pixel taken from the original image becomes the pixel of the output image. If the depth value is greater than the threshold, the pixel of the replacement image at the corresponding position becomes the pixel of the output image. Applying this to every pixel outputs the pixels that make up subject F2 and fills every other position with pixels from the replacement image, so that extraction of subject F2 and composition with the replacement image are achieved together.
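
The combined per-pixel extraction and composition can be sketched in Python as below. The sketch assumes that the original image, the depth map, and the replacement image are lists of rows with identical dimensions, and that thresholds holds one value per column; the function name extract_and_composite is hypothetical.

```python
def extract_and_composite(original, depth_map, thresholds, replacement):
    """Build the output image pixel by pixel, without a binarized image.

    Assumes all three images share the same width and height and that
    thresholds[x] is the threshold for column x (larger depth = farther).
    """
    output = []
    for orig_row, depth_row, repl_row in zip(original, depth_map, replacement):
        out_row = []
        for x, (pixel, depth, repl_pixel) in enumerate(
                zip(orig_row, depth_row, repl_row)):
            if depth <= thresholds[x]:
                out_row.append(pixel)       # subject: keep the original pixel
            else:
                out_row.append(repl_pixel)  # elsewhere: use the replacement
        output.append(out_row)
    return output
```

Replacing repl_pixel with a constant white value gives the extraction-only variant that uses no replacement image.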

Embodiment 1 has been described on the assumption that the image is a still image, but the image may also be a moving image. A still image consists of a single frame and a moving image consists of multiple frames in time-series order, so for a moving image the same processing as for a still image is applied to each successive frame. The image may be compressed by a predetermined encoding method, for example JPEG (Joint Photographic Experts Group) or MPEG-2 (Moving Picture Experts Group phase 2), or it may be uncompressed. In a configuration that handles encoded images, the image is first decoded according to the encoding method into, for example, an RGB or YUV image, and the decoded image is input to the image processing apparatus 1. The same applies to the embodiments and modifications described below: the image is not limited to a still image and may be a moving image.

(Modification 1)
In Embodiment 1 described above, the threshold has the shape of a rectangular wave, as shown in FIG. 7, but it is not limited to this shape. FIG. 13 is an explanatory diagram conceptually showing the threshold setting according to Modification 1, in which the threshold is set so as to form a curve. Since Modification 1 differs from Embodiment 1 in how the threshold is set, the following description focuses on that difference.

In Modification 1, to make the threshold a curve, several points through which the curve should pass are obtained, and an interpolation curve passing through all of them is then obtained.
As shown in FIG. 13, the curve passes through five points, P1, P2, P3, P4, and P5, whose coordinates are P1 (X21, d1), P2 (X25, d0), P3 (X20, d2), P4 (X26, d0), and P5 (X22, d1).
The X coordinate values X3 and X4 indicating the range in which subject F2, the subject to be extracted, is located, and the lower limit d1 and upper limit d2 of the depth value, are obtained in the same way as described above, so the explanation is omitted.

Next, X20, X21, and X22 are obtained by the following equations (9), (10), and (11).
X20 = (X3 + X4) / 2 ... (9)
X21 = X3 - c23 (c23 > 0) ... (10)
X22 = X4 + c24 (c24 > 0) ... (11)
The relationship among X3, X4, X21, and X22 is X21 < X3 < X4 < X22.
X25 is a value between X21 and X3 (X21 < X25 < X3), and X26 is a value between X4 and X22 (X4 < X26 < X22).

Once the coordinates of P1 to P5 have been obtained, an interpolation curve passing through all of the points is obtained and used as the threshold setting. The interpolation curve is obtained by a well-known interpolation method such as Lagrange interpolation or spline interpolation. The threshold depends only on the X coordinate; pixels with the same X coordinate have the same threshold regardless of their Y coordinate. B2 in FIG. 13 is an example of such a threshold curve. The image of subject F2 can be extracted according to the threshold curve thus obtained. The other processing is the same as in Embodiment 1, so its description is omitted.
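
As one possible illustration of the interpolation step, the following Python sketch builds a Lagrange interpolating polynomial through a list of (X, depth) points such as P1 to P5. The function name lagrange_threshold is hypothetical, and the sketch makes no claim about how the threshold should behave outside the range from X21 to X22, which the text above does not specify.

```python
def lagrange_threshold(points):
    """Return a function f(x) that passes through every (x, d) in points.

    points is a list of (x, d) pairs with distinct x values, for example
    the five points P1 to P5. This is plain Lagrange interpolation.
    """
    xs = [x for x, _ in points]
    if len(set(xs)) != len(xs):
        raise ValueError("x coordinates must be distinct")

    def f(x):
        total = 0.0
        for i, (xi, di) in enumerate(points):
            term = di
            for j, (xj, _) in enumerate(points):
                if j != i:
                    # Basis polynomial: 1 at xi, 0 at every other xj.
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return f
```

A spline would be an equally valid choice under the text above; Lagrange interpolation is used here only because it fits in a few lines without external libraries.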

In Modification 1 as well, as in Embodiment 1 described above, an image can be obtained in which the extracted subject F2 appears to be standing in a scene different from that of the original image.

Like c3 and c4 in Embodiment 1, the values of c23 and c24 may be set manually in advance, or the control unit 11 may determine them according to the original image.
Also as in Embodiment 1, the threshold is described as depending only on the X coordinate, but it may depend only on the Y coordinate, or vary with both the X and Y coordinates.

(Modification 2)
In Embodiment 1 described above, the distance information input to the threshold setting unit is, like a depth map, information whose value is proportional to the actual distance. The invention is not limited to this, however, and a disparity map, whose value is inversely proportional to the distance, may be used as the distance information. A disparity map holds a parallax value for every pixel of the image. The parallax value indicates how far corresponding pixels are displaced between the left image and the right image of a stereo pair. It is larger the closer the subject is to the imaging device 2 and smaller the farther the subject is from the imaging device 2.
Accordingly, when a disparity map is used as the distance information in place of the depth map in Embodiment 1, only the magnitude relationship of the values differs and everything else is the same. The following description therefore focuses on matters related to that difference.

FIG. 14 is an explanatory diagram conceptually showing the threshold setting according to Modification 2. The control unit 11 (threshold setting unit) acquires from the disparity map the parallax values of all pixels whose X coordinate lies between X3 and X4 (X3 < X4). It then obtains the average d10 of the acquired values, and obtains d11 and d12 by the following equations (12) and (13).
d11 = d10 + c11 (c11 > 0) ... (12)
d12 = d10 - c12 (c12 > 0) ... (13)
The relationship among d10, d11, and d12 is d11 > d10 > d12.
X1 and X2 are obtained from X3 and X4, respectively, in the same way as in Embodiment 1, so the explanation is omitted (X1 < X3 < X4 < X2).

Using the values obtained above, the threshold is set according to the following equations (14) and (15).
Threshold = d11 (when 0 ≦ X < X1 or X ≧ X2) ... (14)
Threshold = d12 (when X1 ≦ X < X2) ... (15)
The threshold depends only on the X coordinate and is set as above regardless of the Y coordinate. The threshold set in this way is drawn as B11 in FIG. 14.

The control unit 11 stores this threshold setting information in the RAM 11c. The subsequent processing in the control unit 11 is the same as in Embodiment 1, except for the generation of the binarized image, because the magnitude relationship of the values in a disparity map is the reverse of that in a depth map. That is, for each pixel the parallax value is compared with the threshold set for that pixel's position, and 1 is set if the parallax value is greater than or equal to the threshold and 0 if it is less than the threshold.
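
The reversed comparison for a disparity map can be sketched in Python as follows. The sketch assumes the disparity map is a list of rows of parallax values (larger = nearer) and applies equations (14) and (15) directly; the function name binarize_disparity is hypothetical.

```python
def binarize_disparity(disparity_map, x1, x2, d11, d12):
    """Return a mask: 1 where parallax >= threshold of that column, else 0.

    The threshold is d12 for x1 <= x < x2 (equation (15)) and d11
    elsewhere (equation (14)), with d11 > d12.
    """
    mask = []
    for row in disparity_map:
        mask_row = []
        for x, value in enumerate(row):
            threshold = d12 if x1 <= x < x2 else d11
            # Comparison is reversed relative to the depth-map case.
            mask_row.append(1 if value >= threshold else 0)
        mask.append(mask_row)
    return mask
```

The resulting mask can then be combined with the original image exactly as in the depth-map case.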

Modification 2 uses a disparity map as the distance information. Because the disparity map can be used as distance information without conversion, the threshold setting process takes about the same time as when a depth map is used.

Like c1 and c2, c11 and c12 should be chosen with the thickness of the human body in mind, and like c3 and c4, c13 and c14 should be chosen with processing error in mind. Any of the values c11 to c14 may be determined by a person in advance, or the control unit 11 may set appropriate values according to the original image.

It will be obvious to a person skilled in the art that Modification 1 can likewise use a disparity map in place of the depth map as the distance information, by applying changes similar to those described for Modification 2.

(Embodiment 2)
In Embodiment 1, the control program is stored in the ROM 11b in advance. However, the invention is not limited to this, and a computer program serving as the control program may be read in from outside.
FIG. 15 is a block diagram showing a configuration example of an image processing apparatus according to Embodiment 2 of the present invention. The image processing apparatus 5 according to Embodiment 2 is realized by executing the computer program 6a according to the present invention.

The image processing apparatus 5 comprises the image processing apparatus 1 of Embodiment 1 with an external storage device 15 and a communication unit 16 added. Components identical to those of Embodiment 1 are denoted by the same reference numerals, and their description is omitted.

The external storage device 15 reads the computer program 6a from a recording medium 6, for example a CD-ROM, on which the computer program 6a according to the embodiment of the invention is recorded. The communication unit 16 acquires the computer program 6a through a communication network N such as the Internet. The computer program 6a may be acquired from the recording medium 6 through the external storage device 15, or from the communication network N through the communication unit 16. Alternatively, each program module may be acquired from either the recording medium 6 or the communication network N, so that both are used to acquire the computer program 6a.

The processing procedure of the control unit 11 is as shown in FIG. 11; the control unit 11 executes steps S11 to S17. Because this procedure is the same as the processing in the image processing apparatus 1 according to Embodiment 1, a detailed description is omitted.

The image processing apparatus 5 and the computer program 6a according to Embodiment 2 function as the image processing apparatus according to the present embodiment and can carry out the image processing method according to the present embodiment, providing the same effects as Embodiment 1 of the present invention.

(Embodiment 3)
In Embodiment 1, an image is extracted with the same threshold setting over the entire original image. Embodiment 3 differs from Embodiments 1 and 2 in that the original image is divided into a plurality of regions and an image is extracted for each region. The hardware configuration is the same as in Embodiment 1. The following mainly describes this difference.

The control unit 11 (region dividing unit) divides the original image into a plurality of regions and stores coordinate information indicating each region in the RAM 11c.
The control unit 11 (threshold setting unit, extracted image generation unit) divides the original image based on the coordinate information indicating each region stored in the RAM 11c, and processes each resulting region image. For each region, the control unit 11 sets a threshold for extracting a specific subject and stores the threshold information in the RAM 11c. The control unit 11 extracts the subject for each region based on the coordinate information and the threshold information of each region stored in the RAM 11c, and stores the extracted image in the RAM 11c. Based on the coordinate information indicating each region and the extracted image of each region stored in the RAM 11c, the control unit 11 combines, for each region, the replacement image with the extracted image. After all regions have been combined, the control unit 11 joins the composite images of all regions into one image and outputs an image of the same size as the original image to the display device 3 via the display interface unit 13.

FIG. 16 is an explanatory diagram showing an example of image region division in Embodiment 3 of the present invention. An image containing subjects F11, F12, F13, and F14 is divided into two regions, upper and lower (regions A1 and A2).

FIG. 17 is a flowchart showing the flow of processing performed by the image processing apparatus 1 according to Embodiment 3 of the present invention. The control unit 11 acquires the original image from the imaging device 2 and stores it in the RAM 11c (S31). The control unit 11 acquires the distance information from the imaging device 2 and stores it in the RAM 11c (S32). The control unit 11 determines whether the user has specified region division of the image (S33). If region division has been specified (YES in S33), the control unit 11 stores the coordinates of each region in the RAM 11c and performs the per-region processing (S34), which is described later. Next, the control unit 11 acquires the coordinate information and the composite image of each region stored in the RAM 11c, joins all the composite images into one image of the same size as the original image (S35), outputs it to the display device 3 via the display interface unit 13, and ends the processing. If region division has not been performed (NO in S33), the control unit 11 performs S36 to S40 and ends the processing. S36 to S40 are the same as S13 to S17 of Embodiment 1 (FIG. 11), respectively, so their description is omitted.

FIG. 18 is a flowchart showing the flow of processing performed on each divided region in Embodiment 3. The control unit 11 acquires the coordinate information of each region from the RAM 11c (S41). From the acquired coordinate information, the control unit 11 extracts the coordinate information of an unprocessed region and uses it as the coordinate information of the processing target region (S42). Based on this coordinate information, the control unit 11 reads the distance information corresponding to the processing target region from the RAM 11c (S43). The control unit 11 sets the threshold based on the image of the processing target region portion of the original image and the read distance information (S44). The threshold setting is the same as the processing shown in FIG. 12; that is, the threshold setting processing is applied to the processing target region portion of the original image. Based on the coordinate information and the threshold setting of the processing target region, the control unit 11 extracts the subject from the processing target region portion of the original image and stores it in the RAM 11c (S45). The control unit 11 acquires the replacement image corresponding to the processing target region from an external device or a storage device (not shown) (S46). The control unit 11 combines the extracted image read from the RAM 11c with the acquired replacement image (S47). The control unit 11 stores the coordinate information of the processing target region and the composite image in the RAM 11c in association with each other (S48). The control unit 11 determines whether processing has been completed for all regions (S49). If processing has not been completed for all regions (NO in S49), the control unit 11 returns to S42. If processing has been completed for all regions (YES in S49), the control unit 11 ends the per-region processing and proceeds to S35 in FIG. 17.
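The per-region loop of FIG. 18 followed by the joining step S35 of FIG. 17 might be sketched as below. This is a simplified illustration under several assumptions that the text does not fix: regions are axis-aligned boxes that together tile the image, each region has a single scalar threshold, the distance information is a disparity map in which larger values are nearer, and images are 2-D lists. The name `process_regions` and its parameters are hypothetical.

```python
def process_regions(original, disparity, regions, thresholds, replacements):
    """Extract and composite each region, then return the joined image.

    regions      -- list of (top, left, bottom, right) boxes, end-exclusive
    thresholds   -- one disparity threshold per region (assumed scalar)
    replacements -- one replacement image per region, sized like its box
    """
    height, width = len(original), len(original[0])
    output = [[None] * width for _ in range(height)]
    # One pass per region, corresponding to the S42-S49 loop.
    for (top, left, bottom, right), thr, repl in zip(regions, thresholds, replacements):
        for y in range(top, bottom):
            for x in range(left, right):
                if disparity[y][x] >= thr:
                    # Pixel belongs to the extracted subject (S45).
                    output[y][x] = original[y][x]
                else:
                    # Pixel is filled from the region's replacement image (S47).
                    output[y][x] = repl[y - top][x - left]
    # All regions written into one image of the original size (S35).
    return output


# Hypothetical 2x2 image split into an upper and a lower region.
original = [["a", "b"], ["c", "d"]]
disparity = [[120, 10], [60, 40]]
regions = [(0, 0, 1, 2), (1, 0, 2, 2)]
joined = process_regions(original, disparity, regions,
                         thresholds=[100, 50],
                         replacements=[[["X", "X"]], [["Y", "Y"]]])
```

Writing each region's result directly into a full-size output buffer stands in for storing per-region composite images with their coordinates (S48) and joining them afterwards; the two are equivalent when the regions do not overlap.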

FIG. 19 is an explanatory diagram showing an example of threshold setting in Embodiment 3 of the present invention. FIG. 19 shows the front-to-back relationship of the subjects appearing in FIG. 16 together with the set thresholds. By using the threshold curve B31 for region A1 of the original image shown in FIG. 16 and the threshold curve B32 for region A2, the subject F13 can be extracted in region A1 and the subject F12 in region A2. As one method for obtaining the threshold curves B31 and B32, a rule is set in advance for the case where region division is performed: in each region, extract a person who occupies a large area of the image and is nearer the front, together with any person even nearer than that person. People are recognized in the original image using a known technique such as pattern matching, and a threshold curve is generated for each region (A1, A2) so that the images of the people matching this rule can be extracted, which yields the threshold curves B31 and B32.
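One possible reading of the selection rule above is sketched below for a single region. It is only an interpretation: the text does not say how "large area" and "nearer" are weighed against each other, and it generates a threshold curve rather than a single value. Here the person with the largest area is taken as the main subject (ties broken in favor of the nearer person), the threshold is a single distance placed a hypothetical `margin` behind that person, and distance is assumed to grow with depth. The tuple layout and the name `select_targets` are assumptions for illustration.

```python
def select_targets(people, margin):
    """Pick a distance threshold for one region from detected people.

    people -- list of (area_px, distance) tuples from person recognition
    margin -- hypothetical allowance placed behind the main subject
    Returns (threshold, selected_people).
    """
    if not people:
        return None, []
    # Main subject: largest area; on equal area, prefer the nearer person.
    main = max(people, key=lambda p: (p[0], -p[1]))
    threshold = main[1] + margin
    # The main subject and everyone in front of it fall within the threshold.
    selected = [p for p in people if p[1] <= threshold]
    return threshold, selected


# Hypothetical detections: (area in pixels, distance).
threshold, selected = select_targets([(500, 2.0), (200, 1.0), (300, 4.0)], margin=0.5)
```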

FIG. 20 is an explanatory diagram showing an example of an output image in Embodiment 3. In FIG. 20, the subject F12 and the subject F13 are extracted in the two regions A1 and A2 obtained by dividing the original image, a replacement image is combined in each region, and the composite images of regions A1 and A2 are joined into an image of the same size as the original image. In Embodiment 3, a plurality of regions can be set in the original image to extract subjects, and an arbitrary image can be substituted at an arbitrary position, such as in the background, between subjects, or in front of a subject. In other words, using the distance information creates a pseudo three-dimensional space on a two-dimensional plane and enables new forms of image expression.

In the above description the original image is divided into upper and lower halves, but the division is not limited to this: the image may be divided left and right, and three or more regions may be set. As for specifying the regions, the number and arrangement of the regions may be stored in advance in the image processing apparatus 1 and the image divided accordingly, or the image processing apparatus 1 may set the regions based on information in the original image or on the distance information. Alternatively, the user may specify the regions with the operation device 4, such as a mouse.

(Modification 3)
In Embodiment 3 described above, the image processing apparatus 1 divides the regions and sets the threshold for each region based on predetermined rules, but the user may perform the same division and setting. In this modification, the user sets the regions to be divided and the threshold for each region using the operation device 4.
FIGS. 21 to 23 are explanatory diagrams showing examples of the user operation screen 20 in Modification 3. The user operation screen 20 is displayed on the display device 3 via the display interface unit 13 of the image processing apparatus 1. As shown in FIG. 21, the user operation screen 20 consists of a processed image display section 21, a pointer 22, a parallax value display field 23, threshold input fields 24a and 24b, an original image name display field 25a, a disparity map name display field 25b, a replacement image a display field 25c, a replacement image b display field 25d, and selection buttons 26a, 26b, 26c, and 26d.

The processed image display section 21 displays the original image to be processed. In this modification, an original image containing subjects F21, F22, and F23 and the pointer 22 are displayed in the processed image display section 21. The original image name display field 25a shows the name of the image to be processed. When the selection button 26a is selected with the operation device 4, a selection window (not shown) opens, and selecting the desired image from the displayed image list causes its name to appear in the original image name display field 25a. If the name of the image is known, it can also be entered directly into the original image name display field 25a. The disparity map name display field 25b shows the name of the disparity map. As with the original image name, the disparity map used for processing is specified either by selecting the selection button 26b with the operation device 4 and choosing the disparity map corresponding to the original image in a selection window (not shown), or by entering the disparity map name directly into the disparity map name display field 25b. The replacement image a display field 25c and the replacement image b display field 25d each show a replacement image name and correspond to the selection buttons 26c and 26d, respectively. The input method is the same as for the original image and the disparity map.

The parallax value display field 23 shows the parallax value of the pixel specified by the user. When an arbitrary position in the image is pointed to with the pointer 22 and selected using the operation device 4, such as a mouse, the parallax value at the position indicated by the pointer 22 is shown in the parallax value display field 23. The threshold input fields 24a and 24b are for entering the parallax value thresholds used when extracting the subject image.

Next, the user's operations are described. First, the user specifies a region. In the example shown in FIG. 21, region A3 is set. It is also possible not to set a region, which is equivalent to selecting the entire image as a single region. The region may be specified using a rectangle or a rounded shape, or drawn freehand using the operation device 4, such as a mouse.

When the subject F22 to be extracted is selected with the pointer 22, the parallax value at the selected position, 127, is shown in the parallax value display field 23. Using this parallax value as a reference, the user decides the threshold within the set region A3 and enters it in the threshold input field 24a. For example, the threshold is set to a value smaller than the parallax value of the subject F22, that is, a parallax value corresponding to a position behind the subject F22. Here, because the parallax value of the subject F22 is 127, the threshold is set to 100 (FIG. 23). The user also decides the threshold for the region outside the set region A3 and enters it in the threshold input field 24b. For example, the threshold is set to a value smaller than the parallax value of the subject F21, that is, a parallax value corresponding to a position behind the subject F21. Here, as shown in FIG. 22, the parallax value of the subject F21 is 64, so the threshold is set to 50 (FIG. 23). If no region has been set, entering a threshold in the threshold input field 24a sets the threshold with the entire image as a single region.

Setting the thresholds in this way allows the subjects F21 and F22 to be extracted, as shown in FIG. 22. The subject F22 is extracted within the set region A3, and part of the body of the subject F21 is extracted in the region outside it; an image with the background subject F23 removed is displayed in the processed image display section 21.

Next, the image specified in the replacement image a display field 25c is inserted at the position corresponding to the distance of the threshold entered in the threshold input field 24a for the set region A3. That is, within the set region A3, the image specified in the replacement image a display field 25c becomes the background image of the subject F22. Likewise, in the region outside the set region A3, the image specified in the replacement image b display field 25d is inserted at the position corresponding to the distance of the threshold entered in the threshold input field 24b. That is, outside the set region A3, the image specified in the replacement image b display field 25d becomes the background image of the subject F21. As shown in FIG. 23, the processed image display section 21 displays the composite image with the background images replaced. The created image may be saved to a storage device (not shown) and displayed on the display device 3.
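The user-driven case can be illustrated with the values given in the text (threshold 100 inside region A3, threshold 50 elsewhere). In this sketch the user's region is represented as a boolean mask so that freehand shapes are covered; the mask, the function name `composite_with_region`, and the use of 2-D lists for images are assumptions for the example, not details given in the description.

```python
def composite_with_region(original, disparity, in_region,
                          thr_in, thr_out, repl_a, repl_b):
    """Keep near pixels; fill the rest from replacement image a or b.

    in_region -- boolean mask, True where the pixel lies inside region A3
    thr_in    -- parallax threshold used inside the region (field 24a)
    thr_out   -- parallax threshold used outside the region (field 24b)
    """
    result = []
    for y, row in enumerate(original):
        out_row = []
        for x, pixel in enumerate(row):
            if in_region[y][x]:
                thr, repl = thr_in, repl_a
            else:
                thr, repl = thr_out, repl_b
            # Parallax at or above the threshold means "in front of" it.
            out_row.append(pixel if disparity[y][x] >= thr else repl[y][x])
        result.append(out_row)
    return result


# Hypothetical 2x2 scene: top row inside A3, bottom row outside.
original = [["F22", "wall"], ["F21", "F23"]]
disparity = [[127, 64], [64, 20]]
in_region = [[True, True], [False, False]]
repl_a = [["a", "a"], ["a", "a"]]
repl_b = [["b", "b"], ["b", "b"]]
composite = composite_with_region(original, disparity, in_region,
                                  100, 50, repl_a, repl_b)
```

Note that a parallax value of 64 is replaced inside A3 (below 100) but kept outside it (at or above 50), which is how the same scene can yield different extractions per region.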

In this way, by using, for example, a GUI (Graphical User Interface), the user can freely perform image processing using stored images and distance information (parallax information) and display the result on any display device. This embodiment has described, as an example, a method that divides the image into two regions, the selected region and the rest, sets a threshold for each, and replaces the images; extending this, the number of selected regions can be increased, with a threshold set and an image replaced for each. In addition, although a single threshold value was set for each region (plane), the range to be extracted can be narrowed further by varying the threshold within a region, as with the threshold curve B2 shown in FIG. 13.

(Embodiment 4)
FIG. 24 is a block diagram showing a configuration example of an imaging device according to Embodiment 4 of the present invention. The imaging device 7 according to Embodiment 4 is an example of the imaging device of the present invention and comprises the image processing apparatus 1 according to Embodiment 1 with an imaging unit 17, a distance measuring unit 18, and a replacement image storage unit 19 added. Components identical to those of the image processing apparatus 1 in Embodiment 1 described above are denoted by the same reference numerals, and their description is omitted.

The imaging unit 17 captures subjects such as people and backgrounds and outputs an image. It includes an image sensor, such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, that converts received light into an electrical signal to form an image, and an optical system, such as a lens, for focusing light from the subject onto the image sensor. The imaging unit 17 outputs the captured image to the control unit 11 (threshold setting unit, extracted image generation unit). The distance measuring unit 18 measures the distance between the imaging device 7 and the subject using a ranging method such as TOF (Time of Flight) and outputs the measurement results to the control unit 11 (distance information interpolation unit). The control unit 11 interpolates the plurality of pieces of distance information acquired from the distance measuring unit 18 and calculates distance information on the distance in the depth direction at each of a plurality of positions in the original image.
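The text does not state which interpolation method the control unit 11 uses. As one simple possibility, the sketch below linearly interpolates sparse distance samples along a single image row and holds the nearest sample's value at the edges. The function name `interpolate_row` and the dictionary of samples keyed by pixel column are assumptions for illustration.

```python
import bisect


def interpolate_row(samples, width):
    """Fill a row of `width` distances from sparse samples {x: distance}.

    Positions between two samples are linearly interpolated; positions
    outside the sampled span take the value of the nearest sample.
    """
    xs = sorted(samples)
    if not xs:
        raise ValueError("at least one distance sample is required")
    row = []
    for x in range(width):
        if x <= xs[0]:
            row.append(float(samples[xs[0]]))
        elif x >= xs[-1]:
            row.append(float(samples[xs[-1]]))
        else:
            i = bisect.bisect_right(xs, x)   # xs[i-1] <= x < xs[i]
            x0, x1 = xs[i - 1], xs[i]
            w = (x - x0) / (x1 - x0)
            row.append(samples[x0] * (1 - w) + samples[x1] * w)
    return row


# Hypothetical measurements at columns 0 and 4 of a 6-pixel-wide row.
dense = interpolate_row({0: 1.0, 4: 3.0}, 6)
```

Applying the same idea along both image axes would give distance information at every pixel position from a coarser measurement grid.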

The replacement image storage unit 19 stores replacement images to be combined with the subject image extracted from the original image, and outputs a replacement image to the control unit 11. The other operations of the imaging device 7 are the same as those of the image processing apparatus 1 in Embodiment 1 described above, so their description is omitted.

As described above, the imaging device 7 in this embodiment includes the imaging unit 17 and the distance measuring unit 18, so the user can acquire the original image and the distance information corresponding to it without preparing a separate imaging device. In addition, because the user can use the replacement images stored in the replacement image storage unit 19, image processing can be performed without separately preparing a replacement image.

(Embodiment 5)
FIG. 25 is a block diagram showing the configuration of an imaging device according to Embodiment 5 of the present invention. The imaging device 8 according to this embodiment is an example of the imaging device of the present invention and comprises the image processing apparatus 1 according to Embodiment 1 with a pair of imaging units 17a and 17b and a replacement image storage unit 19 added. Components identical to those of the image processing apparatus 1 in Embodiment 1 described above are denoted by the same reference numerals, and the description of their configuration and operation is omitted.

The imaging units 17a and 17b are arranged in stereo. A stereo arrangement is one in which the two imaging units 17a and 17b are placed side by side with their optical axes substantially parallel. In this embodiment the two imaging units 17a and 17b have the same configuration as an example, but imaging units that differ in configuration, such as resolution or angle of view, may be used as long as both capture the same area and correspondences between pixels can be established. Each of the imaging units 17a and 17b captures subjects such as people and backgrounds and outputs an image, and includes an image sensor, such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, that converts received light into an electrical signal to form an image, and an optical system, such as a lens, for focusing light from the subject onto the image sensor. The imaging unit 17a outputs its captured image to the control unit 11 (distance calculation unit, threshold setting unit, extracted image generation unit), and the imaging unit 17b likewise outputs its captured image to the control unit 11.

The control unit 11 calculates the distance between the imaging device 8 and the subject from the two images received from the imaging units 17a and 17b, and generates a depth map based on the calculated distances. The control unit 11 then corrects the depth map as necessary and stores the corrected depth map in the RAM 11c. The replacement image storage unit 19 stores replacement images to be combined with the subject image extracted from the original image, and outputs a replacement image to the control unit 11. The other operations are the same as in Embodiment 1 described above, so their description is omitted.

Here the two imaging units 17a and 17b are assumed to be arranged side by side in the left-right direction, but the parallax value can also be calculated when they are arranged one above the other; in that case, the captured images are scanned vertically instead of horizontally. An example in which the distance is calculated from the parallax value has been shown, but the parallax value can also be used as the distance information without calculating the distance. In that case, the control unit 11 generates a disparity map rather than a depth map, and the operation of the imaging device 8 is the same as that shown in Modification 2 of Embodiment 1 described above.
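This passage does not give the formula used to convert a parallax value into a distance. For a rectified stereo pair with parallel optical axes, the standard pinhole-camera relation is Z = f * B / d, where f is the focal length in pixels, B is the baseline between the two imaging units, and d is the disparity in pixels. The sketch below assumes that model; the function name and parameter names are hypothetical.

```python
import math


def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Convert a disparity in pixels to a distance in metres (Z = f * B / d).

    Assumes a rectified, parallel-axis stereo pair. A disparity of zero
    corresponds to a point at infinity.
    """
    if disparity_px < 0:
        raise ValueError("disparity must be non-negative")
    if disparity_px == 0:
        return math.inf
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 800 px focal length, 0.25 m baseline.
near = disparity_to_depth(100, 800, 0.25)
far = disparity_to_depth(50, 800, 0.25)
```

Because distance is inversely proportional to disparity under this model, a larger disparity means a nearer subject, which is why a threshold comparison on a disparity map runs in the opposite direction from one on a depth map.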

As described above, the imaging device 8 in this embodiment includes the imaging units 17a and 17b, so, as in Embodiment 4, the user can acquire the original image and the distance information corresponding to it without preparing a separate imaging device. In addition, because the user can use the replacement images stored in the replacement image storage unit 19, image processing can be performed without separately preparing a replacement image.

(Embodiment 6)
In the embodiments described above, the image processing apparatuses 1 and 5 are described as stand-alone image processing apparatuses, but they can also be incorporated into other equipment, for example an information processing apparatus such as a television receiver, a mobile phone, or a personal computer (PC).
FIG. 26 is a block diagram showing a configuration example of a television receiver according to Embodiment 6 of the present invention. The television receiver 9 according to Embodiment 6 includes a control unit 11, an external interface unit 12, a tuner unit 91, a signal processing unit 92, an audio output unit 93, a display unit 94, and an operation unit 95. The television receiver 9 incorporates the image processing apparatus 1 according to Embodiment 1. Components similar to those of the image processing apparatus 1 are denoted by the same reference numerals.

In the control unit 11, the CPU 11a reads a control program stored in advance in the ROM 11b into the RAM 11c and executes it. The control unit 11 thereby controls the operation of each hardware unit of the television receiver 9 and causes the entire apparatus to function as the image processing apparatus of the present invention.
The function of the external interface unit 12 is the same as that of the external interface unit 12 included in the image processing apparatus 1, so its description is omitted.

The tuner unit 91 is a digital tuner for receiving digital broadcast signals and is connected to the antenna AN. The tuner unit 91 detects the radio wave received by the antenna AN according to the broadcast channel selected by the user, for example via the operation unit 95, acquires a broadcast signal from the obtained broadcast wave, and sends the acquired broadcast signal to the signal processing unit 92.

The signal processing unit 92 separates the broadcast signal acquired by the tuner unit 91 into a video signal (RGB signal (R: red, G: green, B: blue)) and an audio signal. The signal processing unit 92 performs a predetermined decoding and decompression process on the separated audio signal and sends the obtained audio signal to the audio output unit 93. The audio output unit 93 amplifies the audio signal sent from the signal processing unit 92 and outputs sound based on the audio signal through a speaker (not shown).

The signal processing unit 92 performs a predetermined decoding and decompression process on the separated video signal. The signal processing unit 92 converts the video signal (Y signal and C signal) obtained by the decoding and decompression process into an RGB video signal (RGB signal). Then, for each color component of the video signal, it converts the input gradation (luminance) of each pixel into the output gradation (output luminance) corresponding to that input gradation, and sends the obtained video signal (output gradation) to the display unit 94.
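The two video steps described above, color conversion followed by a per-component gradation conversion, can be sketched as follows. This is only an illustration: the source does not specify the conversion coefficients or the gradation curve, so the full-range 8-bit BT.601 coefficients, the gamma-shaped lookup table, and all function names here are assumptions.

```python
# Sketch: convert one Y/Cb/Cr pixel to RGB, then map each color component
# through an input-to-output gradation lookup table.
# Assumptions (not from the source): 8-bit full-range BT.601 coefficients
# and a gamma-curve lookup table.

def clamp8(value):
    """Round and clamp a value to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))

def ycbcr_to_rgb(y, cb, cr):
    """Convert a full-range 8-bit YCbCr pixel to RGB (BT.601 coefficients)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)

def build_gradation_lut(gamma):
    """Build a 256-entry table mapping input gradation to output gradation."""
    return [clamp8(255 * (level / 255) ** gamma) for level in range(256)]

def convert_pixel(y, cb, cr, luts):
    """Apply color conversion, then one gradation table per color component."""
    rgb = ycbcr_to_rgb(y, cb, cr)
    return tuple(lut[c] for lut, c in zip(luts, rgb))

# Hypothetical usage: the same gamma table for R, G, and B.
lut = build_gradation_lut(2.2)
print(convert_pixel(128, 128, 128, (lut, lut, lut)))
```

Using one table per color component mirrors the text's "for each color component" wording; a real receiver would typically hold separately tuned tables for R, G, and B.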

The display unit 94 integrates the display interface unit 13 and the display device 3 of the image processing apparatus 1. When the television receiver 9 functions as a normal television receiver, the display unit 94 displays the video signal sent from the signal processing unit 92 on a liquid crystal module or the like provided in the display unit 94 according to a predetermined timing.
When the television receiver 9 functions as an image processing apparatus, the display unit 94 displays an original image, an extracted image, a composite image, or an operation screen.

The operation unit 95 integrates the operation interface unit 14 and the operation device 4 of the image processing apparatus 1. It accepts a function switching operation that selects whether the television receiver 9 functions as a normal television receiver or as an image processing apparatus.
When the television receiver 9 functions as a normal television receiver, the operation unit 95 accepts operations such as broadcast channel selection and volume control.
When the television receiver 9 functions as an image processing apparatus, the operation unit 95 is used to select an original image, select a depth map, designate divided regions, and so on.
Divided regions are designated with the cross button provided in the operation unit 95 (operation buttons consisting of an up button, a down button, a right button, a left button, an enter button, and the like).

The operation of the television receiver 9 when it functions as an image processing apparatus is the same as that of the image processing apparatus 1 according to Embodiment 1 described above, so its description is omitted.
In the same manner as in the present embodiment, the image processing apparatuses according to Modification 1, Modification 2, Embodiment 2, Embodiment 3, and Modification 3 described above can be incorporated into a television receiver.
Doing so provides the same effects as the image processing apparatuses according to Embodiment 1, Modification 1, Modification 2, Embodiment 2, Embodiment 3, and Modification 3.
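As a reminder of the processing the television receiver performs in this mode, the following minimal sketch extracts pixels by comparing distance information against a threshold that depends on the position in the image, and fills the remaining pixels from a replacement image. It is not the procedure of Embodiment 1 itself: the vertical-strip regions, the (lower, upper) threshold pair per region, and all names are assumptions made for illustration.

```python
# Sketch: extract pixels whose distance value lies inside a per-region
# threshold range, and fill everything else from a replacement image.
# Assumptions (not from the source): images are lists of rows, the regions
# are vertical strips bounded by x coordinates, and each region has a
# (lower, upper) threshold pair.

def region_index(x, boundaries):
    """Return the index of the vertical strip containing column x."""
    for index, right_edge in enumerate(boundaries):
        if x < right_edge:
            return index
    return len(boundaries)

def composite(original, distance, replacement, boundaries, thresholds):
    """Keep original pixels inside their region's range; replace the rest."""
    result = []
    for row_o, row_d, row_r in zip(original, distance, replacement):
        out_row = []
        for x, (pixel, dist, fill) in enumerate(zip(row_o, row_d, row_r)):
            lower, upper = thresholds[region_index(x, boundaries)]
            out_row.append(pixel if lower <= dist <= upper else fill)
        result.append(out_row)
    return result

# Hypothetical one-row image: "a" and "c" have the same distance value.
original = [["a", "b", "c", "d"]]
distance = [[10, 50, 10, 50]]
replacement = [["-", "-", "-", "-"]]
# Two strips split at x = 2: the left keeps values 0..20, the right 40..60.
print(composite(original, distance, replacement, [2], [(0, 20), (40, 60)]))
# prints [['a', '-', '-', 'd']]
```

In the example, "a" and "c" share the same distance value, yet only "a" is kept because the two pixels fall in regions with different thresholds. This is the case the abstract highlights: several subjects at the same depth, of which only one is to be extracted.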

(Embodiment 7)
As in Embodiment 6 described above, the imaging devices 7 and 8 can be incorporated into an information processing apparatus such as a television receiver, a mobile phone, or a personal computer (PC) instead of being used as standalone imaging devices. The only difference from Embodiment 6 is whether the incorporated device is an image processing apparatus or an imaging device, so the following description focuses on the differences.
FIG. 27 is a block diagram showing a configuration example of a television receiver according to Embodiment 7 of the present invention. The television receiver 10 according to Embodiment 7 includes a control unit 11, an imaging unit 17, a distance measuring unit 18, a replacement image storage unit 19, a tuner unit 91, a signal processing unit 92, an audio output unit 93, a display unit 94, and an operation unit 95. The television receiver 10 incorporates the imaging device 7 according to Embodiment 4. Components similar to those of the imaging device 7 or the television receiver 9 are denoted by the same reference signs.

The control unit 11 has the same functions as the control unit 11 included in the imaging device 7 and also performs control for causing the television receiver 10 to function as a normal television receiver.
The functions of the imaging unit 17, the distance measuring unit 18, and the replacement image storage unit 19 are the same as those of the imaging unit 17, the distance measuring unit 18, and the replacement image storage unit 19 included in the imaging device 7, so their description is omitted.
Similarly, the functions of the tuner unit 91, the signal processing unit 92, the audio output unit 93, and the display unit 94 are the same as those in the television receiver 9 according to Embodiment 6, so their description is omitted.
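The claims describe the imaging device as interpolating distance values measured at a plurality of positions to obtain distance information for positions on the captured image. A minimal sketch of one way to do this is bilinear interpolation over a coarse measurement grid. The interpolation method, the regular grid aligned to the image corners, and the function name are assumptions; the source does not state how the interpolation is performed.

```python
# Sketch: expand distances measured on a coarse grid into a per-pixel
# distance map by bilinear interpolation.
# Assumptions (not from the source): measurement points form a regular grid
# whose corners coincide with the image corners, the grid is at least 2 x 2,
# and the image is at least 2 x 2 pixels.

def interpolate_distance(samples, width, height):
    """Return a height x width map interpolated from a coarse sample grid."""
    rows, cols = len(samples), len(samples[0])
    result = []
    for y in range(height):
        # Position of this pixel row in sample-grid coordinates.
        gy = y * (rows - 1) / (height - 1)
        y0 = min(int(gy), rows - 2)
        fy = gy - y0
        out_row = []
        for x in range(width):
            gx = x * (cols - 1) / (width - 1)
            x0 = min(int(gx), cols - 2)
            fx = gx - x0
            # Blend horizontally along the two neighboring sample rows,
            # then blend those two results vertically.
            top = samples[y0][x0] * (1 - fx) + samples[y0][x0 + 1] * fx
            bottom = samples[y0 + 1][x0] * (1 - fx) + samples[y0 + 1][x0 + 1] * fx
            out_row.append(top * (1 - fy) + bottom * fy)
        result.append(out_row)
    return result

# Hypothetical 2 x 2 measurement grid expanded to a 3 x 3 distance map.
depth = interpolate_distance([[1.0, 3.0], [5.0, 7.0]], width=3, height=3)
print(depth[1][1])  # prints 4.0, the average of the four corner samples
```

The resulting per-pixel map can then be compared against position-dependent thresholds in the same way as distance information supplied with an original image.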

The operation unit 95 has the same functions as in the television receiver 9 described above and also has a function of receiving commands for the imaging unit from the user, such as releasing the shutter, and transmitting them to the control unit 11.
The operation of the television receiver 10 when it functions as an imaging device is the same as that of the imaging device 7 according to Embodiment 4 described above, so its description is omitted.
In the same manner as in the present embodiment, the imaging device 8 according to Embodiment 5 described above can be incorporated into a television receiver.
Doing so provides the same effects as the imaging devices according to Embodiments 4 and 5.

It should be understood that the embodiments described above are illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims rather than by the description above, and is intended to include all modifications within the meaning and scope equivalent to the claims.

DESCRIPTION OF SYMBOLS
1, 5 Image processing apparatus
7, 8 Imaging device
11 Control unit
11a CPU
11b ROM
11c RAM
12 External interface unit
13 Display interface unit
14 Operation interface unit
15 External storage device
16 Communication unit
17, 17a, 17b Imaging unit
18 Distance measuring unit
19 Replacement image storage unit
2 Imaging device
3 Display device
4 Operation device
6 Recording medium
6a Computer program
N Communication network
20 User operation screen
21 Processed image display section
22 Pointer
23 Parallax value display field
24a, 24b Threshold input field
25a Original image name display field
25b Disparity map name display field
25c Replacement image a display field
25d Replacement image b display field
26a, 26b, 26c, 26d Selection button
9, 10 Television receiver

Claims

An image processing apparatus that acquires an original image having distance information related to the distance in the image depth direction at each of a plurality of positions on the image, and a replacement image to replace a part of the original image, and that synthesizes an extracted image obtained by extracting a part of the acquired original image with the replacement image, the image processing apparatus comprising:
a threshold setting unit that sets a threshold for the distance information based on each position on the original image; and
an extracted image generation unit that generates the extracted image based on the set threshold and the distance information.

The image processing apparatus according to claim 1, wherein an image portion that has not been extracted from the original image is replaced with the replacement image.

The image processing apparatus according to claim 1, further comprising a region dividing unit that divides the original image into a plurality of regions, wherein
the threshold setting unit sets a threshold for each divided region, and
the extracted image generation unit generates an extracted image for each divided region.

The image processing apparatus according to claim 2, further comprising a region dividing unit that divides the original image into a plurality of regions, wherein
the threshold setting unit sets a threshold for each divided region,
the extracted image generation unit generates an extracted image for each divided region, and
a plurality of replacement images are acquired, and a different replacement image is used for each divided region.

An image processing method that acquires an original image having distance information related to the distance in the image depth direction at each of a plurality of positions on the image, and a replacement image to replace a part of the original image, and that synthesizes an extracted image obtained by extracting a part of the acquired original image with the replacement image, the image processing method comprising:
setting a threshold for the distance information based on each position on the original image; and
generating the extracted image based on the set threshold and the distance information.

A program that causes a computer to acquire an original image having distance information related to the distance in the image depth direction at each of a plurality of positions on the image, and a replacement image to replace a part of the original image, and to synthesize an extracted image obtained by extracting a part of the acquired original image with the replacement image, the program causing the computer to execute the steps of:
setting a threshold for the distance information based on each position on the original image; and
generating the extracted image based on the set threshold and the distance information.

An imaging device that includes an imaging unit that images a subject and a replacement image storage unit that stores replacement images, and that synthesizes an extracted image obtained by extracting a part of a captured image with a replacement image that replaces a part of the captured image, the imaging device comprising:
a distance measuring unit that measures the distance to the subject at a plurality of positions;
a distance information interpolation unit that calculates distance information related to the distance in the image depth direction at each of a plurality of positions on the captured image by interpolating the distance values of the plurality of positions measured by the distance measuring unit;
a threshold setting unit that sets a threshold for the distance information based on each position on the captured image; and
an extracted image generation unit that generates the extracted image based on the set threshold and the distance information.

An imaging device that includes a plurality of imaging units that image a subject and a replacement image storage unit that stores replacement images, and that synthesizes an extracted image obtained by extracting a part of a captured image with a replacement image that replaces a part of the captured image, the imaging device comprising:
a distance information calculation unit that calculates distance information related to the distance in the image depth direction at each of a plurality of positions on the captured image using a plurality of images captured by the plurality of imaging units;
a threshold setting unit that sets a threshold for the distance information based on each position on the captured image; and
an extracted image generation unit that generates the extracted image based on the set threshold and the distance information.

A television receiver comprising:
the image processing apparatus according to any one of claims 1 to 4;
a tuner unit that receives a television broadcast; and
a display unit that displays an image related to the television broadcast received by the tuner unit,
wherein the display unit displays an image acquired, extracted, generated, or synthesized by the image processing apparatus.

A television receiver comprising:
the imaging device according to claim 7 or 8;
a tuner unit that receives a television broadcast; and
a display unit that displays an image related to the television broadcast received by the tuner unit,
wherein the display unit displays an image captured, extracted, generated, or synthesized by the imaging device.