JP2013175821A

JP2013175821A - Image processing device, image processing method, and program

Info

Publication number: JP2013175821A
Application number: JP2012037714A
Authority: JP
Inventors: Yosuke Takada; 洋佑高田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2012-02-23
Filing date: 2012-02-23
Publication date: 2013-09-05

Abstract

PROBLEM TO BE SOLVED: To enable refocus processing that reproduces smooth blur while suppressing calculation cost. SOLUTION: In step S401, a user operates a shutter button 111 to capture an image. In step S402, the captured image is input to an image input unit 301, and development processing is performed by a development processing unit 302 to generate multi-viewpoint images. In step S403, a depth map is generated from the generated multi-viewpoint images. In step S404, the number of intermediate viewpoint images to be generated is calculated using the generated depth map. In step S405, that number of intermediate viewpoint images is generated using the multi-viewpoint images and the depth map. In step S408, an image synthesizing unit 308 synthesizes an image using the multi-viewpoint images generated in step S402, the intermediate viewpoint images generated in step S405, and refocus parameters set in step S407.

Description

The present invention relates to an image processing apparatus, an image processing method, and a program, and more specifically to an image processing apparatus, method, and program that generate an image from a plurality of images captured at different viewpoint positions.

Techniques have long existed for generating a new image by combining images captured from two or more viewpoints (hereinafter, multi-viewpoint images). For example, a refocus technique has been proposed in which a virtual focus plane and a blur amount are set for multi-viewpoint images and a new image is generated by combining them through image processing (see, for example, Patent Document 1 and Non-Patent Document 1). In the refocus technique, the images are combined so that subject positions coincide on the set focus plane, which yields an image that is in focus on that plane. Off the focus plane, the images are combined with their positions shifted relative to one another, which reproduces a blurred image there. The size of the blur can be controlled by adjusting the number of images combined and their weights. In Patent Document 1, the blur size is adjusted and characteristic blur is reproduced by weighted combination based on light ray information obtained when the multi-viewpoint images were captured. In Non-Patent Document 1, smooth blur is reproduced by estimating intermediate viewpoint images between the multi-viewpoint images.

Patent Document 1: JP 2009-159357 A

Non-Patent Document 1: Natsumi Kusumoto et al., "Defocus Control by Uncalibrated Synthetic Aperture Imaging", Proceedings of the Meeting on Image Recognition and Understanding (MIRU 2008), Japan, 2008, pp. 1057-1062

However, although the method of Patent Document 1 keeps the calculation cost low, it reproduces blur based only on light ray information from the multi-viewpoint capture, so it cannot reproduce smooth blur when the subject has depth. The method of Non-Patent Document 1, on the other hand, can reproduce smooth blur even when the subject has depth, but it generates intermediate viewpoint images between all of the multi-viewpoint images, so the number of generated images becomes large and the calculation cost increases.

The present invention is an image processing apparatus comprising: multi-viewpoint image input means for inputting multi-viewpoint image data acquired by an imaging apparatus having a plurality of imaging units; intermediate viewpoint image generation number calculating means for calculating, according to a feature amount of the multi-viewpoint images, the number of intermediate viewpoint images that must be generated to obtain a composite image satisfying a predetermined characteristic; intermediate viewpoint image generating means for generating, from the multi-viewpoint image data, intermediate viewpoint image data for the number of images calculated by the intermediate viewpoint image generation number calculating means; and image synthesizing means for generating composite image data using the intermediate viewpoint image data generated by the intermediate viewpoint image generating means.

According to the present invention, the number of intermediate viewpoint images to generate is calculated from a feature amount between the multi-viewpoint images, and the density of the intermediate viewpoint images is switched according to that number. This enables refocus processing that reproduces smooth blur while keeping the calculation cost low.

FIG. 1 is a diagram illustrating an example of the imaging apparatus in Embodiment 1.
FIG. 2 is a block diagram illustrating the internal configuration of the imaging apparatus in Embodiment 1.
FIG. 3 is a block diagram illustrating the configuration of the image processing in Embodiment 1.
FIG. 4 is a flowchart illustrating the flow of processing in Embodiment 1.
FIG. 5 is an example of a user interface for setting refocus parameters in Embodiment 1.
FIG. 6 is a flowchart illustrating the flow of processing in the depth map generation of Embodiment 1.
FIG. 7 is a diagram illustrating the relationship between the imaging position interval and the distance to the subject in Embodiment 1.
FIG. 8 is a flowchart illustrating the flow of processing in calculating the number of intermediate viewpoint images to generate in Embodiment 1.
FIG. 9 is a diagram illustrating an example of calculating the shift amount between multi-viewpoint images in Embodiment 1.
FIG. 10 is a diagram illustrating examples of the number of intermediate viewpoint images generated in Embodiment 1.
FIG. 11 is a flowchart illustrating the flow of processing of the image composition unit in Embodiment 1.
FIG. 12 is a diagram outlining the focus position and the shift amount in Embodiment 1.
FIG. 13 is a diagram for explaining the weighting coefficients in Embodiment 1.
FIG. 14 is a flowchart illustrating the flow of processing in Embodiment 2.
FIG. 15 is a flowchart illustrating the flow of processing in calculating the similarity between images in Embodiment 2.
FIG. 16 is a diagram illustrating examples of the number of intermediate viewpoint images generated in Embodiment 2.
FIG. 17 is a flowchart illustrating the flow of processing in Embodiment 3.
FIG. 18 is a flowchart illustrating the frequency analysis of an image in Embodiment 3.
FIG. 19 is a schematic diagram illustrating the frequency characteristics of multi-viewpoint images in Embodiment 3.
FIG. 20 is a diagram illustrating examples of the number of intermediate viewpoint images generated in Embodiment 3.
FIG. 21 is a diagram illustrating an example of the back surface of the imaging apparatus in Embodiment 1.

[Embodiment 1]
FIGS. 1 and 21 are diagrams illustrating an example of the imaging apparatus according to the present embodiment. As illustrated in FIG. 1, the housing 101 of the imaging apparatus includes multi-eye imaging units 102 to 110. As illustrated in FIG. 21, a shutter button 111 for starting operation is provided on the top of the imaging apparatus 101. On the back surface of the imaging apparatus 101 are operation buttons 112 for operating the imaging apparatus 101 and a display unit 113 for displaying images and a graphical user interface.

FIG. 2 is a block diagram illustrating the internal configuration of the imaging apparatus 101 in the present embodiment. Each of the multi-eye imaging units 102 to 110 includes a zoom lens, a focus lens, a shake correction lens, an aperture, a shutter, an optical low-pass filter, an IR cut filter, a color filter, and an image sensor such as a CMOS or CCD sensor. With this configuration, the amount of light from the subject focused through the lenses is detected. The A/D conversion unit 201 converts the amount of subject light detected by the multi-eye imaging units 102 to 110 into digital signal values. The image processing unit 202 performs image processing on the converted digital signal values; the details are described later. The D/A conversion unit 203 converts the digital signal values into an analog signal. The encoder unit 204 converts the digital signal values into a file format known in this technical field, such as JPEG. The media I/F 205 is an interface for connecting to a PC/media 206 (for example, a hard disk, a memory card, a CF card, an SD card, or a USB memory).

The CPU 207 is involved in all of the processing of each component; it sequentially reads and interprets instructions stored in the ROM 208 and the RAM 209 and executes processing according to the results. The imaging system control unit 210 controls the multi-eye imaging units 102 to 110 as instructed by the CPU 207, for example by focusing, opening the shutter, and adjusting the aperture. The control unit 211 performs various kinds of control, such as starting and ending the overall processing, in response to user instructions from the shutter button 111, the operation buttons 112, and the like. The character generation unit 212 generates characters, graphics, and the like and displays them on the display unit 113. The display unit 113 is typically a liquid crystal display and displays image data, characters, graphical interfaces, and the like received from the character generation unit 212 and the D/A conversion unit 203. The display unit 113 may also have a touch screen function, in which case user instructions given through the touch screen can be input to the control unit 211. Although this embodiment uses nine multi-eye imaging units as illustrated in FIG. 1, the number is not limited to nine as long as there are two or more imaging units.

FIG. 3 is a block diagram illustrating the functions that execute the image processing of the present embodiment. The image input unit 301, which serves as the multi-viewpoint image input means, receives image data captured by the multi-eye imaging units 102 to 110 and generated as digital signal values by the A/D conversion unit 201. The development processing unit 302 applies processing such as demosaicing, white balance, noise reduction, gamma correction, and sharpening to the image data input through the image input unit 301.

The intermediate viewpoint image generation number calculation unit 303 calculates the optimum number of intermediate viewpoint images to generate, that is, the minimum number needed to reproduce smooth blur for the given multi-viewpoint images. If there are too few intermediate viewpoint images, grid-like noise appears when the multi-viewpoint image data and the intermediate viewpoint image data are combined. Conversely, the number of intermediate viewpoint images needed generally differs from one set of multi-viewpoint images to another, so uniformly generating a large amount of intermediate viewpoint image data, as in the conventional approach, reproduces smooth blur but incurs wasted calculation cost. To solve this problem, a feature amount of the multi-viewpoint images, or between the multi-viewpoint images, is calculated as an index, and the optimum number of intermediate viewpoint images to interpolate between the multi-viewpoint images is calculated from that feature amount.

The intermediate viewpoint image generation unit 304 generates intermediate viewpoint images from the developed multi-viewpoint image data, based on the number calculated by the intermediate viewpoint image generation number calculation unit 303. The refocus parameter input unit 305 receives the focus position 306 and the blur size 307 specified by the user with the operation buttons 112. The image composition unit 308 performs image composition using the multi-viewpoint images, the intermediate viewpoint images, the focus position 306, and the blur size 307. The image output unit 309 either has the composite image data converted into an analog signal by the D/A conversion unit 203 and displayed on the display unit 113, or has it encoded by the encoder unit 204 and output to the PC/media 206.

FIG. 4 is a flowchart illustrating the processing flow of the imaging apparatus according to the present embodiment. In this embodiment, the number of intermediate viewpoint images to generate is calculated using a depth map as the feature amount. Here, a depth map is depth information, that is, information on the distance from the imaging apparatus to the subjects recorded in an image, mapped onto the image. This embodiment uses the depth map to determine the number of intermediate viewpoint images, but the feature amount is not limited to it. That is, because this embodiment determines the number of intermediate viewpoint images so that the accuracy of the blur processing can be optimized according to the distance between subjects, it uses some index of that distance, such as a depth map; other indices known in this technical field can also be used.

In step S401, the user operates the shutter button 111 to capture images. In step S402, the image data captured in step S401 is input to the image input unit 301 and developed by the development processing unit 302 to generate multi-viewpoint image data.

In step S403, a depth map is generated from the multi-viewpoint images generated in step S402. In step S404, the number of intermediate viewpoint images to generate is calculated using the depth map generated in step S403 (steps S403 and S404 are described in detail later).

In step S405, intermediate viewpoint image data is generated from the multi-viewpoint image data generated in step S402 and the depth map generated in step S403, based on the number of intermediate viewpoint images calculated in step S404. Specifically, the multi-viewpoint images are first divided into block regions of an arbitrary size, and pattern matching that searches for corresponding regions block by block is performed to find the pixel (corresponding pixel) in the target image that corresponds to the pixel of interest. The search method is not limited to this; any method that finds corresponding pixels across the multi-viewpoint images may be used. Next, intermediate viewpoint image data at a virtual camera position is generated using the corresponding pixels found.

Any method that generates viewpoint image data at the virtual camera position from the corresponding pixels may be used. For example, because the positions of the cameras are known, the depth of each corresponding pixel can be calculated by triangulation, and an arbitrary image position at the virtual camera position can then be calculated from the depth information of the surrounding corresponding pixels. In step S406, an image for the user to use when setting the refocus parameters is displayed on the display unit 113. The displayed image is preferably one that captures the entire target subject; for example, the image captured by the multi-eye imaging unit 106 is displayed. In step S407, the focus position and the blur size, which are the parameters needed for refocusing, are set.

FIG. 5 shows an example of a user interface for setting the parameters needed for refocusing. As illustrated in FIG. 5, in addition to the parameter-setting image displayed in step S406, the display unit simultaneously shows a bar 501 for specifying the blur size and a pointer for setting the focus position 502. The user sets the focus position 306 and the blur size 307, the parameters needed for refocusing, by operating the operation buttons 112 or the bar 501 and the pointer 502 displayed on the display unit 113.

In step S408, the image composition unit 308 performs image composition using the multi-viewpoint image data generated in step S402, the intermediate viewpoint image data generated in step S405, and the refocus parameters set in step S407. Step S408 is described in detail later. Finally, in step S409, the image output unit 309 displays the image data composed in step S408 on the display unit 113 or sends it to the PC/media 206; the output is not limited to these, and any output form known in this technical field may be used.

<Generation of the depth map>
This section describes the generation of the depth map in step S403. The depth map is generated by estimating the depth of the captured scene from the multi-viewpoint images. An existing depth estimation method is used; for example, the stereo method or the multi-baseline stereo method is applicable. In this embodiment, depth is estimated by the stereo method to generate the depth map.

FIG. 6 is a flowchart illustrating the flow of the depth map generation processing. In step S601, two images to be used for the processing are selected from the multi-viewpoint images. All pairs of multi-viewpoint images may be selected and the processing described below performed on every pair, or the processing may be performed only on the single pair that has the longest baseline among the pairs in which the same subject appears in both images. In the stereo method, measurement accuracy generally increases as the change in parallax per change in depth increases. Because parallax grows with baseline length, a longer baseline improves the accuracy of the depth map generated below. This embodiment describes, as an example, the case in which the depth map is calculated from the image captured by the left-center camera unit 105 of the imaging apparatus 101 and the image captured by the right-center camera unit 107. Hereinafter, the former is called the reference image and the latter the target image.

In step S602, the depth values of the image to be processed are initialized. In step S603, it is determined whether the depth value calculated in step S605 has been obtained for every pixel. If depth values have been obtained for all pixels, the process proceeds to step S607 and the output processing is executed; if any pixel lacks a depth value, the process proceeds to step S604 to obtain the depth value for the pixel of interest. In step S604, the multi-viewpoint images are divided into block regions of an arbitrary size, and pattern matching that searches for corresponding regions block by block is performed to find the pixel (corresponding pixel) in the target image that corresponds to the pixel of interest.
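The block-based pattern matching of step S604 can be sketched as follows. This is a minimal illustration, not the method of the embodiment: it assumes grayscale images stored as lists of rows, a sum-of-absolute-differences cost, and a search along the same row toward smaller column indices (that is, a target camera located to the right of the reference camera). The function names and the `half` and `max_disparity` parameters are hypothetical.

```python
def block_sad(img_a, img_b, row, col_a, col_b, half):
    """Sum of absolute differences between two (2*half+1)-square blocks."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(img_a[row + dr][col_a + dc] - img_b[row + dr][col_b + dc])
    return total


def find_corresponding_column(reference, target, row, col, half=2, max_disparity=16):
    """Return the column in `target` whose block best matches the block
    centered at (row, col) in `reference`.

    Assumption: the cameras are horizontally aligned, so the search stays
    on the same row and only tries col, col - 1, ..., col - max_disparity.
    The caller must keep (row, col) at least `half` pixels from the border.
    """
    best_col, best_cost = col, float("inf")
    for disparity in range(max_disparity + 1):
        candidate = col - disparity
        if candidate - half < 0:  # block would leave the image
            break
        cost = block_sad(reference, target, row, col, candidate, half)
        if cost < best_cost:
            best_cost, best_col = cost, candidate
    return best_col
```

The difference between `col` and the returned column is the parallax in pixels for that pixel of interest, which is the quantity the subsequent depth calculation relies on.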

Next, in step S605, the depth value d for the pixel of interest is obtained from the pixel of interest and the corresponding pixel found in step S604. As can be understood from FIG. 7, the depth value d is expressed by the following equation in terms of α, β, and b. Here, α is calculated from the horizontal angle of view of the camera unit 105, the imaging position of the reference image, and the coordinates of the pixel of interest. β is calculated from the horizontal angle of view of the camera unit 107, the imaging position of the target image, and the coordinates of the corresponding pixel in the target image. b is the distance between the camera units 105 and 107 and is calculated from the imaging positions of the reference image and the target image. x and z are the axes in the horizontal direction and the depth direction, respectively.
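The equation itself does not appear in this text. As a hedged sketch, if α and β are taken to be the interior angles between the baseline and the lines of sight from camera units 105 and 107 to the subject (an assumption about FIG. 7, not something stated here), the law of sines gives d = b·sin α·sin β / sin(π − α − β). The function name below is hypothetical.

```python
import math


def triangulate_depth(alpha, beta, baseline):
    """Perpendicular distance from the baseline to the subject.

    Assumption: alpha and beta (radians) are the interior angles of the
    triangle formed by the two cameras and the subject, measured at each
    camera between the baseline and the line of sight.
    """
    apex = math.pi - alpha - beta  # angle at the subject
    if alpha <= 0 or beta <= 0 or apex <= 0:
        raise ValueError("angles do not form a valid triangle")
    # Law of sines: distance from camera 1 to the subject is
    # baseline * sin(beta) / sin(apex); project it onto the depth axis.
    return baseline * math.sin(alpha) * math.sin(beta) / math.sin(apex)
```

For example, a subject directly in front of the first camera (α = π/2) at depth 3 with a baseline of 4 corresponds to β = atan2(3, 4), and the function recovers a depth of 3 up to floating-point error.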

In step S606, the pixel of interest is updated, and the process returns to step S603 to repeat the same processing. When it is determined in step S603 that depth values have been obtained for all pixels, a depth map whose pixel values are the depth values of the reference image is output in step S607.

<Calculation of the number of intermediate viewpoint images to generate>
This section describes the calculation of the number of intermediate viewpoint images in step S404 of FIG. 4. In this embodiment, a depth range is obtained from the depth map generated in step S403, and the number is calculated from that depth range. An image with a large depth range has large parallax, so blur cannot be reproduced accurately unless the intermediate viewpoint images are generated correspondingly densely. Conversely, an image with a small depth range has small parallax, so fewer intermediate viewpoint images suffice. The calculation of the depth range, which indicates the extent of the subject's depth, and of the number of intermediate viewpoint images is described below with reference to the flowchart in FIG. 8.

This embodiment is described with reference to FIG. 9, using as an example the image captured by the center camera unit 106 of the imaging apparatus 101 and the image captured by the right-center camera unit 107. In this embodiment, the depth range D, calculated as follows from the maximum and minimum distances from the imaging apparatus to the subjects, is used as the depth range, but the depth range is not limited to this. In FIG. 9, b is the distance between the multi-viewpoint images, w is the number of horizontal pixels, θ_w is the horizontal angle of view, and d is the depth value. First, in step S801, the depth map is input. Next, in step S802, the pixel value of each pixel (i, j) of the depth map input in step S801 is taken as the distance value d(i, j), and the minimum distance value d_min and the maximum distance value d_max are calculated. Then, in step S803, the number of pixels of parallax between the multi-viewpoint images is calculated as a shift amount from the d_min and d_max calculated in step S802. The shift amount S_min of the subject at d_min, that is, the subject nearest the camera in the scene, can be obtained by the following equation.

Similarly, the shift amount S_max of the subject at d_max, that is, the subject farthest from the camera in the scene, can be obtained by the following equation.

The depth range D [pix] is then calculated from S_min and S_max by the following equation.

In this embodiment, the depth range D [pix] rounded up to an integer is used as the number of intermediate viewpoint images to generate. The calculation above used, as an example, multi-viewpoint images captured by the horizontally aligned camera units 106 and 107. For vertically aligned camera units, the depth range D [pix], and from it the number of intermediate viewpoint images, can be calculated in the same way by replacing the number of horizontal pixels w with the number of vertical pixels h and the horizontal angle of view θ_w with the vertical angle of view θ_h in Equations (2) and (3).
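The equations for S_min, S_max, and D are not reproduced in this text. The sketch below shows one plausible form under an assumed pinhole camera model, in which a subject at depth d shifts by S = b·w / (2·d·tan(θ_w/2)) pixels between two cameras separated by b, and D = S_min − S_max. Both the model and the function names are assumptions for illustration; only the final rounding-up step is taken directly from the description.

```python
import math


def shift_in_pixels(depth, baseline, width_px, horizontal_fov):
    """Parallax in pixels of a subject at `depth` (assumed pinhole model).

    `depth` and `baseline` share one length unit; `horizontal_fov` is in radians.
    """
    focal_px = width_px / (2.0 * math.tan(horizontal_fov / 2.0))
    return focal_px * baseline / depth


def intermediate_view_count(depth_map, baseline, width_px, horizontal_fov):
    """Number of intermediate viewpoint images: the depth range D rounded up."""
    depths = [d for row in depth_map for d in row]
    s_min = shift_in_pixels(min(depths), baseline, width_px, horizontal_fov)  # nearest subject
    s_max = shift_in_pixels(max(depths), baseline, width_px, horizontal_fov)  # farthest subject
    depth_range = s_min - s_max  # D [pix], never negative
    return math.ceil(depth_range)
```

With a flat scene (all depths equal) the depth range is zero and no intermediate viewpoint images are requested. For vertically aligned cameras the same functions apply with the vertical pixel count and vertical angle of view passed in place of the horizontal ones.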

FIG. 10 uses a graph to show the relationship between the depth range and the number of intermediate viewpoint images generated. In FIG. 10, the squares drawn with solid lines represent multi-viewpoint images, and the circles drawn with dotted lines represent intermediate viewpoint images. In other words, the portions that the original multi-viewpoint images cannot cover are filled in by the intermediate viewpoint images drawn with dotted lines; because (C) uses more intermediate viewpoint images than (B), it fills the gaps more finely and reproduces smoother blur. The horizontal axis of the graph represents the depth range and the vertical axis represents the number of intermediate viewpoint images generated, showing that the number increases as the depth range widens.

Therefore, in the present embodiment, the number of intermediate viewpoint images generated increases as the depth range D becomes wider, as shown at (b) and (c) of the graph in FIG. 10, so that intermediate viewpoint image data is generated more densely as the depth range D becomes wider, as shown in images (B) and (C). Here, the numbers are 3 and 5 for FIGS. 10(b) and 10(c), respectively, as an example, but the number of intermediate viewpoint images generated is not limited to these values.

On the other hand, as shown in FIG. 10(a), no intermediate viewpoint image is needed when the depth range D is extremely narrow. By calculating the number of intermediate viewpoint images from the depth range D in this way, only as much intermediate viewpoint image data as is needed has to be generated, and the calculation cost can be kept to a minimum. The present embodiment describes the case where the relationship between the depth range D and the number of intermediate viewpoint images generated is linear and monotonically increasing, that is, where the graph of FIG. 10 is a straight line rising to the right. However, the relationship only needs to be monotonically increasing, so it does not have to be linear and may be nonlinear.

<Image composition>
The image composition in step S408 of FIG. 4 will now be described in detail. FIG. 11 is a flowchart showing the flow of the composition process, and FIG. 12 is a diagram outlining the focus position and the shift amount. First, in step S1101, the multi-viewpoint image data created in step S402 and the intermediate viewpoint image data generated in step S405 are input. In step S1102, the focus position 502 specified by the user is acquired, as shown in FIG. 12(a), based on the focus position 306 set in step S407. The focus position 502 specified in FIG. 5 corresponds to the focus positions 1211 and 1212 in FIG. 12. In step S1103, based on the focus position acquired in step S1102, the shift amounts Δx_i and Δy_i between the focus positions 1211 and 1212 of the multi-viewpoint images and intermediate viewpoint images, as shown in FIG. 12(b), are calculated by the following equation.

Here, b_m and c_m are the horizontal and vertical distances, respectively, between the multi-viewpoint images and intermediate viewpoint images, d is the depth value, θ_w is the horizontal angle of view, θ_h is the vertical angle of view, w is the number of horizontal pixels, and h is the number of vertical pixels. In step S1104, weight coefficients are calculated based on the blur size control value set in step S407.
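The shift equation itself is not reproduced in this text. The following is only a sketch under the assumption of a simple pinhole model, in which a scene width of 2·d·tan(θ_w/2) at depth d maps onto w pixels, so that a horizontal distance b_m spans b_m·w / (2·d·tan(θ_w/2)) pixels, and likewise for the vertical direction. The function name `focus_shift` is hypothetical, and the actual equation used in the embodiment may differ.

```python
import math

def focus_shift(b_m, c_m, d, theta_w, theta_h, w, h):
    """Pixel shift (dx, dy) for a view offset (b_m, c_m) at focus depth d.

    Assumed pinhole relation, not quoted from the source: the scene width
    2*d*tan(theta_w/2) is imaged onto w pixels (and likewise vertically).
    Angles are in radians; b_m, c_m and d share the same length unit.
    """
    if d <= 0:
        raise ValueError("depth must be positive")
    dx = b_m * w / (2.0 * d * math.tan(theta_w / 2.0))
    dy = c_m * h / (2.0 * d * math.tan(theta_h / 2.0))
    return dx, dy
```

Under this assumption the shift grows with the distance between viewpoints and shrinks as the focus depth increases, which matches the general behavior of parallax described in the text.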

In general, for images captured by a multi-view camera, the parallax increases as the distance between the cameras increases. To align the images accurately at a given focus position, the images are shifted based on the shift amounts Δx_i and Δy_i between the multi-viewpoint images and intermediate viewpoint images. The images then overlap with an offset everywhere except at the focus position, and superimposing these offset images produces blur. The blur can therefore be controlled by how strongly images acquired by cameras at a given distance from the camera used as the reference for composition (hereinafter, the composition reference camera) are blended in. That is, the blur size can be controlled by using, as the blur control value, weight coefficients that depend on each camera's distance from the composition reference camera.

FIG. 13 shows an example of the weight coefficient w_k corresponding to the blur size control value. The weight coefficient w_k of the present embodiment follows a Gaussian function centered on the composition reference camera. As shown in FIG. 13, the weight coefficient w_k becomes larger closer to the composition reference camera and smaller farther from it. When the multi-viewpoint image data and the intermediate viewpoint image data are combined to generate composite image data, the weight coefficient w_k of each image is divided by the sum of the weight coefficients w_k of all images so that the weights sum to 1. Finally, in step S1105, the composite image I(x,y) is calculated by the following equation, using the focus position shift amounts Δx_i and Δy_i between the multi-viewpoint images and intermediate viewpoint images calculated in step S1103 and the weight coefficient w_k, which depends on the shooting mode and the blur size control value.
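The Gaussian weighting and normalization described above can be sketched as follows. The function name `gaussian_weights` and the parameter `sigma` are hypothetical; `sigma` stands in for the blur size control value, whose exact relation to the Gaussian width is not stated in the text.

```python
import math

def gaussian_weights(distances, sigma):
    """Normalized Gaussian weights for views at the given distances
    from the composition reference camera (sigma is an assumed
    stand-in for the blur size control value)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    raw = [math.exp(-(dist ** 2) / (2.0 * sigma ** 2)) for dist in distances]
    total = sum(raw)
    # Divide each weight by the sum so that the weights add up to 1.
    return [r / total for r in raw]
```

A larger `sigma` gives distant views more weight, which in this sketch corresponds to a larger blur.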

Here, J_i denotes the multi-viewpoint images and intermediate viewpoint images, D is the distance of each multi-viewpoint image or intermediate viewpoint image from the central image, and m is the total number of multi-viewpoint images and intermediate viewpoint images. In the present embodiment, the number of intermediate viewpoint images generated is calculated from the depth map of the entire scene, but it may instead be calculated for each region in the scene.
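The composition equation is not reproduced in this text. A common shift-and-add form, assumed here purely for illustration, is I(x,y) = Σ_i w_i · J_i(x + Δx_i, y + Δy_i) with the weights summing to 1. The sketch below further assumes grayscale images stored as nested lists, shifts already rounded to integers, and border pixels clamped; the sign convention of the shift and the function name `composite` are likewise assumptions.

```python
def composite(images, shifts, weights):
    """Weighted shift-and-add of equally sized grayscale images.

    images:  list of 2D lists (rows of pixel values)
    shifts:  list of integer (dx, dy) per image (assumed rounded beforehand)
    weights: per-image weights, assumed already normalized to sum to 1
    """
    height, width = len(images[0]), len(images[0][0])
    out = [[0.0] * width for _ in range(height)]
    for img, (dx, dy), wk in zip(images, shifts, weights):
        for y in range(height):
            sy = min(max(y + dy, 0), height - 1)      # clamp at the border
            for x in range(width):
                sx = min(max(x + dx, 0), width - 1)   # clamp at the border
                out[y][x] += wk * img[sy][sx]
    return out
```

Pixels whose shifted positions coincide across views stay sharp, while pixels at other depths are averaged from offset positions and appear blurred.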

As described above, the present invention calculates the number of intermediate viewpoint images to generate according to the depth range of the depth map and generates that amount of intermediate viewpoint image data. This provides an imaging apparatus that can reduce the calculation cost by not generating unnecessary intermediate viewpoint image data.

[Embodiment 2]
The first embodiment described a method in which the image analysis unit 303 calculates the depth range from a depth map generated from the multi-viewpoint images and calculates the number of intermediate viewpoint images to generate according to that value. In the present embodiment, the image analysis unit 303 calculates the number of intermediate viewpoint images to generate based on the similarity between the multi-viewpoint images.

The internal configuration of the image processing unit in the present embodiment is the same as in the first embodiment. FIG. 14 shows the processing flow of the imaging apparatus of the present embodiment. Steps S1401, S1402, and S1405 to S1409 are the same as steps S401, S402, and S405 to S409 of FIG. 4 described in the first embodiment. That is, steps S1403 and S1404 differ from the first embodiment. In step S1403, the similarity between images is calculated from the multi-viewpoint images generated in step S1402. In step S1404, the number of intermediate viewpoint images to generate is calculated based on the similarity calculated in step S1403.

<Calculation of similarity between images>
The method for calculating the similarity between images in the present embodiment is described below. The similarity between images is calculated using the multi-viewpoint images. The calculation method is described with reference to the flowchart shown in FIG. 15.

First, in step S1501, two images to be used for processing are selected from the multi-viewpoint images. It is assumed that the same subject appears in both images. In the present embodiment, an image captured by the central camera unit 106 of the imaging apparatus 101 and an image captured by the horizontally adjacent camera unit 107 are used as an example. Hereinafter, the former is referred to as the reference image and the latter as the target image.

In step S1502, for all pixels of the reference image I(x,y) and the target image J(x,y), the difference value K between the pixel values at the same pixel position is calculated by the following equation.

In step S1503, the similarity between the reference image I(x,y) and the target image J(x,y) is calculated. Any method known in this technical field can be used to calculate the similarity; for example, PSNR, the correlation coefficient, or the mean squared error can be applied. In the present embodiment, PSNR is used as the similarity. PSNR is calculated by the following equation.

Here, MAX is the maximum pixel value that the reference image can take, and MSE is the mean squared error. m and n are the vertical and horizontal sizes of the reference image, respectively.
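The equations for K and PSNR are not reproduced in this text. The sketch below assumes the standard definitions, namely a per-pixel difference K = I(x,y) − J(x,y), MSE equal to the mean of K² over all m × n pixels, and PSNR = 10·log10(MAX² / MSE) in decibels. The function name `psnr` and the default of 255 for MAX (8-bit data) are assumptions.

```python
import math

def psnr(reference, target, max_value=255.0):
    """PSNR in dB between two equally sized grayscale images (2D lists)."""
    m, n = len(reference), len(reference[0])   # vertical, horizontal size
    squared_error = 0.0
    for row_i, row_j in zip(reference, target):
        for pi, pj in zip(row_i, row_j):
            k = pi - pj                        # per-pixel difference K
            squared_error += k * k
    mse = squared_error / (m * n)
    if mse == 0:
        return math.inf                        # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the two viewpoint images are more alike, so it serves directly as the similarity used in the following section.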

<Calculation of the number of intermediate viewpoint images generated>
The calculation of the number of intermediate viewpoint images to generate in the present embodiment is described next. In the present embodiment, this number is calculated based on the similarity between images.

FIG. 16 is a diagram for explaining the calculation of the number of intermediate viewpoint images to generate. In FIG. 16, the rectangles drawn with solid lines represent multi-viewpoint images, and the circles drawn with dotted lines represent intermediate viewpoint images. The horizontal axis of the graph represents the similarity (PSNR), and the vertical axis represents the number of intermediate viewpoint images generated. When the similarity is low, the difference between the multi-viewpoint images is large, so intermediate viewpoint images must be generated more densely than when the similarity is high. For example, when the PSNR is 20 or 30, more intermediate viewpoint images are needed as the similarity decreases, as shown at (a) and (b) in FIG. 16, and processing uses a larger, denser set of intermediate viewpoint images, as shown in FIGS. 16(A) and 16(B).

In contrast, when the similarity is high, the difference between the multi-viewpoint images is small, so the intermediate viewpoint positions to be created may be uniform and need not be very dense. For example, when the PSNR is 50, the similarity is very high, so no intermediate viewpoint image is generated, as shown at (c) in FIG. 16, and only the original multi-viewpoint images are used, as shown in FIG. 16(C). The present embodiment describes the case where the relationship between the similarity between the multi-viewpoint images and the number of intermediate viewpoint images generated is linear and monotonically decreasing, that is, where the graph of FIG. 16 is a straight line falling to the right. However, the relationship only needs to be monotonic, so it does not have to be linear and may be nonlinear.
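One way to sketch a straight-line relationship of this kind is a clamped linear map from PSNR to a count. The PSNR of 50 at which no intermediate viewpoint image is generated follows the example in the text; the lower bound `psnr_low`, the ceiling `max_count`, and the function name `count_from_psnr` are hypothetical values chosen for illustration.

```python
import math

def count_from_psnr(psnr_db, psnr_low=20.0, psnr_high=50.0, max_count=6):
    """Linearly decreasing map from PSNR to a view count (assumed parameters)."""
    if psnr_db >= psnr_high:
        return 0                    # very similar views: no intermediate images
    if psnr_db <= psnr_low:
        return max_count            # very different views: densest sampling
    fraction = (psnr_high - psnr_db) / (psnr_high - psnr_low)
    return math.ceil(fraction * max_count)
```

Any other non-increasing function of PSNR could be substituted without changing the rest of the processing.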

In the present embodiment, the number of intermediate viewpoint images generated is calculated from the similarity of the entire scene, but it may instead be calculated for each region in the scene.

As described above, the present invention calculates the number of intermediate viewpoint images to generate according to the similarity between images and generates that number of intermediate viewpoint images. This provides an imaging apparatus that can reduce the calculation cost by not generating unnecessary intermediate viewpoint images.

[Embodiment 3]
In the present embodiment, the image analysis unit 303 calculates the number of intermediate viewpoint images to generate based on the frequency components obtained by frequency analysis of the multi-viewpoint image data. The internal configuration of the image processing unit in the present embodiment is the same as in the first and second embodiments. FIG. 17 shows the processing flow of the imaging apparatus of the present embodiment.

Steps S1701, S1702, and S1705 to S1709 are the same as steps S401, S402, and S405 to S409 of FIG. 4 described in the first embodiment. That is, steps S1703 and S1704 differ from the first embodiment. In step S1703, the multi-viewpoint image data generated in step S1702 is frequency-analyzed and converted to the spatial frequency domain. In step S1704, the number of intermediate viewpoint images to generate is calculated according to the spatial frequency components obtained in step S1703.

<Frequency analysis of images>
The frequency analysis of images in the present embodiment is described here. The frequency analysis of the image data is described with reference to the flowchart shown in FIG. 18.

First, in step S1801, the multi-viewpoint image data is converted to the spatial frequency domain, one image at a time. Here, the spatial frequency domain is a representation in which the image data is decomposed into sine waves and arranged by frequency.

Next, in step S1802, the frequency characteristics of the image data are calculated. When image data is converted to the spatial frequency domain, frequency components that reflect the properties of the image are obtained. An image with little overall change in shading contains many low-frequency components (curve (a) in FIG. 19), and an image that includes regions of large change contains many high-frequency components (curve (b) in FIG. 19). Image data inherently has components in both the horizontal and vertical directions, but here the sum of the two is simply treated as a single frequency component; the horizontal axis represents frequency and the vertical axis represents the energy of each frequency component.
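As a rough illustration of this kind of analysis, the sketch below applies a naive one-dimensional discrete Fourier transform to every row and every column of a grayscale image, sums the resulting energies, and reports the share of non-DC energy that lies above a cutoff. The function names, the `cutoff` parameter, and the choice of an energy ratio as the summary figure are all assumptions; the text does not specify the transform or the measure actually used.

```python
import cmath

def power_spectrum(signal):
    """Energy |X[k]|^2 of a naive 1D DFT for k = 0 .. N//2."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(value * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, value in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def high_frequency_ratio(image, cutoff=0.5):
    """Share of non-DC energy above `cutoff` (a fraction of the Nyquist bin).

    Row and column spectra are summed into one figure, as a stand-in for
    combining the horizontal and vertical components.
    """
    rows = [list(row) for row in image]
    cols = [list(col) for col in zip(*image)]
    high = total = dc = 0.0
    for line in rows + cols:
        spec = power_spectrum(line)
        nyquist = len(spec) - 1
        dc += spec[0]
        for k in range(1, len(spec)):
            total += spec[k]
            if k > cutoff * nyquist:
                high += spec[k]
    # Treat numerically negligible non-DC energy as "no detail".
    if total <= 1e-12 * (dc + 1.0):
        return 0.0
    return high / total
```

The naive transform costs O(N²) per line and is meant only for small inputs; a fast Fourier transform would normally be used in practice.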

<Calculation of the number of intermediate viewpoint images generated>
The calculation of the number of intermediate viewpoint images to generate in the present embodiment is described next. FIG. 20 is a diagram for explaining this calculation. In the present embodiment, the image is frequency-analyzed and the number is calculated from the resulting frequency components. An image containing many high-frequency components represents a complex scene, so intermediate viewpoint image data must be generated densely. Conversely, an image with few high-frequency components represents a simple scene, so few intermediate viewpoint images are needed. In other words, the number of intermediate viewpoint images to generate can be determined by how much of the frequency content consists of high-frequency components.

FIG. 20 shows an example of a graph of the relationship between the high-frequency components and the number of intermediate viewpoint images generated in the present embodiment. In FIG. 20, the rectangles drawn with solid lines represent multi-viewpoint images, and the circles drawn with dotted lines represent intermediate viewpoint images. The horizontal axis of the graph represents the frequency components, and the vertical axis represents the number of intermediate viewpoint images generated. When the multi-viewpoint image data contains almost no high-frequency components, as at (a) in FIG. 20, no intermediate viewpoint image data needs to be generated, as shown in FIG. 20(A). On the other hand, the more high-frequency components the multi-viewpoint image data contains, as at (b) and (c) in FIG. 20, the more intermediate viewpoint image data must be generated, as shown in FIGS. 20(B) and 20(C).

The present embodiment describes the case where the relationship between the frequency components and the number of intermediate viewpoint images generated is linear and monotonically increasing, that is, where the graph of FIG. 20 is a straight line rising to the right. However, the relationship only needs to be monotonically increasing, so it does not have to be linear and may be nonlinear. In the present embodiment, the number of intermediate viewpoint images generated is calculated from the high-frequency components among the frequency components of the entire scene, but it may instead be calculated for each region in the scene.
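To illustrate that a nonlinear but monotonically increasing relationship is also admissible, the sketch below maps a scalar high-frequency measure to a count with a square-root curve. It assumes such a measure is available as a ratio in the range 0 to 1; the curve, `max_count`, and the function name `count_from_high_frequency` are hypothetical.

```python
import math

def count_from_high_frequency(ratio, max_count=5):
    """Monotonically increasing, nonlinear map from a high-frequency
    energy ratio in [0, 1] to a view count (assumed curve)."""
    ratio = min(max(ratio, 0.0), 1.0)          # clamp to the valid range
    return math.ceil(max_count * math.sqrt(ratio))
```

With this curve the count rises quickly for small amounts of high-frequency content and levels off toward `max_count`.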

As described above, the present invention calculates the number of intermediate viewpoint images to generate according to the frequency components of the image data and generates that number of intermediate viewpoint images. This provides an imaging apparatus that can reduce the calculation cost by not generating unnecessary intermediate viewpoint image data.

[Other Embodiments]
In the first to third embodiments, the number of intermediate viewpoint images generated was calculated based on a feature amount of the multi-viewpoint images or between the multi-viewpoint images. These embodiments may be used in any combination. The processing may also be performed on a PC or the like.

The present invention can also be realized by executing the following processing: software (a program) that implements the functions of the embodiments described above is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of that system or apparatus reads and executes the program. The present invention can also be realized by a plurality of processors performing the processing in cooperation.

Claims

Multi-viewpoint image input means for inputting multi-viewpoint image data acquired by an imaging device having a plurality of imaging units;
Intermediate viewpoint image generation number calculating means for calculating the number of intermediate viewpoint images that need to be generated in order to obtain a composite image that satisfies a predetermined characteristic according to the feature amount of the multi-viewpoint image;
Intermediate viewpoint image generation means for generating intermediate viewpoint image data of the multi-viewpoint image data, the number of generations calculated by the intermediate viewpoint image generation number calculation means;
An image processing apparatus comprising: image synthesis means for generating synthesized image data using the intermediate viewpoint image data generated by the intermediate viewpoint image generation means.

The image processing apparatus according to claim 1, wherein the feature amount is a depth range indicating a depth of a subject obtained from the multi-viewpoint image.

The image processing apparatus according to claim 2, wherein the number of intermediate viewpoint images generated is greater at an intermediate viewpoint position where the depth range is wider than at an intermediate viewpoint position where the depth range is narrower.

The image processing apparatus according to claim 1, wherein the feature amount is a similarity between the multi-viewpoint images.

The image processing apparatus according to claim 4, wherein the number of intermediate viewpoint images generated is greater at an intermediate viewpoint position where the similarity is lower than at an intermediate viewpoint position where the similarity is higher.

The image processing apparatus according to claim 1, wherein the feature amount is a frequency component obtained by frequency analysis of the multi-viewpoint image data.

The image processing apparatus according to claim 6, wherein the number of intermediate viewpoint images generated is greater at an intermediate viewpoint position where the frequency component includes a high frequency component than at an intermediate viewpoint position where the frequency component does not include a high frequency component.

The image processing apparatus according to claim 1, wherein the predetermined characteristic is an accuracy of blurring of a non-focus surface when performing a refocusing process.

A multi-viewpoint image input step of inputting multi-viewpoint image data acquired by an imaging device having a plurality of imaging units;
An intermediate viewpoint image generation number calculating step for calculating the number of intermediate viewpoint images that need to be generated in order to obtain a composite image satisfying a predetermined characteristic according to the feature amount of the multi-viewpoint image;
An intermediate viewpoint image generation step of generating intermediate viewpoint image data of the multi-viewpoint image data, the number of generations calculated in the intermediate viewpoint image generation number calculation step;
An image composition step of generating composite image data using the intermediate viewpoint image data generated by the intermediate viewpoint image generation step.

A program for causing a computer to function as the image processing apparatus according to any one of claims 1 to 8.