JP2024005590A

JP2024005590A - Image processing device, imaging device, image processing method, and program

Info

Publication number: JP2024005590A
Application number: JP2022105837A
Authority: JP
Inventors: 太省森; Taisho Mori
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2022-06-30
Filing date: 2022-06-30
Publication date: 2024-01-17

Abstract

PROBLEM TO BE SOLVED: To provide image processing technology with which it is possible to acquire depth information regarding the depth direction of a subject with greater accuracy.

SOLUTION: An imaging element 101c of an imaging device 100 includes a plurality of photoelectric conversion units that each perform photoelectric conversion on light having passed through a different pupil partial region of an imaging optical system. The imaging device 100 can generate an image corresponding to each photoelectric conversion unit and thereby acquire a plurality of viewpoint images. The imaging device 100 captures the subject while changing the opening amount of a diaphragm 101b, then takes the difference between a viewpoint image captured with a first opening amount and a viewpoint image captured with a second opening amount that is smaller than the first opening amount, so as to generate a difference image. The imaging device 100 can then search for corresponding points using images that have little blur and a large image shift, and acquire depth information about the subject in the images.

SELECTED DRAWING: Figure 1

Description

本発明は、被写体の奥行方向の深度情報を取得する技術に関する。 The present invention relates to a technique for acquiring depth information of a subject in the depth direction.

撮像装置に具備される撮像素子が測距機能を有する場合、観賞用画像および被写体までの距離情報を取得することが可能である。距離情報に基づいて生成される３次元（以下、「３Ｄ」と記す）データは、ＶＲ（ＶｉｒｔｕａｌＲｅａｌｉｔｙ）、ＡＲ（ＡｕｇｍｅｎｔｅｄＲｅａｌｉｔｙ）、ＭＲ（ＭｉｘｅｄＲｅａｌｉｔｙ）の表示用コンテンツとして用いることができる。 When the image sensor of an imaging device has a distance measurement function, both an image for viewing and distance information to the subject can be obtained. Three-dimensional (hereinafter "3D") data generated from the distance information can be used as display content for VR (Virtual Reality), AR (Augmented Reality), and MR (Mixed Reality).

撮像光学系の異なる瞳部分領域をそれぞれ通過する光を２種類の画素部で捉え、それぞれの画素部により取得される画像から像ズレ量を算出することで距離情報の取得が可能である。２種類の画素部のうちの一方をＡ画素と表記し、他方をＢ画素と表記する。複数のＡ画素の出力から生成される画像をＡ画像といい、複数のＢ画素の出力から生成される画像をＢ画像という。例えば、Ａ画像とＢ画像との相関演算結果に基づいて撮像光学系の焦点検出および焦点調節制御を行うことができる。またＡ画像およびＢ画像を加算した加算画像に基づいて観賞用画像を生成することができる。特許文献１では、Ａ画像およびＢ画像におけるボケを有する像の像ズレ量を算出するために、絞りの開口の大きさ（開口量）を変化させて適した像ズレの方向を探索する方法が開示されている。 Distance information can be obtained by capturing the light passing through different pupil partial regions of the imaging optical system with two types of pixel sections and calculating the amount of image shift from the images obtained by the respective pixel sections. One of the two types of pixel sections is referred to as an A pixel, and the other as a B pixel. An image generated from the outputs of a plurality of A pixels is called an A image, and an image generated from the outputs of a plurality of B pixels is called a B image. For example, focus detection and focus adjustment control of the imaging optical system can be performed based on the result of a correlation calculation between the A image and the B image. Furthermore, an image for viewing can be generated from the sum of the A image and the B image. Patent Document 1 discloses a method of searching for a suitable image shift direction by changing the size of the aperture opening (opening amount), in order to calculate the image shift amount of blurred images in the A image and the B image.

特開２０１９－２００３４８号公報 Japanese Patent Laid-Open No. 2019-200348

従来の技術では、合焦範囲から外れた位置にある被写体の像に発生するボケが大きい場合に像ズレ方向の探索を行った際、どちらの方向の像ズレをもってしても像ズレ量の検出精度を高めることが困難である。 With the conventional technique, when the image of a subject located outside the in-focus range is heavily blurred, searching for the image shift direction does not help: whichever shift direction is used, it is difficult to improve the detection accuracy of the image shift amount.
本発明は、深度情報をより高精度に取得することが可能な画像処理技術の提供を目的とする。 An object of the present invention is to provide an image processing technique that can obtain depth information with higher accuracy.

本発明の実施形態の画像処理装置は、視点の異なる複数の視点画像から画像内における被写体の奥行方向の深度情報を取得することが可能な画像処理装置であって、撮像光学系における異なる瞳部分領域をそれぞれ通過する光に基づく複数の画像を取得する取得手段と、前記撮像光学系の絞りの開口量を変更する制御を行い、絞りの開口量が異なる状態で撮像された複数の画像を用いて前記深度情報を取得する制御を行う制御手段と、を備える。前記取得手段は、第１の開口量で撮像された第１および第２の視点画像と、前記第１の開口量よりも小さい第２の開口量で撮像された第３および第４の視点画像を取得し、前記制御手段は、前記第１の視点画像と前記第３の視点画像から第１の差分画像を生成し、前記第２の視点画像と前記第４の視点画像から第２の差分画像を生成し、前記第１および第２の差分画像のうちの１つ以上を用いて前記深度情報を取得する制御を行う。 An image processing device according to an embodiment of the present invention is capable of acquiring depth information in the depth direction of a subject in an image from a plurality of viewpoint images having different viewpoints. The device comprises acquisition means for acquiring a plurality of images based on light passing through respective different pupil partial regions of an imaging optical system, and control means for performing control to change the opening amount of the aperture of the imaging optical system and to acquire the depth information using a plurality of images captured with different opening amounts. The acquisition means acquires first and second viewpoint images captured with a first opening amount, and third and fourth viewpoint images captured with a second opening amount smaller than the first opening amount. The control means generates a first difference image from the first viewpoint image and the third viewpoint image, generates a second difference image from the second viewpoint image and the fourth viewpoint image, and performs control to acquire the depth information using one or more of the first and second difference images.

本発明によれば、深度情報をより高精度に取得することが可能な画像処理技術を提供することができる。 According to the present invention, it is possible to provide an image processing technique that allows depth information to be acquired with higher accuracy.

本実施形態に係る撮像装置の構成例を示す図である。 FIG. 1 is a diagram showing a configuration example of an imaging device according to the present embodiment.
本実施形態に係る撮像装置の外観を模式的に示す図である。 FIG. 2 is a diagram schematically showing the appearance of the imaging device according to the present embodiment.
瞳分割型撮像素子の説明図である。 FIG. 3 is an explanatory diagram of a pupil-division image sensor.
被写体からの光を画素部が受光する様子を示す図である。 FIG. 4 is a diagram showing how a pixel section receives light from a subject.
非合焦被写体からの光を画素部が受光する様子を示す図である。 FIG. 5 is a diagram showing how a pixel section receives light from an out-of-focus subject.
対応箇所が精度よく算出できない場合の画像例を示す図である。 FIG. 6 is a diagram showing an example of images for which corresponding locations cannot be calculated with high accuracy.
非合焦被写体からの光を画素部が受光する様子の別例を示す図である。 FIG. 7 is a diagram showing another example of how a pixel section receives light from an out-of-focus subject.
本実施形態における撮影状態を模式的に示す図である。 FIG. 8 is a diagram schematically showing shooting states in the present embodiment.
本実施形態における画像例を示す図である。 FIG. 9 is a diagram showing an example of images in the present embodiment.
ケラレの発生の様子を模式的に示す図である。 FIG. 10 is a diagram schematically showing how vignetting occurs.
本実施形態の処理を説明するフローチャートである。 FIG. 11 is a flowchart explaining the processing of the present embodiment.
本実施形態に係る表示部の画面例を示す図である。 FIG. 12 is a diagram showing a screen example of the display unit according to the present embodiment.
画像内の複数の領域とウィンドウを模式的に示す図である。 FIG. 13 is a diagram schematically showing a plurality of regions and windows in an image.
絞り小としたときの絞りと光束との関係を示す図である。 FIG. 14 is a diagram showing the relationship between the aperture and the light flux when the aperture is made small.

以下、本発明の実施形態について、図面を参照して詳細に説明する。図１は、画像処理装置の適用例としての撮像装置の構成例を示すブロック図である。撮像装置１００は視差画像に基づいて深度情報を取得することが可能である。視差画像は視点の異なる複数の画像（以下、視点画像という）により構成される。深度情報は、被写体の奥行方向の深さを表す情報である。 Embodiments of the present invention will be described in detail below with reference to the drawings. FIG. 1 is a block diagram showing a configuration example of an imaging device as an application example of an image processing device. The imaging device 100 can acquire depth information based on parallax images. A parallax image is composed of a plurality of images from different viewpoints (hereinafter referred to as viewpoint images). Depth information is information representing the depth of the subject in the depth direction.

撮像装置１００は撮像部１０１、演算部１０２、記憶部１０３、シャッターボタン１０４、操作部１０５、表示部１０６、制御部１０７を備える。撮像部１０１は、レンズ１０１ａ、絞り１０１ｂ、撮像素子１０１ｃを備える。レンズ１０１ａ、絞り１０１ｂは撮像光学系（結像光学系）を構成する。被写体からの光は撮像光学系を通過してから撮像素子１０１ｃにより受光される。撮像素子１０１ｃは光学像に対する光電変換を行い、電気信号を演算部１０２と制御部１０７に出力する。 The imaging device 100 includes an imaging unit 101, a calculation unit 102, a storage unit 103, a shutter button 104, an operation unit 105, a display unit 106, and a control unit 107. The imaging unit 101 includes a lens 101a, an aperture 101b, and an image sensor 101c. The lens 101a and the aperture 101b constitute an imaging optical system (image-forming optical system). Light from the subject passes through the imaging optical system and is then received by the image sensor 101c. The image sensor 101c performs photoelectric conversion on the optical image and outputs an electrical signal to the calculation unit 102 and the control unit 107.

演算部１０２は、撮像された画像に対して現像処理を行い、現像処理後のデータを記憶部１０３に記憶させる。記憶部１０３は、例えば着脱可能な記憶媒体（ＳＤカード等）、あるいは、撮像装置１００の内部に備えられた記憶媒体を有する。 The calculation unit 102 performs development processing on the captured image, and stores the data after the development processing in the storage unit 103. The storage unit 103 includes, for example, a removable storage medium (such as an SD card) or a storage medium provided inside the imaging device 100.

シャッターボタン１０４は、撮像装置１００の使用者が押下することにより撮影指示を行うための操作部材であり、操作信号を制御部１０７に出力する。操作部１０５は、使用者が撮像装置１００への操作指示を行う際に使用可能な操作部材を備え、操作指示の信号を制御部１０７に出力する。 The shutter button 104 is an operation member that is pressed by the user of the imaging apparatus 100 to issue a shooting instruction, and outputs an operation signal to the control unit 107 . The operation unit 105 includes an operation member that can be used when a user issues an operation instruction to the imaging apparatus 100, and outputs a signal of the operation instruction to the control unit 107.

表示部１０６は、液晶式ディスプレイ等の表示デバイスを有する。表示部１０６は、制御部１０７の制御指令にしたがって撮像画像やＧＵＩ（グラフィック・ユーザ・インターフェース）用画像等を画面に表示する。 The display unit 106 includes a display device such as a liquid crystal display. The display unit 106 displays captured images, GUI (graphic user interface) images, and the like on the screen according to control commands from the control unit 107.

制御部１０７は、例えばＣＰＵ（中央演算処理装置）を備え、撮像装置全体の制御を行う。制御部１０７は各種インターフェース処理や演算部の制御を行う。 The control unit 107 includes, for example, a CPU (central processing unit), and controls the entire imaging device. The control unit 107 also performs various interface processing and controls the calculation unit 102.

本実施形態では、撮像装置の本体部に一体化された撮像光学系を有する構成例を示すが、撮像装置の本体部に、撮像光学系を有するレンズユニットを装着可能な構成への適用が可能である。 Although this embodiment shows a configuration example in which the imaging optical system is integrated into the main body of the imaging device, the invention can also be applied to a configuration in which a lens unit having an imaging optical system is attachable to the main body of the imaging device.

図２は、撮像装置１００の外観を模式的に示す図である。図２（Ａ）は撮像装置１００を正面から見た場合の外観を示す。レンズ１０１ａは被写体からの光を集光し、撮像素子１０１ｃ上に光学像を生成する。撮像素子１０１ｃは光学像に対する光電変換によって電気信号を出力する。撮像光学系における集光範囲は、制御部１０７が絞り１０１ｂを制御することによって決定される。つまり集光範囲を通過した光が撮像素子１０１ｃに到達する光となる。使用者がシャッターボタン１０４を押下することにより撮影が開始される。撮影された画像のデータは演算部１０２によって現像処理された後、記憶部１０３に記憶される。 FIG. 2 is a diagram schematically showing the appearance of the imaging device 100. FIG. 2A shows the appearance of the imaging device 100 when viewed from the front. The lens 101a collects light from the subject and generates an optical image on the image sensor 101c. The image sensor 101c outputs an electrical signal by photoelectrically converting the optical image. The light collection range in the imaging optical system is determined by the controller 107 controlling the aperture 101b. In other words, the light that has passed through the condensing range becomes the light that reaches the image sensor 101c. Photographing is started when the user presses the shutter button 104. The data of the photographed image is developed by the calculation unit 102 and then stored in the storage unit 103.

図２（Ｂ）は、撮像装置１００を背面から見た場合の外観を示す。撮像装置１００の背面部に設けられた操作部１０５の一部を示す。例えば、使用者は撮影条件の設定や、高精度測距モードの開始や終了等の指示を行うときに操作部１０５を使用する。表示部１０６は、撮影時の構図を表示し、また各種設定時に項目や映像検索結果等を表示する。また、タッチパネルを有する表示デバイスの場合、シャッターボタン１０４や操作部１０５の機能を表示部１０６が有する機能によって兼用することができる。使用者は、表示部１０６の画面へのタッチ操作によって撮影や設定の操作を行うことができる。シャッターボタン１０４や操作部１０５等のハードウェア部品を装備する必要がないので、表示部１０６の大画面化や操作性の向上に好適である。 FIG. 2(B) shows the appearance of the imaging device 100 when viewed from the back. A part of the operation unit 105 provided on the back side of the imaging device 100 is shown. For example, the user uses the operation unit 105 when setting photographing conditions or instructing to start or end the high-precision ranging mode. The display unit 106 displays the composition at the time of shooting, and also displays items, video search results, etc. at the time of various settings. Further, in the case of a display device having a touch panel, the functions of the shutter button 104 and the operation unit 105 can be shared by the function of the display unit 106. The user can perform shooting and setting operations by touching the screen of the display unit 106. Since there is no need to equip hardware components such as the shutter button 104 and the operation section 105, this is suitable for increasing the screen size of the display section 106 and improving operability.

図３（Ａ）は、瞳分割型の撮像素子１０１ｃにおける画素部の内部構造を示す模式図である。撮像素子１０１ｃの受光面にわたって多数配列されるマイクロレンズのうち、ひとつのマイクロレンズ１１１を有する画素部１０８を示す。画素部１０８はサブ画素１０９とサブ画素１１０を備える。 FIG. 3A is a schematic diagram showing the internal structure of a pixel portion in the split-pupil image sensor 101c. A pixel section 108 having one microlens 111 among a large number of microlenses arranged over the light-receiving surface of the image sensor 101c is shown. The pixel section 108 includes a sub-pixel 109 and a sub-pixel 110.

被写体から発した光のうち、撮像光学系における異なる瞳部分領域をそれぞれ通過した光を、サブ画素１０９とサブ画素１１０によって捉え、それぞれの画像における被写体像のズレ量から距離を算出することができる。以下、サブ画素１０９をＡ画素と呼び、サブ画素１１０をＢ画素と呼ぶこととする。 Among the light emitted from the subject, the light that has passed through different pupil partial regions in the imaging optical system is captured by the sub-pixel 109 and the sub-pixel 110, and the distance can be calculated from the amount of shift of the subject image in each image. . Hereinafter, the sub-pixel 109 will be referred to as an A-pixel, and the sub-pixel 110 will be referred to as a B-pixel.

図３（Ｂ）は、撮像素子１０１ｃの画素配列を示す。Ａ画素、Ｂ画素は撮像面の全面にわたって交互に配列されている。Ａ画素、Ｂ画素によって撮像光学系における異なる瞳部分領域を通過した光をそれぞれ受光することができる。つまり複数のＡ画素の出力に基づくＡ画像（第１の視点画像）と、複数のＢ画素の出力に基づくＢ画像（第２の視点画像）を取得することが可能である。Ａ画像、Ｂ画像における被写体の像をそれぞれＡ像、Ｂ像と呼ぶことにする。 FIG. 3B shows the pixel array of the image sensor 101c. A pixels and B pixels are arranged alternately over the entire imaging surface. The A pixels and the B pixels receive light that has passed through different pupil partial regions of the imaging optical system. That is, it is possible to obtain an A image (first viewpoint image) based on the outputs of a plurality of A pixels and a B image (second viewpoint image) based on the outputs of a plurality of B pixels. The subject images appearing in the A image and the B image will be referred to as image A and image B, respectively.

画素部１０８は、共有のマイクロレンズ１１１を通過した光をＡ画素、Ｂ画素で受光するように設計されている。光がＡ画素に受光されるか、またはＢ画素に受光されるかは入射角１１２によって決定される。図３（Ａ）にて入射角１１２の正負に関しては光軸方向を基準として、右方向からの入射光の角度符号をプラスとし、左方向からの入射光の角度符号をマイナスとして定義する。Ａ画素、Ｂ画素は入射光に対する受光感度の角度特性が異なっており、互いに異なる瞳部分領域を通過した光のみを受光することができる。 The pixel unit 108 is designed so that the A pixel and the B pixel receive the light that has passed through the shared microlens 111. Whether light is received by the A pixel or the B pixel is determined by the angle of incidence 112. In FIG. 3A, regarding the sign of the incident angle 112, with the optical axis direction as a reference, the angle sign of incident light from the right direction is defined as plus, and the angle sign of incident light from the left direction is defined as minus. The A pixel and the B pixel have different angular characteristics of light reception sensitivity with respect to incident light, and can only receive light that has passed through different pupil partial regions.

図３（Ｃ）は、光の入射角に対するＡ画素とＢ画素の受光感度を例示したグラフである。横軸は入射角を表し、縦軸は受光感度を表す。実線の受光感度曲線１１３はＡ画素の受光感度を表す。入射角がプラスの範囲（右方向からの入射光の角度範囲）において受光感度が相対的に高く、入射角がマイナスの範囲（左方向からの入射光の角度範囲）において受光感度が相対的に低い。これに対して、破線の受光感度曲線１１４はＢ画素の受光感度を表す。入射角がマイナスの範囲において受光感度が相対的に高く、入射角がプラスの範囲において受光感度が相対的に低い。画素部１０８内部の導波路内の適当な位置に光を吸収する部材を配置することで、このような受光感度曲線を得ることができる。 FIG. 3C is a graph illustrating the light-receiving sensitivity of the A pixel and the B pixel with respect to the incident angle of light. The horizontal axis represents the incident angle, and the vertical axis represents the light-receiving sensitivity. The solid light-receiving sensitivity curve 113 represents the sensitivity of the A pixel: it is relatively high in the range where the incident angle is positive (incident light from the right) and relatively low in the range where the incident angle is negative (incident light from the left). The broken light-receiving sensitivity curve 114 represents the sensitivity of the B pixel: it is relatively high in the negative incident angle range and relatively low in the positive incident angle range. Such sensitivity curves can be obtained by arranging a light-absorbing member at an appropriate position within the waveguide inside the pixel section 108.

撮像装置１００は、撮像光学系における異なる瞳部分領域をそれぞれ通過する光を、撮像素子１０１ｃが有する複数の光電変換部によって画像信号に変換する。あたかも視点が異なる位置にある２台のカメラで撮影したかのように、被写体までの距離に応じた像ズレを有する視差画像を取得することができる。被写体に対するＡ像とＢ像のウィンドウマッチング処理等により対応箇所を検出し、画像上の像ズレ量に基づいて被写体までの距離情報を算出することができる。本実施形態では、深度情報の一例として、撮像装置から被写体までの距離情報を取得する処理を示す。その他の深度情報には像ズレ量マップ、デフォーカス量マップがある。像ズレ量マップは複数の視点画像から算出することができ、デフォーカス量マップは像ズレ量に所定の変換係数を乗算して算出することができる。 The imaging device 100 converts light that passes through different pupil partial regions in the imaging optical system into image signals using a plurality of photoelectric conversion units included in the imaging element 101c. It is possible to obtain a parallax image having an image shift according to the distance to the subject, as if the images were captured by two cameras having different viewpoints. Corresponding locations can be detected by window matching processing of images A and B for the subject, and distance information to the subject can be calculated based on the amount of image shift on the image. In this embodiment, a process of acquiring distance information from an imaging device to a subject will be described as an example of depth information. Other depth information includes an image shift amount map and a defocus amount map. The image shift amount map can be calculated from a plurality of viewpoint images, and the defocus amount map can be calculated by multiplying the image shift amount by a predetermined conversion coefficient.
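The window matching and conversion steps described above can be illustrated with a minimal one-dimensional sketch. It assumes a sum-of-absolute-differences (SAD) cost, integer shifts, and a made-up conversion coefficient; the source names window matching only as one example and does not specify the cost function. The function names `find_image_shift` and `shift_to_defocus` are hypothetical.

```python
def sad(window_a, window_b):
    """Sum of absolute differences between two equal-length windows."""
    return sum(abs(x - y) for x, y in zip(window_a, window_b))


def find_image_shift(image_a, image_b, center, half_window, max_shift):
    """Return the integer shift s that minimizes the SAD between a window of
    image_a around `center` and the window of image_b around `center + s`."""
    lo, hi = center - half_window, center + half_window + 1
    reference = image_a[lo:hi]
    best_shift, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        # Skip candidate windows that would fall outside image_b.
        if lo + s < 0 or hi + s > len(image_b):
            continue
        cost = sad(reference, image_b[lo + s:hi + s])
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift


def shift_to_defocus(shift_px, conversion_coefficient):
    """Defocus amount = image shift amount x conversion coefficient.
    The coefficient value used below is purely illustrative."""
    return shift_px * conversion_coefficient


# One row of an A image and a B image: the same profile, displaced by 3 pixels.
row_a = [0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0, 0, 0, 0]
row_b = [0, 0, 0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0]

shift = find_image_shift(row_a, row_b, center=5, half_window=3, max_shift=5)
defocus = shift_to_defocus(shift, conversion_coefficient=0.5)
print(shift, defocus)
```

Repeating the search for a window around every pixel would yield an image shift amount map, and multiplying that map by the coefficient would yield a defocus amount map.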

図４（Ａ）は、被写体からの光をＡ画素、Ｂ画素でそれぞれ受光する様子を示す。被写体から発した光のうち、レンズ１０１ａにて第１の領域（右側領域）を通過する光束１１５と、レンズ１０１ａの第２の領域（左側領域）を通過する光束１１６を示す。光束１１５に関しては、画素部１０８に対してプラスの入射角で光が入射するので、Ａ画素は受光感度が高いが、Ｂ画素は受光感度が低い（図３（Ｃ）の受光感度曲線１１３，１１４参照）。これとは逆に光束１１６に関しては、画素部１０８に対してマイナスの入射角で光が入射するので、Ｂ画素は受光感度が高いが、Ａ画素は受光感度が低い。Ａ画素の出力から生成されるＡ画像と、Ｂ画素の出力から生成されるＢ画像はそれぞれレンズ１０１ａの異なる領域を通過した光束に基づく像が写った画像である。 FIG. 4A shows how light from an object is received by the A pixel and the B pixel, respectively. Of the light emitted from the subject, a light beam 115 passing through the first area (right side area) of the lens 101a and a light beam 116 passing through the second area (left side area) of the lens 101a are shown. Regarding the light flux 115, since the light enters the pixel portion 108 at a positive incident angle, the A pixel has high light receiving sensitivity, but the B pixel has low light receiving sensitivity (the light receiving sensitivity curve 113 in FIG. 3(C), 114). On the contrary, regarding the light beam 116, since the light enters the pixel portion 108 at a negative incident angle, the B pixel has high light receiving sensitivity, but the A pixel has low light receiving sensitivity. The A image generated from the output of the A pixel and the B image generated from the B pixel output are images based on light fluxes that have passed through different areas of the lens 101a, respectively.

図４（Ａ）は、焦点が合った被写体（合焦被写体）に対してＡ画素とＢ画素が受光する様子を示す。合焦被写体からの光束１１５と光束１１６は撮像素子１０１ｃ上で集光するので、実質的に像ズレが発生しない。 FIG. 4A shows how the A pixel and the B pixel receive light from a focused subject (focused subject). Since the light beams 115 and 116 from the focused object are converged on the image sensor 101c, substantially no image shift occurs.

次に、本発明の基本原理となるＡ画像とＢ画像との間での像ズレとボケとの関係について説明する。図４（Ｂ）を参照して、焦点が合っていない被写体の撮影について説明する。この場合、光束１１５と光束１１６は撮像素子１０１ｃからずれた位置に集光する。 Next, the relationship between image shift and blur between the A image and the B image, which is the basic principle of the present invention, will be explained. Photographing an out-of-focus subject will be described with reference to FIG. 4(B). In this case, the light flux 115 and the light flux 116 are focused at a position shifted from the image sensor 101c.

図４（Ｂ）は、図４（Ａ）の合焦位置よりもレンズ１０１ａに近い位置にある被写体（非合焦被写体）に対してＡ画素とＢ画素が受光する様子を示す。非合焦被写体からの光束１１５と光束１１６は、集光前に撮像素子１０１ｃで受光される。このため、光束１１５の受光画素（Ａ画素）と光束１１６の受光画素（Ｂ画素）とが離れた位置となるので、像ズレ２０１が発生する。像ズレ２０１の大きさ（像ズレ量）は、被写体に焦点が合う合焦位置からのズレ量が大きい程大きくなる。像ズレ量を取得して被写体までの距離に変換する処理が行われる。しかし、合焦位置からのズレ量が大きいた被写体に対しては、像ズレ量を正しく算出することが困難となる。 FIG. 4(B) shows how the A pixel and the B pixel receive light from a subject (unfocused subject) located at a position closer to the lens 101a than the in-focus position in FIG. 4(A). The light beams 115 and 116 from the unfocused object are received by the image sensor 101c before being condensed. Therefore, the light-receiving pixel (A pixel) for the light beam 115 and the light-receiving pixel (B pixel) for the light beam 116 are separated from each other, so that image shift 201 occurs. The magnitude of the image shift 201 (image shift amount) increases as the shift amount from the in-focus position where the subject is focused increases. Processing is performed to obtain the amount of image shift and convert it into a distance to the subject. However, for a subject that has a large amount of deviation from the in-focus position, it is difficult to accurately calculate the amount of image deviation.

図５は、図４（Ｂ）よりもさらにレンズ１０１ａに近い位置にある非合焦被写体に対してＡ画素とＢ画素が受光する様子を示す。図４（Ｂ）の場合、光束１１５と光束１１６はいずれも撮像素子上で集光していないのでＡ像とＢ像にはボケが発生するが、ボケ量は小さいため、像ズレ量を算出することは可能である。これに対して、図５の場合にはＡ像とＢ像には大きなボケが発生している。それぞれのボケ量３０１とボケ量３０２は大きいので、ウィンドウマッチング処理等によりＡ画像とＢ画像との間で対応箇所を高精度に探索することが困難となる。例えば、光束１１５と光束１１６のうち、レンズ１０１ａの中心側を通過する光線に基づく像の位置を対応箇所として像ズレ量３０３が算出される。像ズレ量３０３は比較的小さい値になる。また光束１１５と光束１１６のうち、レンズ１０１ａの中心から離れた位置を通過する光線に基づく像の位置を対応箇所として像ズレ量３０４が算出される。像ズレ量３０４は大きい値になり、Ａ画像とＢ画像との間の対応箇所が精度よく定まらない。 FIG. 5 shows how the A pixel and the B pixel receive light from an out-of-focus subject located at a position closer to the lens 101a than in FIG. 4(B). In the case of FIG. 4(B), since neither the light flux 115 nor the light flux 116 is focused on the image sensor, blurring occurs in images A and B, but since the amount of blur is small, the amount of image shift is calculated. It is possible to do so. On the other hand, in the case of FIG. 5, large blurring occurs between the A image and the B image. Since the amount of blur 301 and the amount of blur 302 are large, it becomes difficult to search for corresponding locations between the A image and the B image with high precision by window matching processing or the like. For example, the image shift amount 303 is calculated by setting the position of the image based on the light beam passing through the center side of the lens 101a among the light beams 115 and 116 as corresponding points. The image shift amount 303 has a relatively small value. Further, the image shift amount 304 is calculated by setting the positions of the images based on the light beams that pass through a position away from the center of the lens 101a among the light beams 115 and 116 as corresponding positions. The image shift amount 304 becomes a large value, and the corresponding location between the A image and the B image cannot be determined with high accuracy.

図６は、Ａ画像とＢ画像との間で対応箇所が精度よく算出できない場合の例を説明する図である。上側にＡ画像４０１を示し、下側にＢ画像４０２を示す。Ａ画像４０１とＢ画像４０２にはボケ状態で被写体が写っており、Ａ画像４０１とＢ画像４０２との間の像ズレ量３０３を示す。 FIG. 6 is a diagram illustrating an example of a case where the corresponding locations cannot be calculated with high accuracy between the A image and the B image. An A image 401 is shown on the upper side, and a B image 402 is shown on the lower side. The A image 401 and the B image 402 show the subject in a blurred state, and the amount of image shift 303 between the A image 401 and the B image 402 is shown.

図７は、図４（Ｂ）よりもレンズ１０１ａから遠い位置にある非合焦被写体に対してＡ画素とＢ画素が受光する様子を示す。光束１１５と光束１１６は左右方向にて逆転して撮像素子１０１ｃに到達する。Ａ像とＢ像との像ズレの方向は図５の場合とは逆になるものの、Ａ画像とＢ画像との間で対応箇所の探索の精度が低下することは前記の説明と同様である。 FIG. 7 shows how the A pixel and the B pixel receive light from an out-of-focus subject located farther from the lens 101a than in FIG. 4(B). The light flux 115 and the light flux 116 reach the image sensor 101c with their left-right positions reversed. Although the direction of the image shift between image A and image B is opposite to that in FIG. 5, the accuracy of the search for corresponding locations between the A image and the B image is reduced in the same way as described above.

そこで、本実施形態の画像処理装置は光束１１５と光束１１６に関して、レンズ１０１ａの中心から離れた位置を通過する光線のみを用いてＡ像またはＢ像を生成する。ボケが少なくかつ像ズレ量の大きい画像を用いて像同士の対応箇所を探索することで、探索精度を向上させることができる。本実施形態の画像処理方法について、以下に説明する。 Therefore, the image processing apparatus of this embodiment generates the A image or the B image using only the light beams that pass through a position away from the center of the lens 101a regarding the light beams 115 and 116. Search accuracy can be improved by searching for corresponding locations between images using images with little blur and a large amount of image shift. The image processing method of this embodiment will be described below.

本実施形態では絞り１０１ｂの開口量を変更して２回の撮影が行われ、それぞれＡ画像とＢ画像が取得される。撮像装置１００が絞り１０１ｂを大きい開口量にして撮影したときのＡ画像データとＢ画像データをそれぞれ、Ａ_絞り大とＢ_絞り大と表記する。撮像装置１００が絞り１０１ｂを小さい開口量にして撮影したときのＡ画像データとＢ画像データをそれぞれ、Ａ_絞り小とＢ_絞り小と表記する。２つのＡ画像データの差をＡ_差と表記し、２つのＢ画像データの差をＢ_差と表記する。差分画像データＡ_差とＢ_差は、式（１）から求めることができる。 In this embodiment, two shots are taken with different opening amounts of the aperture 101b, and an A image and a B image are obtained in each shot. The A image data and B image data obtained when the imaging device 100 shoots with the aperture 101b set to a large opening amount are written as A_{large aperture} and B_{large aperture}, respectively. The A image data and B image data obtained when the imaging device 100 shoots with the aperture 101b set to a small opening amount are written as A_{small aperture} and B_{small aperture}, respectively. The difference between the two sets of A image data is written as A_{difference}, and the difference between the two sets of B image data is written as B_{difference}. The difference image data A_{difference} and B_{difference} can be obtained from equation (1).
A_{difference} = A_{large aperture} - A_{small aperture}, B_{difference} = B_{large aperture} - B_{small aperture} ... (1)
制御部１０７は、（式１）により生成されたＡ_差とＢ_差を用いて像の対応箇所を探索し、像ズレ量を算出する。図８を参照して具体的に説明する。 The control unit 107 uses A_{difference} and B_{difference} generated by equation (1) to search for corresponding locations in the images and calculates the image shift amount. This is explained in detail with reference to FIG. 8.
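The differencing step can be sketched as follows. This is a minimal illustration, assuming images are lists of rows with linear pixel values, that the small-aperture image is subtracted from the large-aperture image, and that negative results (for example from noise) are clamped to zero. The clamp and the function name `difference_image` are assumptions of this sketch, not details given in the source.

```python
def difference_image(wide_open, stopped_down):
    """Pixel-wise difference (wide_open - stopped_down) of two equal-size images.

    Negative results are clamped to zero; the clamp is an assumption of this
    sketch and is not stated in the source.
    """
    if len(wide_open) != len(stopped_down):
        raise ValueError("images must have the same number of rows")
    result = []
    for row_w, row_s in zip(wide_open, stopped_down):
        if len(row_w) != len(row_s):
            raise ValueError("rows must have the same length")
        result.append([max(w - s, 0) for w, s in zip(row_w, row_s)])
    return result


# Illustrative 2 x 4 pixel values for the A images of the two shots.
a_large = [[4, 6, 6, 4], [4, 7, 7, 4]]
a_small = [[1, 5, 5, 0], [2, 6, 8, 1]]

a_diff = difference_image(a_large, a_small)
print(a_diff)
```

The same function would be applied to the two B images to obtain the second difference image.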

図８は、撮像装置１００による撮影状態を模式的に示す図である。左上側に第１の撮影６０１の状態を示し、右上側に第２の撮影６０２の状態を示す。第１の撮影６０１では絞り１０１ｂが第１の開口量に設定されて撮影が行われ、Ａ_絞り大とＢ_絞り大が取得される。第２の撮影６０２では絞り１０１ｂが第２の開口量に設定されて撮影が行われる。第２の開口量は第１の開口量よりも小さい。第２の撮影６０２ではＡ_絞り小とＢ_絞り小が取得される。 FIG. 8 is a diagram schematically showing shooting states of the imaging device 100. The state of the first photographing 601 is shown on the upper left, and the state of the second photographing 602 on the upper right. In the first photographing 601, shooting is performed with the aperture 101b set to the first opening amount, and A_{large aperture} and B_{large aperture} are obtained. In the second photographing 602, shooting is performed with the aperture 101b set to the second opening amount, which is smaller than the first opening amount, and A_{small aperture} and B_{small aperture} are obtained.

光束１１５と光束１１６について第１の撮影６０１と第２の撮影６０２とを対比して説明する。第１の撮影６０１に比べて、第２の撮影６０２ではレンズ１０１ａの中心から離れた位置を通る光線が絞りによって部分的に遮断されている。第３の撮影６０３は、Ａ_差およびＢ_差と同等のＡ画像およびＢ画像が取得される場合の光学的状態を模式的に表現している。第１の撮影６０１と第２の撮影６０２は撮像装置１００によって実際に行われる撮影であるが、第３の撮影６０３は実際に行われる撮影でないこと（仮想的な撮影）に注意を要する。 The luminous flux 115 and the luminous flux 116 will be explained by comparing the first photographing 601 and the second photographing 602. Compared to the first photograph 601, in the second photograph 602, the light rays passing through a position away from the center of the lens 101a are partially blocked by the aperture. The third photograph 603 schematically represents the optical state when an A image and a B image equivalent to the A _difference and the B _difference are acquired. Although the first imaging 601 and the second imaging 602 are imaging actually performed by the imaging apparatus 100, it should be noted that the third imaging 603 is not an actual imaging (virtual imaging).

第３の撮影６０３では、第１の撮影６０１のときの光束１１５と光束１１６のうち、レンズ１０１ａの中心から離れた位置を通過する光線だけがそれぞれＡ_差とＢ_差の生成に寄与することが分かる。Ａ像およびＢ像のボケ量６０４は小さく、また像ズレ量６０５は大きくなる。したがって像ズレ量の計算をより高精度に行うことができる。 In the third photographing 603, of the light beams 115 and 116 from the first photographing 601, only the light beams passing through a position away from the center of the lens 101a contribute to the generation of the A _difference and the B _difference , respectively. I understand. The amount of blur 604 of the A image and the B image is small, and the amount of image shift 605 is large. Therefore, the amount of image shift can be calculated with higher precision.
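The effect described above can be checked with a toy one-dimensional model. The sketch assumes a defocused point source whose light spreads uniformly over a run of pixels proportional to the aperture radius, with image A spreading to one side of the nominal position and image B to the other. The uniform profile, the pixel counts, and all names are assumptions made for illustration; real blur profiles and exposure differences are ignored.

```python
def half_pupil_blur(width, radius, side):
    """1-D blur of a defocused point seen through one half of the pupil.

    side=+1 spreads the light to the right of the center pixel (image A here),
    side=-1 to the left (image B). `radius` stands in for the aperture size.
    """
    center = width // 2
    row = [0.0] * width
    for k in range(1, radius + 1):
        row[center + side * k] = 1.0
    return row


def centroid(row):
    """Intensity-weighted mean position of a row."""
    return sum(i * v for i, v in enumerate(row)) / sum(row)


def blur_width(row):
    """Number of pixels that received any light."""
    return sum(1 for v in row if v > 0)


WIDTH, R_LARGE, R_SMALL = 41, 12, 8  # illustrative values

a_large = half_pupil_blur(WIDTH, R_LARGE, +1)
b_large = half_pupil_blur(WIDTH, R_LARGE, -1)
a_small = half_pupil_blur(WIDTH, R_SMALL, +1)
b_small = half_pupil_blur(WIDTH, R_SMALL, -1)

# Difference images keep only the light from the outer part of the pupil.
a_diff = [x - y for x, y in zip(a_large, a_small)]
b_diff = [x - y for x, y in zip(b_large, b_small)]

shift_large = centroid(a_large) - centroid(b_large)
shift_diff = centroid(a_diff) - centroid(b_diff)

print("blur width:", blur_width(a_large), "->", blur_width(a_diff))
print("image shift:", shift_large, "->", shift_diff)
```

In this model the blur narrows from 12 pixels to 4 while the separation between the two images grows from 13 to 21 pixels, which matches the qualitative behavior of the blur amount 604 and the image shift amount 605.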

図９は、対応箇所を精度よく算出できるときのＡ画像とＢ画像の例を示す。Ａ画像７０１とＢ画像７０２はそれぞれＡ_差とＢ_差から生成された画像であり、図６のＡ画像４０１やＢ画像４０２と比べて鮮明な画像である。Ａ画像７０１とＢ画像７０２との間の像ズレ量７０３を示す。Ａ画像７０１およびＢ画像７０２はいずれも、レンズ１０１ａの中心から離れた位置を通過する光線だけから生成される画像であり、像ズレ量７０３は像ズレ量３０３と比べて大きくなっている。したがって、被写体の像の対応箇所を、より高精度に算出することができる。 FIG. 9 shows an example of images A and B when corresponding locations can be calculated with high accuracy. The A image 701 and the B image 702 are images generated from the A _difference and the B _difference, respectively, and are clearer than the A image 401 and the B image 402 in FIG. The amount of image shift 703 between the A image 701 and the B image 702 is shown. Both the A image 701 and the B image 702 are images generated only from light rays passing through a position far from the center of the lens 101a, and the image shift amount 703 is larger than the image shift amount 303. Therefore, the corresponding locations in the image of the subject can be calculated with higher precision.

図８にて複数回の撮影では、シャッタースピードやＩＳＯ感度等の絞り量（Ｆ値）以外の撮影条件を揃えることが好ましいが、レンズ１０１ａの中心付近を通過する光線によって生成されるＡ画像同士、Ｂ画像同士で条件が同じであればよい。例えば、第１および第２の撮影のうちでシャッタースピードが速い方の撮影に対しては、ＩＳＯ感度を上げる設定等により、画像の明るさを補うための露出制御が行われる。 For the multiple shots in FIG. 8, it is preferable to match the shooting conditions other than the aperture amount (F-number), such as shutter speed and ISO sensitivity. It is sufficient, however, that the conditions are the same between the A images, and between the B images, as far as the components generated by light rays passing near the center of the lens 101a are concerned. For example, for whichever of the first and second shots uses the faster shutter speed, exposure control is performed to compensate for image brightness, for instance by raising the ISO sensitivity.
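As a rough illustration of such exposure compensation, the sketch below assumes the usual reciprocity rule that, at a fixed aperture, recorded brightness scales with shutter time multiplied by ISO sensitivity. The source does not give a formula, so the rule, the function name `compensating_iso`, and the numbers are assumptions.

```python
def compensating_iso(reference_iso, reference_shutter_s, new_shutter_s):
    """ISO needed at `new_shutter_s` to match the brightness obtained with
    `reference_iso` at `reference_shutter_s`, aperture contribution aside.

    Assumes brightness is proportional to shutter time x ISO.
    """
    if reference_shutter_s <= 0 or new_shutter_s <= 0:
        raise ValueError("shutter times must be positive")
    return reference_iso * (reference_shutter_s / new_shutter_s)


# First shot: ISO 100 at 1/60 s. Second shot uses the faster 1/120 s shutter.
iso_for_fast_shot = compensating_iso(100, 1 / 60, 1 / 120)
print(iso_for_fast_shot)
```

Halving the shutter time doubles the required ISO, so that the light passing near the lens center is recorded at the same level in both shots and cancels in the difference.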

Although a single lens is illustrated as the lens 101a to simplify the explanation, an actual imaging optical system often consists of multiple lenses. In that case, for light emitted from the subject that passes through positions away from the central axis of the lens 101a, so-called vignetting occurs, in which the extent of the light beam is determined by the lens 101a.

FIG. 10 is a diagram schematically showing how vignetting occurs. It shows a configuration example with a front first lens 101a_1 and a rear second lens 101a_2, photographing a subject located off the central axis of the lens 101a. The extent of the light beam 115 is determined by the aperture 101b, whereas the extent of the light beam 116 is determined by the outer edge 801 of the second lens 101a_2. In this situation, the B image does not change even if the aperture 101b is changed between shots. When the difference between images taken at different settings of the aperture 101b is calculated, the A image becomes sharp but the B image vanishes. Because corresponding points cannot then be searched for using A_difference and B_difference, the control unit 107 executes the corresponding-point search using A_difference and B_{large aperture}, or A_difference and B_{small aperture}.

FIG. 11 is a flowchart illustrating the processing of this embodiment. The processing from S901 to S910 is realized by the CPU of the control unit 107 executing a program. In S901, the processing starts when the user gives an instruction to start the high-precision distance acquisition mode.

FIG. 12(A) is a diagram showing an interface for instructing the start of the high-precision distance acquisition mode, and shows a display example on the display unit 106 of the imaging device 100. The display unit 106 displays an operation display area 1001 on the screen. The user instructs the start of the high-precision distance acquisition mode by operating the operation unit 105 and selecting "Start" in the operation display area 1001. When the display unit 106 includes a touch panel, the user makes the selection by touching, with a finger, the part of the operation display area 1001 where "Start" is displayed. When the processing of the high-precision distance acquisition mode starts, display processing is executed to notify the user of precautions for shooting.

FIG. 12(B) shows an example of the notification of precautions for shooting. The display unit 106 presents a notification display 1101 to the user, calling attention to keeping the camera still during shooting. The reason for this alert is that two or more shots must be taken while changing the aperture amount of the aperture 101b, so the image must not move between those shots because of camera shake or the like. Methods for alerting the user include audio output from a speaker and tactile feedback from a haptic device. If the imaging device 100 has a function for correcting image blur caused by camera shake or the like, setting that function to ON applies image blur correction and relaxes the constraints on the user during shooting. In addition, if the shooting operation that changes the aperture amount of the aperture 101b can be performed at high speed, image movement due to camera shake or the like is reduced.

In S902 of FIG. 11, the imaging device 100 shoots with the aperture amount of the aperture 101b set to the first aperture amount. The resulting A image and B image data are stored in the storage unit 103 as A_{large aperture} and B_{large aperture}, respectively. Next, in S903, the imaging device 100 shoots with the aperture amount of the aperture 101b set to a second aperture amount smaller than the first. The resulting A image and B image data are stored in the storage unit 103 as A_{small aperture} and B_{small aperture}, respectively. For example, a large effect is often obtained when the imaging device 100 shoots with the aperture 101b wide open in S902 and stopped down by one stop from wide open in S903. The reason is that the image shift amount can be increased by using rays that pass as far as possible from the center of the lens 101a. After S903, the process advances to S904.

In S904, the control unit 107 calculates the difference between the image data acquired in S902 and S903 based on (Equation 1), generating A_difference and B_difference. One point to note is how to handle the case where the image moves between the shots in S902 and S903 because of camera shake or subject motion. In that case, the control unit 107 compares A_{large aperture} with A_{small aperture}, or B_{large aperture} with B_{small aperture}, finds corresponding points to calculate the image motion, and shifts the image by the amount of that motion. For example, suppose that the location in A_{small aperture} corresponding to a feature in A_{large aperture} has moved 10 pixels to the right. The control unit 107 replaces A_{small aperture} with image data shifted 10 pixels to the left. Because this makes A_{small aperture} 10 pixels narrower, the rightmost 10 pixel columns of A_{large aperture} are deleted so that the two images have the same size. The resulting A_difference and B_difference are then both 10 pixels narrower.
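The alignment and subtraction described for S904 can be sketched as follows. This is a minimal illustration in plain Python, not the patent's implementation: the function name `align_and_subtract`, the list-of-rows image representation, and the restriction to a non-negative horizontal shift are assumptions, and (Equation 1) is assumed here to be the pixel-wise difference of the large-aperture image minus the small-aperture image.

```python
def align_and_subtract(large, small, shift):
    """Return (large - small) after undoing a rightward motion of `shift` pixels.

    `large` and `small` are images given as lists of rows of equal size.
    `shift` is how many pixels the content of `small` has moved to the right
    relative to `large` (hypothetical parameter; must be >= 0 in this sketch).
    """
    if shift < 0:
        raise ValueError("this sketch only handles a non-negative shift")
    width = len(large[0])
    diff = []
    for row_large, row_small in zip(large, small):
        # Shifting `small` to the left drops its first `shift` columns.
        aligned_small = row_small[shift:]
        # Drop the rightmost `shift` columns of `large` so both widths match.
        cropped_large = row_large[:width - shift]
        diff.append([a - b for a, b in zip(cropped_large, aligned_small)])
    return diff


# Example: the content of `small` sits 2 pixels to the right of `large`.
example_large = [[5, 6, 7, 8, 9]]
example_small = [[0, 0, 1, 2, 3]]
example_diff = align_and_subtract(example_large, example_small, 2)
```

The output is narrower than the inputs by `shift` columns, matching the 10-pixel reduction in the example above.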

In S905, the control unit 107 starts loop processing over windows. In this embodiment, window matching is used to search for corresponding points between the A image and the B image, but feature point matching or the like may be used instead. Next, in S906, the control unit 107 determines the combination of A image and B image. A specific example is described with reference to FIG. 13.

FIG. 13 is a diagram schematically showing a plurality of regions and a window within an image. Image 1201 is an A image or a B image, and the window of interest 1202 within image 1201 is shown. The position of the window of interest 1202 is changed in the loop processing. Image 1201 is divided horizontally into three: a first region 1203 including the center, a second region 1204 on the left, and a third region 1205 on the right. In the first region 1203, the subject is at a position corresponding to the vicinity of the center of the lens 101a, so for light emitted from the subject the extents of both light beams 115 and 116 are determined by the aperture 101b. Therefore, A_difference is used as the A image and B_difference as the B image. In the second region 1204, by contrast, the subject is away from the position corresponding to the center of the lens 101a, so vignetting by the lens 101a occurs in the light beam 116. In this case, A_difference is used as the A image, and B_{large aperture} or B_{small aperture} is used as the B image. Conversely, in the third region 1205, vignetting by the lens 101a occurs in the light beam 115. In this case, A_{large aperture} or A_{small aperture} is used as the A image, and B_difference is used as the B image. In the example of FIG. 13, the window of interest 1202 is within the first region 1203, so the control unit 107 selects A_difference as the A image and B_difference as the B image and executes window matching. The widths of regions 1203, 1204, and 1205 can be determined when designing the imaging device 100 from the relationship between the light beams and vignetting by the lens 101a. After S906 in FIG. 11, the process advances to S907.
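The region-dependent choice made in S906 can be illustrated with a short sketch. The function name `select_pair`, the string labels, and the 25 % side-region widths are hypothetical; the text only says that the widths are fixed at design time. Where the text allows either the large-aperture or the small-aperture image on the vignetted side, this sketch arbitrarily picks the large-aperture one.

```python
def select_pair(window_center_x, image_width, left_frac=0.25, right_frac=0.25):
    """Return (A image label, B image label) for a window centered at x.

    left_frac / right_frac are assumed widths of regions 1204 and 1205
    as fractions of the image width (illustrative values only).
    """
    left_edge = image_width * left_frac
    right_edge = image_width * (1.0 - right_frac)
    if window_center_x < left_edge:
        # Region 1204: light beam 116 is vignetted, so B_difference is unusable.
        return ("A_difference", "B_large_aperture")
    if window_center_x >= right_edge:
        # Region 1205: light beam 115 is vignetted, so A_difference is unusable.
        return ("A_large_aperture", "B_difference")
    # Region 1203: both beams are limited by the aperture.
    return ("A_difference", "B_difference")
```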

In S907, the control unit 107 executes window matching on the window of interest in the A image and B image determined in S906, and searches for corresponding points between the images. Next, in S908, the control unit 107 calculates the image shift amount between the A image and the B image from the corresponding points found in S907. After S908, the process advances to S909.
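As an illustration of S907 and S908, the sketch below searches one image row for the horizontal shift that best matches a window. The text does not specify the matching cost; the sum of absolute differences (SAD) used here, the function name `find_shift`, and the one-dimensional simplification are all assumptions.

```python
def find_shift(row_a, row_b, start, size, max_shift):
    """Return the shift (in pixels) of window row_a[start:start+size] within row_b.

    Candidate shifts from -max_shift to +max_shift are scored with the sum
    of absolute differences; the lowest-cost shift is returned.
    """
    window = row_a[start:start + size]
    best_shift, best_cost = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        pos = start + shift
        if pos < 0 or pos + size > len(row_b):
            continue  # candidate window would fall outside row_b
        candidate = row_b[pos:pos + size]
        cost = sum(abs(a - b) for a, b in zip(window, candidate))
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


# Example rows: the same small peak appears 3 pixels further right in row_b.
example_row_a = [0, 0, 0, 1, 5, 1, 0, 0, 0, 0, 0, 0]
example_row_b = [0, 0, 0, 0, 0, 0, 1, 5, 1, 0, 0, 0]
example_shift = find_shift(example_row_a, example_row_b, start=2, size=5, max_shift=4)
```

A sharper input with a larger true shift, as obtained from the difference images, gives a more distinct cost minimum in a search of this kind.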

In S909, the control unit 107 calculates the distance from the image shift amount calculated in S908. One way to do this is to use a reference table. The storage unit 103 stores a reference table whose data represent the relationship between image shift amount and distance as a function of the focal length of the lens 101a, the F-number of the aperture 101b, the subject distance, the focusing distance, and the position of the subject in the image. For example, let the focal length of the lens 101a be 50 mm, the F-numbers of the aperture 101b be 1.2 and 1.4, the subject distance be 1.5 m, the focusing distance be 1.0 m, and the position of the subject in the image be within region 1203 (FIG. 13). According to the reference table, the image shift amount between A_difference and B_difference is then 5 pixels. In other words, a subject distance of 1.5 m is obtained for an image shift of 5 pixels. For parameter values that are not in the reference table, the result can be calculated from nearby table entries by known processing such as interpolation or extrapolation.
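The table lookup in S909 might look like the following sketch. It assumes a single table for one fixed combination of focal length, F-numbers, focusing distance, and image region, and it uses linear interpolation and extrapolation as the "known processing". Only the pair (5 pixels, 1.5 m) comes from the text; the other entries in `EXAMPLE_TABLE`, and the names `EXAMPLE_TABLE` and `distance_from_shift`, are made up for illustration.

```python
from bisect import bisect_left

# (image shift in pixels, subject distance in metres), sorted by shift.
# Only (5.0, 1.5) is taken from the text; the other rows are invented.
EXAMPLE_TABLE = [(0.0, 1.0), (5.0, 1.5), (8.0, 2.0)]


def distance_from_shift(shift_px, table=EXAMPLE_TABLE):
    """Look up the distance for shift_px, interpolating between table rows.

    The table needs at least two rows. Values outside the table range are
    linearly extrapolated from the nearest two rows.
    """
    shifts = [s for s, _ in table]
    i = bisect_left(shifts, shift_px)
    if i < len(table) and shifts[i] == shift_px:
        return table[i][1]  # exact table hit
    # Clamp so that rows i-1 and i always exist.
    i = min(max(i, 1), len(table) - 1)
    (s0, d0), (s1, d1) = table[i - 1], table[i]
    return d0 + (d1 - d0) * (shift_px - s0) / (s1 - s0)
```

A real table would be indexed by the other parameters as well, and the true relationship between shift and distance is not linear, so the spacing of table entries matters.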

The process advances from S909 to S910, where the control unit 107 determines whether processing has been completed for all windows. If so, the process exits the loop and the series of processing ends. If an unprocessed window remains, the process returns to S906 and continues.

[First modified embodiment]
This modified embodiment describes processing for dealing with noise. The description above gave an example in which two shots are taken with different aperture amounts of the aperture 101b, yielding one image each of A_{large aperture}, A_{small aperture}, B_{large aperture}, and B_{small aperture}. The smaller the change in the aperture opening, the narrower the portions of the light beams 115 and 116, passing far from the center of the lens 101a, whose images remain in A_difference and B_difference. However, if the change in the aperture opening is too small, image data of almost the same brightness are subtracted from each other. As a result, the difference image (A_difference or B_difference) becomes dark and noise becomes conspicuous, which may make the search for corresponding points difficult.

Therefore, the imaging device 100 takes two or more shots at each aperture amount of the aperture 101b and adds the resulting image data. This yields bright difference image data A_difference and B_difference with reduced noise. The image data from the i-th shot are denoted A_{large aperture, i}, A_{small aperture, i}, B_{large aperture, i}, and B_{small aperture, i}. A_difference and B_difference can be calculated from (Equation 2).
The variable N in (Equation 2) represents the number of shots at each aperture amount of the aperture 101b, so 2×N shots are taken in total. If N is too large, however, shooting takes longer and the image may move more during that time. It is therefore preferable to photograph a stationary subject with the imaging device 100 fixed on a tripod or the like.
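(Equation 2) itself is not reproduced in this text. A form consistent with the description, in which the N images at each aperture amount are added and the sums are then subtracted, would be the following; it is a reconstruction offered as an assumption, not the patent's printed equation.

```latex
\begin{aligned}
A_{\text{difference}} &= \sum_{i=1}^{N} A_{\text{large aperture},\,i} - \sum_{i=1}^{N} A_{\text{small aperture},\,i} \\
B_{\text{difference}} &= \sum_{i=1}^{N} B_{\text{large aperture},\,i} - \sum_{i=1}^{N} B_{\text{small aperture},\,i}
\end{aligned}
```

With N = 1 this reduces to a single subtraction per viewpoint, as in the basic embodiment.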

Because noise in the generated A_difference and B_difference images is reduced, this modified embodiment allows corresponding points in the images to be searched for with higher accuracy.

[Second modified embodiment]
This modified embodiment shows an example of shooting while changing the aperture amount three or more times. The embodiment above described the case where vignetting occurs in one of the light beams 115 and 116. In that example, a normally captured image was used for the viewpoint whose light beam was vignetted, a difference image was used for the viewpoint whose light beam was not vignetted, and corresponding points were searched for between them. The imaging device of this modified embodiment shoots while changing the aperture amount of the aperture 101b three or more times. The control unit 107 generates difference images for both the A image and the B image and then searches for corresponding points. For example, the aperture amounts of the aperture 101b are denoted large aperture, medium aperture, and small aperture. As explained with reference to FIG. 10, the extent of the light beam 115 changes when going from large aperture to medium aperture, but the light beam 116 does not change.

FIG. 14 shows the relationship between the aperture 101b and the light beams when the aperture amount of the aperture 101b is set to small aperture. The opening of the aperture 101b is now small enough to limit the extent of the light beam 116, so vignetting of the rays no longer occurs. The light beam 115, on the other hand, is completely blocked by the aperture 101b, so no A image is generated. Therefore, as shown in (Equation 3), difference image data for the A image are generated from the difference between the large-aperture and medium-aperture images. For the B image, difference image data are generated from the difference between the large-aperture and small-aperture images, or between the medium-aperture and small-aperture images.
For example, A_difference is generated from the difference between A image data captured at F-number 1.2 and F-number 1.4, and B_difference is generated from the difference between B image data captured at F-number 1.2 and F-number 2.0.
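(Equation 3) is likewise not reproduced in this text. Based on the combinations described for FIG. 14, it can be reconstructed, as an assumption, in the following form, where the medium-aperture subscript is notation introduced here by analogy with the existing ones.

```latex
\begin{aligned}
A_{\text{difference}} &= A_{\text{large aperture}} - A_{\text{medium aperture}} \\
B_{\text{difference}} &= B_{\text{large aperture}} - B_{\text{small aperture}}
\quad\text{or}\quad
B_{\text{difference}} = B_{\text{medium aperture}} - B_{\text{small aperture}}
\end{aligned}
```

In the F-number example, large, medium, and small aperture correspond to F1.2, F1.4, and F2.0, respectively.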

With the above processing, a sharper B image is obtained than when a normally captured image is used as the B image, so corresponding points in the images can be searched for with higher accuracy. The subject distance can be calculated accurately, and applying it to 3D mapping of the subject makes it possible to generate finer 3D objects. Applying the above processing to the autofocus function of the imaging device also allows more accurate focusing on the subject.

[Third modified embodiment]
The description above gave examples in which, for every image in the frame, at least one of the A image and the B image is a difference image based on images captured with different aperture amounts of the aperture 101b. This modified embodiment shows an example in which normal processing (processing that uses viewpoint images rather than difference images) is applied to subjects near the in-focus position and to dark difference images.

For a subject near the in-focus position, even a normally captured image has little blur, so the difference-image method may not be needed. The imaging device of this modified embodiment therefore uses the A image and the B image for a first subject whose depth information is within a predetermined range (for example, near the in-focus position). For a second subject whose depth information differs from that of the first subject, at least one of A_difference and B_difference is used, as in the embodiment above. This simplifies the calculation of the image shift amount.

In addition, if the A_difference or B_difference image is dark and contains a lot of noise, the calculation of the image shift amount may become unstable. In such a case, it is effective to calculate the image shift amount using the A image and the B image. That is, when the control unit 107 determines that A_difference or B_difference is below a threshold, it calculates the image shift amount using the A image and the B image; when it determines that A_difference or B_difference is at or above the threshold, it calculates the image shift amount using at least one of A_difference and B_difference.
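The fallback rule above can be sketched as follows. The text does not say which quantity of the difference image is compared with the threshold; the mean pixel value used here is an assumption, as are the function names `mean_level` and `choose_inputs` and the returned labels.

```python
def mean_level(image):
    """Mean pixel value of an image given as a list of rows."""
    total = sum(sum(row) for row in image)
    count = sum(len(row) for row in image)
    return total / count


def choose_inputs(a_diff, b_diff, threshold):
    """Return "viewpoint" if either difference image is too dark, else "difference".

    "viewpoint" means the ordinary A and B images are used for matching;
    "difference" means at least one difference image is used.
    """
    if mean_level(a_diff) < threshold or mean_level(b_diff) < threshold:
        return "viewpoint"
    return "difference"
```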

When detecting the image shift amount between images, the greater the separation between the two light beams on the lens pupil, the larger the image shift amount and the higher the accuracy of the calculated distance to the subject. It is therefore preferable to generate the A image and the B image from two widely separated light beams. This can be achieved by opening the lens aperture wider, but doing so reduces the depth of field, so images of subjects outside the in-focus range become blurred. Corresponding points must then be searched for between a blurred A image and a blurred B image, which makes the image shift amount difficult to detect.

In the embodiment described above, multiple shots are taken with different aperture amounts and difference images are generated, so that corresponding points are searched for using images with little blur and a large image shift amount. This allows more accurate distance measurement (acquisition of depth information) for subjects located away from the in-focus position. Although an example was shown in which each of the two sub-pixels corresponding to a microlens in the pixel section has a photoelectric conversion unit, the present invention is also applicable to embodiments in which the pixel section has photoelectric conversion units divided into three or more.

[Other embodiments]
The present invention can also be realized by processing in which a program that implements one or more functions of the embodiments described above is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of that system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more of the functions.

Embodiments of the present disclosure include the following configurations, methods, and programs.
(Configuration 1)
An image processing device capable of acquiring depth information in the depth direction of a subject in an image from a plurality of viewpoint images having different viewpoints, the image processing device comprising:
acquisition means for acquiring a plurality of images based on light passing through different pupil partial regions in the imaging optical system;
control means for performing control to change the aperture amount of the diaphragm of the imaging optical system, and control to acquire the depth information using a plurality of images captured with different aperture amounts of the diaphragm,
wherein the acquisition means acquires first and second viewpoint images captured with a first aperture amount, and third and fourth viewpoint images captured with a second aperture amount smaller than the first aperture amount, and
wherein the control means generates a first difference image from the first viewpoint image and the third viewpoint image, generates a second difference image from the second viewpoint image and the fourth viewpoint image, and performs control to acquire the depth information using one or more of the first and second difference images.
(Configuration 2)
The image processing device according to configuration 1, wherein the control means performs control to acquire the depth information using the first and second difference images in a first region including a central part of the image.
(Configuration 3)
The image processing device according to configuration 2, wherein, in a second region different from the first region, the control means performs control to acquire the depth information using the first difference image and the second or fourth viewpoint image, or using the second difference image and the first or third viewpoint image.
(Configuration 4)
The image processing device according to any one of configurations 1 to 3, wherein the control means performs control to capture images two or more times each time the aperture amount is changed, and
the acquisition means acquires a plurality of the first and second viewpoint images captured with the first aperture amount and a plurality of the third and fourth viewpoint images captured with the second aperture amount.
(Configuration 5)
The image processing device according to configuration 4, wherein the control means performs control to generate the first difference image based on the difference between the plurality of first viewpoint images and the plurality of third viewpoint images, and to generate the second difference image based on the difference between the plurality of second viewpoint images and the plurality of fourth viewpoint images.
(Configuration 6)
The image processing device according to any one of configurations 1 to 3, wherein, when generating the first or second difference image using a plurality of images captured with different aperture amounts of the diaphragm, the control means changes the images used to acquire the depth information according to the combination of aperture amounts.
(Configuration 7)
The image processing device according to configuration 6, wherein the acquisition means further acquires fifth and sixth viewpoint images captured with a third aperture amount smaller than the second aperture amount, and
the control means performs control to generate the second difference image from the second viewpoint image and the sixth viewpoint image, or from the fourth viewpoint image and the sixth viewpoint image.
(Configuration 8)
The image processing device according to any one of configurations 1 to 7, wherein the control means performs control to acquire the depth information for a first subject using the first and second viewpoint images or the third and fourth viewpoint images, and performs control to acquire the depth information for a second subject, whose depth information differs from that of the first subject, using one or more of the first and second difference images.
(Configuration 9)
The image processing device according to any one of configurations 1 to 7, wherein the control means performs control to acquire the depth information using the first and second viewpoint images or the third and fourth viewpoint images when data of the first or second difference image is less than a threshold, and performs control to acquire the depth information using one or more of the first and second difference images when the data of the first or second difference image is equal to or greater than the threshold.
(Configuration 10)
The image processing device according to any one of configurations 1 to 9, wherein the acquisition means acquires the plurality of viewpoint images, each corresponding to one of the photoelectric conversion units, using an image sensor including a plurality of microlenses and a plurality of photoelectric conversion units corresponding to each microlens.
(Configuration 11)
An imaging device comprising:
the image processing device according to any one of configurations 1 to 10; and
an imaging element.
(Configuration 12)
The imaging device according to configuration 11, wherein a pixel section of the imaging element includes a microlens and a plurality of photoelectric conversion units corresponding to the microlens, and
the plurality of viewpoint images are acquired from images corresponding to the respective photoelectric conversion units.
(Configuration 13)
The imaging device according to configuration 11 or configuration 12, further comprising a notification unit that provides notification for suppressing movement of the imaging device when a mode for acquiring the depth information is set.
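The selection logic of configurations 5 and 9 can be illustrated with a minimal Python sketch. It is not the implementation disclosed in this publication. The function names `mean_stack` and `select_pair` are hypothetical, averaging is only one assumed way to combine repeated captures, the "data" of a difference image is assumed here to be its mean absolute value, and the images are assumed to be exposure-matched and aligned so that subtraction is meaningful.

```python
import numpy as np


def mean_stack(frames):
    """Average repeated captures of one viewpoint at one aperture amount.

    Averaging is an assumption; the text only says that a plurality of
    images is used to form each difference image.
    """
    stack = np.stack([np.asarray(f, dtype=np.float64) for f in frames])
    return stack.mean(axis=0)


def select_pair(first, second, third, fourth, threshold):
    """Choose the image pair used for the correspondence search.

    first/second: viewpoint images at the first (larger) aperture amount.
    third/fourth: viewpoint images at the second (smaller) aperture amount.
    Returns a label and the two images to be matched.
    """
    first, second, third, fourth = (
        np.asarray(x, dtype=np.float64) for x in (first, second, third, fourth)
    )
    diff1 = first - third    # first difference image
    diff2 = second - fourth  # second difference image

    # Assumed measure of the "data" of a difference image: mean absolute value.
    # If either difference image is too weak, fall back to viewpoint images.
    level = min(np.mean(np.abs(diff1)), np.mean(np.abs(diff2)))
    if level < threshold:
        # The third/fourth pair would be an equally permitted fallback.
        return "viewpoint", first, second
    return "difference", diff1, diff2
```

In this sketch a caller would first pass each burst of repeated captures through `mean_stack`, then hand the four averaged images to `select_pair`. The threshold value is left to the caller because the text does not specify one.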
(Method 1)
An image processing method executed by an image processing device capable of acquiring depth information in a depth direction of a subject in an image from a plurality of viewpoint images having different viewpoints, the method comprising:
an acquisition step of acquiring a plurality of images based on light respectively passing through different pupil partial regions of an imaging optical system; and
a control step of performing control to change the aperture amount of a diaphragm of the imaging optical system, and control to acquire the depth information using a plurality of images captured with different aperture amounts of the diaphragm,
wherein, in the acquisition step, first and second viewpoint images captured with a first aperture amount and third and fourth viewpoint images captured with a second aperture amount smaller than the first aperture amount are acquired, and
in the control step, a first difference image is generated from the first viewpoint image and the third viewpoint image, a second difference image is generated from the second viewpoint image and the fourth viewpoint image, and control is performed to acquire the depth information using one or more of the first and second difference images.
(Program)
A program that causes a computer to execute each step described in Method 1.
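The steps of Method 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: `difference_image`, `horizontal_shift`, and `shift_from_apertures` are hypothetical names, the images are assumed to be exposure-matched and aligned, and the correspondence search is reduced to a single global horizontal shift found by minimizing the mean absolute difference. Converting the shift into a distance requires calibration that is not covered here.

```python
import numpy as np


def difference_image(wide, narrow):
    """Subtract the smaller-aperture image from the larger-aperture image
    of the same viewpoint (assumes matching exposure and alignment)."""
    wide = np.asarray(wide, dtype=np.float64)
    narrow = np.asarray(narrow, dtype=np.float64)
    if wide.shape != narrow.shape:
        raise ValueError("images must have the same shape")
    return wide - narrow


def horizontal_shift(left, right, max_shift):
    """Return the integer shift s in [-max_shift, max_shift] for which
    right[:, x + s] best matches left[:, x] (mean absolute difference)."""
    width = left.shape[1]
    if max_shift >= width:
        raise ValueError("max_shift must be smaller than the image width")
    best_shift, best_cost = 0, np.inf
    for s in range(-max_shift, max_shift + 1):
        # Compare only the columns that overlap for this shift.
        if s >= 0:
            a, b = left[:, :width - s], right[:, s:]
        else:
            a, b = left[:, -s:], right[:, :width + s]
        cost = np.mean(np.abs(a - b))
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift


def shift_from_apertures(first, second, third, fourth, max_shift):
    """Form the two difference images and search for their relative shift."""
    diff1 = difference_image(first, third)    # first difference image
    diff2 = difference_image(second, fourth)  # second difference image
    return horizontal_shift(diff1, diff2, max_shift)
```

A practical implementation would typically search per block or per pixel rather than for one global shift; the global version is used here only to keep the sketch short.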

100 Imaging device
101 Imaging unit
101b Diaphragm
102 Arithmetic unit
107 Control unit

Claims

An image processing device capable of acquiring depth information in the depth direction of a subject in an image from a plurality of viewpoint images having different viewpoints, the image processing device comprising:
acquisition means for acquiring a plurality of images based on light passing through different pupil partial regions of an imaging optical system; and
control means for performing control to change the aperture amount of a diaphragm of the imaging optical system, and control to acquire the depth information using a plurality of images captured with different aperture amounts of the diaphragm,
wherein the acquisition means acquires first and second viewpoint images captured with a first aperture amount, and third and fourth viewpoint images captured with a second aperture amount smaller than the first aperture amount, and
the control means generates a first difference image from the first viewpoint image and the third viewpoint image, generates a second difference image from the second viewpoint image and the fourth viewpoint image, and performs control to acquire the depth information using one or more of the first and second difference images.

The image processing device according to claim 1, wherein the control means performs control to acquire the depth information using the first and second difference images in a first region including the center of the image.

The image processing device according to claim 2, wherein, in a second region different from the first region, the control means performs control to acquire the depth information using the first difference image and the second or fourth viewpoint image, or using the second difference image and the first or third viewpoint image.

The image processing device according to claim 1, wherein the control means performs control to capture images two or more times each time the aperture amount is changed, and
the acquisition means acquires a plurality of the first and second viewpoint images captured with the first aperture amount and a plurality of the third and fourth viewpoint images captured with the second aperture amount.

The image processing device according to claim 4, wherein the control means performs control to generate the first difference image based on the difference between the plurality of first viewpoint images and the plurality of third viewpoint images, and to generate the second difference image based on the difference between the plurality of second viewpoint images and the plurality of fourth viewpoint images.

The image processing device according to claim 1, wherein, when generating the first or second difference image using a plurality of images captured with different aperture amounts of the diaphragm, the control means changes the image used to acquire the depth information according to the combination of the aperture amounts.

The image processing device according to claim 6, wherein the acquisition means further acquires fifth and sixth viewpoint images captured with a third aperture amount smaller than the second aperture amount, and
the control means performs control to generate the second difference image from the second viewpoint image and the sixth viewpoint image, or from the fourth viewpoint image and the sixth viewpoint image.

The control means performs control to acquire the depth information of a first subject using the first and second viewpoint images or the third and fourth viewpoint images, and performs control to acquire the depth information of a second subject, whose depth information differs from that of the first subject, using one or more of the first and second difference images. The image processing device described in .

The image processing device according to claim 1, wherein the control means performs control to acquire the depth information using the first and second viewpoint images or the third and fourth viewpoint images when data of the first or second difference image is less than a threshold, and performs control to acquire the depth information using one or more of the first and second difference images when the data of the first or second difference image is equal to or greater than the threshold.

The image processing device according to claim 1, wherein the acquisition means acquires the plurality of viewpoint images, each corresponding to one of the photoelectric conversion units, using an image sensor including a plurality of microlenses and a plurality of photoelectric conversion units corresponding to each microlens.

An imaging device comprising:
the image processing device according to any one of claims 1 to 10; and
an imaging element.

The imaging device according to claim 11, wherein a pixel section of the imaging element includes a microlens and a plurality of photoelectric conversion units corresponding to the microlens, and
the plurality of viewpoint images are acquired from images corresponding to the respective photoelectric conversion units.

The imaging device according to claim 11, further comprising a notification unit that provides notification for suppressing movement of the imaging device when a mode for acquiring the depth information is set.

An image processing method executed by an image processing device capable of acquiring depth information in a depth direction of a subject in an image from a plurality of viewpoint images having different viewpoints, the method comprising:
an acquisition step of acquiring a plurality of images based on light respectively passing through different pupil partial regions of an imaging optical system; and
a control step of performing control to change the aperture amount of a diaphragm of the imaging optical system, and control to acquire the depth information using a plurality of images captured with different aperture amounts of the diaphragm,
wherein, in the acquisition step, first and second viewpoint images captured with a first aperture amount and third and fourth viewpoint images captured with a second aperture amount smaller than the first aperture amount are acquired, and
in the control step, a first difference image is generated from the first viewpoint image and the third viewpoint image, a second difference image is generated from the second viewpoint image and the fourth viewpoint image, and control is performed to acquire the depth information using one or more of the first and second difference images.

A program that causes a computer to execute each step according to claim 14.