JP2019135839A

JP2019135839A - Image processing apparatus, method of controlling image processing apparatus, imaging apparatus, and program

Info

Publication number: JP2019135839A
Application number: JP2019041361A
Authority: JP
Inventors: 彰太山口; Shota Yamaguchi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-01-13
Filing date: 2019-03-07
Publication date: 2019-08-15
Anticipated expiration: 2036-11-22
Also published as: JP2017126979A; JP6797955B2; JP6494587B2

Abstract

To provide an image processing apparatus capable of performing highly convenient image recording while minimizing the amount of recorded data. SOLUTION: An image processing apparatus acquires a plurality of image data (moving image data or the like) and records the data on a recording medium. The image processing apparatus acquires depth distribution information of a subject corresponding to an image. A recording mode control unit 104 performs recording control by dynamically determining whether to record the image data in a first mode or a second mode. When recording is performed in the first mode, the recording mode control unit 104 performs control such that the image data and the depth distribution information corresponding to the image data are recorded on the recording medium by a recording unit 109; when recording is performed in the second mode, it performs control such that the image data is recorded on the recording medium by the recording unit 109. SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for controlling the recording of images and distance information.

There is a technique in which a user selects a preferred image from continuously acquired moving images and applies various kinds of image processing afterward to obtain a single still image that matches the user's preference. Patent Document 1 discloses a technique that determines whether a subject is in a predetermined state in continuously shot images and relatively increases the recording quality of the images in which the subject is in that state. Patent Document 2 discloses a technique that acquires a high-quality image by not performing frame compression when the shooting mode is changed to a super-resolution mode. Patent Document 3 discloses a technique that generates distance distribution information as additional information alongside the captured image so that blur in the captured image can be adjusted by image processing.

Patent Document 1: JP 2011-166391 A
Patent Document 2: JP 2007-189665 A
Patent Document 3: JP 2015-119416 A

However, the various kinds of image processing include processing that requires highly accurate extraction of the subject's outline. Examples are background blurring, which blurs the background electronically, and area-specific gradation processing, which obtains a high-quality dynamic-range-compressed image by applying gradation correction separately to the main subject and to the other areas. To realize these techniques, it is necessary not only to record the frame image data but also to acquire a distance map indicating the distance distribution within the angle of view (at least information from which the relative relationship of distances can be determined). In addition, area-specific gradation processing requires that the RAW image before image processing also be recorded.

On the other hand, recording distance map and RAW image data for every frame of a moving image in order to perform the above image processing makes the required recording capacity enormous and increases the burden on the user. It is therefore very important to keep the data amount of the frame images to the necessary minimum when acquiring information for post-processing.
An object of the present invention is to provide an image processing apparatus that can acquire captured images and additional information and can perform highly convenient image recording while suppressing the recording capacity.

An apparatus according to one embodiment of the present invention is an image processing apparatus including recording means for acquiring a plurality of image data and recording them on a recording medium, the apparatus comprising: acquisition means for acquiring depth distribution information of a subject corresponding to the image data; and control means for performing recording of the plurality of image data while switching between a first mode, in which the image data and the depth distribution information corresponding to the image data are recorded on the recording medium by the recording means, and a second mode, in which the image data is recorded on the recording medium by the recording means without recording the depth distribution information.

According to the present invention, highly convenient image recording can be performed while suppressing the recording capacity.

FIG. 1 is a block diagram showing the configuration of an imaging apparatus according to a first embodiment of the present invention.
FIG. 2 is a block diagram of the recording mode control unit in the first embodiment.
FIG. 3 is a flowchart showing the processing of the recording mode control unit in the first embodiment.
FIG. 4 is an explanatory diagram showing an example of a main subject area detection result.
FIG. 5 is a diagram illustrating distance histograms of the main subject area and the background area.
FIG. 6 is a diagram showing an example of an optical model indicating object distance and image distance.
FIG. 7 is a graph showing the relationship between the defocus amount and the object-side distance.
FIG. 8 is a block diagram of the image processing unit in the first embodiment.
FIG. 9 is a flowchart of the background image blurring process in the first embodiment.
FIG. 10 is a diagram showing an example of a screen notifying that background image blurring is possible.
FIG. 11 is a diagram explaining the processing of the distance map shaping unit.
FIG. 12 is a diagram explaining the processing of the in-focus subject extraction unit.
FIG. 13 is a diagram explaining the blurring process.
FIG. 14 is a block diagram of the recording mode control unit in a second embodiment of the present invention.
FIG. 15 is a flowchart showing the processing of the recording mode control unit in the second embodiment.
FIG. 16 is a diagram explaining the processing of the threshold calculation unit.
FIG. 17 is a block diagram of the image processing unit in the second embodiment.
FIG. 18 is a flowchart of the area-specific gradation correction process in the second embodiment.
FIG. 19 is a diagram showing an example of a screen notifying that area-specific gradation correction is possible.
FIG. 20 is a diagram explaining the gradation characteristic calculation method in the second embodiment.
FIG. 21 is a diagram explaining the compositing process.
FIG. 22 is a block diagram showing the configuration of an imaging apparatus according to a third embodiment of the present invention.
FIG. 23 is a flowchart showing the processing of the recording information adjustment unit in the third embodiment.
FIG. 24 is a diagram explaining the main subject score calculation process.
FIG. 25 is a diagram explaining the calculation of the main subject score threshold.
FIG. 26 is a block diagram showing the configuration of an imaging apparatus according to a fourth embodiment of the present invention.
FIG. 27 is a flowchart showing the processing of the exposure condition control unit in the fourth embodiment.
FIG. 28 is a diagram explaining the recording format of images and distance maps.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. Each embodiment describes an image processing apparatus that later performs, in response to a user instruction, image processing involving high-performance region extraction.

[First Embodiment]
The first embodiment of the present invention assumes a case in which, for a recorded sequence of consecutive frames, an image processing apparatus applies background blurring to an arbitrary frame that the user selects afterward. To that end, the present embodiment provides an image processing apparatus that can switch between a mode in which depth distribution information is recorded as additional information for the captured images and a mode in which it is not recorded. FIG. 1 is a block diagram illustrating a configuration applicable to an imaging apparatus as an example of the image processing apparatus of this embodiment.
The imaging optical system 101 includes a lens group such as a zoom lens and a focus lens, an aperture adjusting device, and a shutter device. The imaging optical system 101 adjusts the magnification, focus position, and light amount of the subject image that reaches the imaging unit 102. The imaging unit 102 includes a CCD (charge-coupled device) image sensor, a CMOS (complementary metal-oxide-semiconductor) image sensor, or the like, and photoelectrically converts the light beam from the subject that has passed through the imaging optical system 101 into an electrical signal. The A (analog) / D (digital) conversion unit 103 converts the input analog electrical signal into a digital image signal.

The recording mode control unit 104 has a plurality of recording modes and controls the amount of image information recorded by the recording unit 109. The image processing unit 105 performs various kinds of processing on the image signal output from the A/D conversion unit 103 and on image signals read from the recording unit 109. For example, it executes correction of distortion and noise caused by the imaging optical system, demosaicing, white balance adjustment, color conversion, and gamma correction. In addition to this predetermined image processing, the image processing unit 105 performs the background blurring assumed in the present embodiment. In the present embodiment, a RAW image is an image on which at least part of the image processing by the image processing unit 105, other than the correction of distortion and noise caused by the imaging optical system, has not been performed. This embodiment describes an example in which the image processing unit 105 performs reproduction processing, but a reproduction processing unit may be provided separately from the image processing unit 105.

The system control unit 106 is the central control unit that governs the operation of the entire imaging apparatus, and includes a CPU (central processing unit), memory, and the like. In accordance with operation instructions given by the user through the operation unit 107, the system control unit 106 controls the driving of the imaging optical system 101 and the imaging unit 102, the predetermined image processing in the image processing unit 105, and so on.

The display unit 108 is a liquid crystal display, an organic EL (electroluminescence) display, or the like, and displays images according to image signals generated by the imaging unit 102 or read from the recording unit 109. The recording unit 109 records image signals on a recording medium. According to the respective recording settings for still images and moving images, recording is performed in an encoding format for still images (for example, JPEG) or an encoding format for moving images (for example, H.264 or H.265). When still images or moving images are set to be recorded as RAW images before image processing, they are recorded uncompressed, losslessly compressed, or otherwise compressed at a low compression ratio that degrades image quality less than when recording image data after image processing. The recording medium is an information recording medium such as a memory card containing semiconductor memory or a package housing a rotating recording body such as a magneto-optical disk. The recording medium is detachable from the imaging apparatus.

The bus 110 is used to exchange signals among the recording mode control unit 104, the image processing unit 105, the system control unit 106, the display unit 108, and the recording unit 109.
The flow of processing in the present embodiment is described below, focusing on the recording mode control unit 104 and the image processing unit 105. The recording mode control unit 104 is the processing block characteristic of this embodiment. The image processing unit 105 is the processing block that blurs the background image in response to an operation instruction given after shooting.

The operation of the recording mode control unit 104 will be described with reference to FIGS. 2 and 3.
FIG. 2 is a block diagram of the recording mode control unit 104. The main subject area detection unit 201 acquires input image data and detects the main subject area. The main subject area is, for example, the image area of a subject selected from among a plurality of subjects. The distance map calculation unit 202 acquires input image data and calculates a distance map (depth distribution information). The main subject distance calculation unit 203 acquires the main subject area information detected by the main subject area detection unit 201 and the distance map calculated by the distance map calculation unit 202, and calculates the main subject distance, which is the distance from the imaging apparatus to the main subject. The background distance calculation unit 204 calculates the background distance from the background area information and the distance map. The background area is the area other than the main subject, and the background distance is the distance from the imaging apparatus to the background. The threshold calculation unit 205 acquires the main subject distance and calculates a threshold for the distance difference between the main subject and the background. The recording mode determination unit 206 acquires the main subject distance, the background distance, and the threshold, and determines the recording mode by comparing the difference between the main subject distance and the background distance with the threshold. The recording modes include at least a first mode, in which image data and distance information are recorded, and a second mode, in which image data is recorded. The present embodiment describes control that switches between the first and second modes when recording each frame of a moving image whose frames are image data acquired by imaging, but the control is not limited to this. For example, an embodiment that switches between the first and second modes for each capture in still image shooting is also included in the present invention. That is, any control that switches between the first mode and the second mode for a plurality of acquired image data suffices. Details of the processing of each unit are described later.

FIG. 3 is a flowchart explaining the processing of the recording mode control unit 104. The flowchart assumes that a moving image is being acquired, and that frame images are input continuously at a predetermined frame rate from the time the user instructs the start of shooting until the time the user instructs the end of the shooting operation.

First, in S301, the main subject area detection unit 201 detects the main subject area. A specific example is described with reference to FIG. 4, which shows an example of a detected main subject area; here the main subject area is detected using a face detection frame. When the main subject is not a person, the subject area is detected by general object detection; the detection method is not limited to any specific one. As shown in FIG. 4, the area of the whole image that is not the main subject area is defined as the background area.

Next, in S302, the distance map calculation unit 202 calculates a distance map. One method, described for example in Patent Document 3, is to pupil-divide the light from the subject to generate a plurality of viewpoint images (parallax images), calculate the amount of parallax, and thereby obtain depth distribution information of the subject. The depth distribution information of a subject includes data that expresses the distance from the camera serving as the imaging means to the subject (subject distance) as an absolute distance value, and data that indicates the relative distance relationship within the image data (image depth), such as a distribution of parallax amounts or a distribution of defocus amounts. The following description uses data expressed as distance values; when a distribution of parallax amounts or defocus amounts is used as the depth distribution information, each process is performed with the distance value replaced by the parallax amount or the defocus amount, respectively.
Depth distribution information can take various forms as information corresponding to the depth, in the depth direction, of each subject in the captured image. That is, the information indicated by data corresponding to the depth of a subject may either directly represent the subject distance from the imaging apparatus to the subject in the image, or represent the relative relationship of the distances (subject distances) or depths of the subjects in the image. For example, the imaging unit 102 is controlled so as to change the in-focus position, and a plurality of captured image data are acquired. Depth distribution information can then be obtained from the in-focus area of each captured image and the in-focus position information of that captured image. Alternatively, when the image sensor of the imaging unit 102 has a pupil-division pixel configuration, depth distribution information for each pixel can be obtained from the phase difference between a pair of image signals. Specifically, the image sensor converts into electrical signals the optical images formed by a pair of light beams passing through different pupil partial regions of the imaging optical system, and outputs paired image data from a plurality of photoelectric conversion units. The image shift amount of each area is calculated by a correlation operation between the paired image data, and an image shift map representing the distribution of image shift amounts is calculated. The image shift amount may further be converted into a defocus amount to generate a defocus map representing the distribution of defocus amounts (the distribution on the two-dimensional plane of the captured image). Converting this defocus amount into a subject distance based on the conditions of the imaging optical system and the image sensor yields distance map data representing the distribution of subject distances. In this way, image shift map data, defocus map data, or distance map data of subject distances converted from the defocus amounts can be acquired.
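The correlation operation between paired image data can be illustrated with a small sketch. The text does not specify the correlation measure, so the mean absolute difference used here, the function name `image_shift`, and the one-dimensional signals are all assumptions made for illustration only.

```python
def image_shift(signal_a, signal_b, max_shift):
    """Return the integer shift s (in samples) that minimises the mean
    absolute difference between signal_a[i] and signal_b[i + s].

    Hypothetical helper: the cost function is an assumption, not the
    correlation operation of the described apparatus.
    """
    n = len(signal_a)
    best_shift, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        # Restrict the comparison to indices valid in both signals.
        lo, hi = max(0, -s), min(n, n - s)
        if hi <= lo:
            continue
        cost = sum(abs(signal_a[i] - signal_b[i + s])
                   for i in range(lo, hi)) / (hi - lo)
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift


# Two line signals of equal length; b is a displaced by two samples.
a = [0, 0, 1, 5, 1, 0, 0, 0]
b = [0, 0, 0, 0, 1, 5, 1, 0]
shift = image_shift(a, b, max_shift=3)
```

Repeating such a search for each small area of the image would give an image shift map; converting the shifts to defocus amounts needs a coefficient that depends on the optical system and is not given in the text.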
The subject distance from the imaging apparatus to a subject in the image may also be acquired directly using the TOF (time-of-flight) method, which measures the distance to a subject by measuring the delay time from projecting light onto the subject until receiving the reflected light. In the TOF method, a light projecting means projects pulsed light onto the subject (target object), the imaging unit 102 receives the reflected light, and the flight time (delay time) of the pulsed light is measured to determine the subject distance (the distance to the target object) and thereby obtain depth distribution information.
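The relation between the measured delay time and the subject distance in the TOF method can be written down directly. The sketch below assumes the delay is the full round-trip time of the pulse and that propagation is at the vacuum speed of light; the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact by definition of the metre


def tof_distance_m(delay_s):
    """Subject distance in metres from the round-trip delay of a light pulse."""
    # The pulse travels to the subject and back, so halve the path length.
    return SPEED_OF_LIGHT_M_PER_S * delay_s / 2.0


# A round-trip delay of 20 ns corresponds to a subject about 3 m away.
distance = tof_distance_m(20e-9)
```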

In S303, the main subject distance calculation unit 203 calculates the main subject distance (or the parallax amount or defocus amount corresponding to the main subject) using the main subject area information and the distance map. In S304, the background distance calculation unit 204 calculates the background distance using the background area information and the distance map. A specific example of the processing in S303 and S304 is described with reference to FIG. 5, which illustrates how the main subject distance and the background distance are calculated. The horizontal axis represents distance along the optical axis of the imaging apparatus, and the vertical axis represents the frequency (count) of the distance histogram.

First, the main subject distance calculation unit 203 acquires the distance histogram within the main subject area shown in FIG. 5(A) and takes the distance corresponding to the peak with the maximum frequency as the main subject distance. FIG. 5(A) shows an example with two peaks. A peak exists at a distance other than the main subject distance because the boundary of the main subject area is defined by a rectangular frame, so part of the background image falls inside the main subject area.

Next, the background distance calculation unit 204 acquires the distance histogram within the background area shown in FIG. 5(B) and takes the distance corresponding to the peak with the maximum frequency as the background distance. FIG. 5(B) shows an example with three peaks.
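The histogram-peak selection of S303 and S304 can be sketched as follows. The bin width of 100 mm, the nested-list distance map, the boolean mask, and the function names are illustrative assumptions; the text only states that the distance at the maximum-frequency peak is used.

```python
from collections import Counter


def peak_distance(distances_mm, bin_width_mm=100):
    """Return the centre of the most populated histogram bin."""
    counts = Counter(int(d // bin_width_mm) for d in distances_mm)
    best_bin, _ = max(counts.items(), key=lambda kv: kv[1])
    return (best_bin + 0.5) * bin_width_mm


def subject_and_background_distance(distance_map, subject_mask, bin_width_mm=100):
    """Split the distance map by the main subject mask and take each peak."""
    subject, background = [], []
    for row_d, row_m in zip(distance_map, subject_mask):
        for d, inside in zip(row_d, row_m):
            (subject if inside else background).append(d)
    return (peak_distance(subject, bin_width_mm),
            peak_distance(background, bin_width_mm))


# 3x3 distance map in mm; the rectangular subject frame also catches one
# background pixel (8050 mm), which forms a second, smaller peak.
distance_map = [[3000, 3020, 8000],
                [3010, 8050, 8020],
                [5000, 8010, 8030]]
subject_mask = [[True, True, False],
                [True, True, False],
                [False, False, False]]
main_dist, back_dist = subject_and_background_distance(distance_map, subject_mask)
```

Taking the histogram peak rather than the mean keeps the background pixels inside the rectangular frame from pulling the main subject distance away from the subject.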

In S305, the threshold calculation unit 205 calculates a threshold for the distance difference between the main subject and the background. The threshold is determined from the value of the main subject distance using an optical model, described with reference to FIG. 6. The optical model of FIG. 6 shows the object distance and the image distance when the imaging optical system 101 is approximated as a single lens. For a lens of focal length f, the side on which the object exists (the left side of the lens in FIG. 6) is defined as the object side, and the side on which light from the object forms an image through the lens (the right side of the lens in FIG. 6) is defined as the image side. Assuming that the main subject is in focus, the following distances are defined.
・Main subject distance: Dist_Obj
・Background distance: Dist_Back
・Main subject image distance: Img_Obj
The main subject image distance is the distance at which light from the main subject forms an image on the image side. The position on the image side at which light from the main subject forms an image is called the focus plane, and the displacement from the focus plane to the position at which light from the background forms an image is called the defocus amount def. The sign of def is taken as positive in the direction away from the object side with respect to the lens. Therefore, the defocus amount at the position where light from a background located farther away than the main subject on the object side forms an image is negative. Qualitatively, the larger the absolute value of the defocus amount, the more the background image is spread on the focus plane and the greater the background blur.

In FIG. 6, the following equations (1) and (2) hold from the lens formula.
$$\frac{1}{\mathrm{Dist\_Obj}} + \frac{1}{\mathrm{Img\_Obj}} = \frac{1}{f} \tag{1}$$
$$\frac{1}{\mathrm{Dist\_Back}} + \frac{1}{\mathrm{Img\_Obj} + \mathrm{def}} = \frac{1}{f} \tag{2}$$
Eliminating the main subject image distance Img_Obj from equations (1) and (2) and rearranging for the background distance Dist_Back gives the following equation (3).
$$\mathrm{Dist\_Back} = \frac{f\,\bigl(f\,\mathrm{Dist\_Obj} + \mathrm{def}\,(\mathrm{Dist\_Obj} - f)\bigr)}{f^{2} + \mathrm{def}\,(\mathrm{Dist\_Obj} - f)} \tag{3}$$
From equation (3), the background distance Dist_Back can be regarded as a function of the defocus amount def once the main subject distance Dist_Obj and the focal length f are fixed. Therefore, although the present embodiment mainly shows an example in which the subject distance is calculated as the distance map, the purpose can also be achieved with defocus amounts or parallax amounts.

FIG. 7 is a graph, created from formula (3), illustrating the relationship between the defocus amount (horizontal axis) and the object plane distance (vertical axis). Two cases are shown for a focal length of 50 mm: a main subject distance Dist_obj of 3 m (3000 mm) and of 5 m (5000 mm). Because the main subject is assumed to be in focus, the defocus amount is 0 when the object distance equals Dist_obj. If the allowable defocus amount at which the background can be regarded as sufficiently blurred is set to -0.2 mm, the corresponding object distance for Dist_obj = 3 m is about 3911 mm. The threshold for the distance difference between the subject and the background is therefore |3911 - 3000| = 911 mm, or roughly 90 cm.
The above describes the processing performed by the threshold calculation unit 205 in S305.
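The figures quoted above can be checked with a short sketch. Formulas (1) to (3) are not reproduced in this text, so the code assumes the thin-lens relation 1/Dist + 1/Img = 1/f, with the background imaged at Img_obj + def. That form is an assumption, chosen because it gives about 3911 mm for the 3 m example. The function names are illustrative only.

```python
def image_distance(object_distance_mm, focal_length_mm):
    """Thin-lens image distance, from 1/object + 1/image = 1/f (assumed form)."""
    return focal_length_mm * object_distance_mm / (object_distance_mm - focal_length_mm)


def background_distance(dist_obj_mm, focal_length_mm, defocus_mm):
    """Object distance whose image plane is shifted by defocus_mm from the
    main subject's image plane (assumed form of formula (3))."""
    img_back = image_distance(dist_obj_mm, focal_length_mm) + defocus_mm
    return focal_length_mm * img_back / (img_back - focal_length_mm)


def distance_threshold(dist_obj_mm, focal_length_mm, allowable_defocus_mm):
    """Threshold on |background distance - main subject distance|."""
    dist_back = background_distance(dist_obj_mm, focal_length_mm, allowable_defocus_mm)
    return abs(dist_back - dist_obj_mm)


if __name__ == "__main__":
    # f = 50 mm, Dist_obj = 3000 mm, allowable defocus = -0.2 mm
    print(background_distance(3000.0, 50.0, -0.2))  # about 3911 mm
    print(distance_threshold(3000.0, 50.0, -0.2))   # about 911 mm
```

With a defocus amount of 0 the function returns the main subject distance itself, which matches the statement that the defocus amount is 0 when the object distance equals Dist_obj.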

Next, in S306, the recording mode determination unit 206 compares the difference between the main subject distance and the background distance with the threshold acquired from the threshold calculation unit 205. If the difference is within the threshold, the recording mode determination unit 206 determines that the background image is not sufficiently blurred and that electronic blurring may be required later. The process then proceeds to S307, and recording is performed in the high quality mode, which is the first mode. In the high quality mode, a distance map is recorded in addition to the data of the normal compressed frame. The process of blurring the background using the information recorded in the high quality mode is described later. If, on the other hand, the difference is larger than the threshold, the recording mode determination unit 206 regards the background image as sufficiently blurred. That is, it determines that electronic blurring is unnecessary, and the process proceeds to S308, where recording is performed in the normal mode, which is the second mode. In the normal mode, only the data of the normal compressed frame is recorded. The frame compression and the predetermined signal processing associated with development are performed in the image processing unit 105. The processing by the recording mode control unit 104 is performed at the time of shooting. The image recorded in the high quality mode is not limited to a normal compressed frame and may be a RAW image before image processing. The RAW image may be compressed with a lossless compression method at the time of recording.
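The decision in S306 to S308 can be sketched as follows. This is a minimal illustration and not the apparatus's actual implementation. The enum and function names are hypothetical, and "within the threshold" is read as less than or equal to.

```python
from enum import Enum


class RecordingMode(Enum):
    HIGH_QUALITY = "frame + distance map"  # first mode (S307)
    NORMAL = "frame only"                  # second mode (S308)


def choose_recording_mode(dist_obj_mm, dist_back_mm, threshold_mm):
    """S306: compare the subject/background distance difference with the threshold."""
    difference = abs(dist_back_mm - dist_obj_mm)
    if difference <= threshold_mm:
        # Background is probably not blurred enough: keep the distance map
        # so that blurring can be applied after recording.
        return RecordingMode.HIGH_QUALITY
    # Background is already sufficiently blurred: record the frame only.
    return RecordingMode.NORMAL
```

For example, with the 911 mm threshold derived above, a background at 3500 mm behind a subject at 3000 mm would select the high quality mode, while a background at 6000 mm would select the normal mode.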

Next, the background image blurring process performed in response to a user instruction after recording will be described. This process is performed mainly by the image processing unit 105, but the operation unit 107, which receives user instructions, and the system control unit 106 are also involved. Because the process is performed after recording, it is hereinafter referred to as the "post-blurring process". The post-blurring process will be described with reference to FIGS. 8 and 9.

FIG. 8 is a processing block diagram of the image processing unit 105 for the post-blurring process. The distance map shaping unit 801 performs the distance map shaping process, which is described later. The focus subject extraction unit 802 extracts the image of the focus subject, that is, the main subject that is in focus. The blur processing unit 803 blurs the background image and outputs the processed frame image data. Details of the processing of each unit are described later.

FIG. 9 is a flowchart of the post-blurring process.
First, in S901, the imaging apparatus receives an instruction for the background image blurring process from the user. Only frames recorded in the high quality mode can undergo the post-blurring process, so the user needs to be informed whether a given frame can be processed. FIG. 10 shows a display example in which the display unit 108 notifies the user that the post-blurring process is possible. Display and input processing are performed to ask the user whether to blur the background image. If the user chooses to blur the background image during moving image playback, the system control unit 106 receives the operation instruction from the operation unit 107 and instructs the image processing unit 105 to perform the processing from S902 onward.

In S902, the distance map shaping unit 801 acquires the frame image and the distance map data and performs the distance map shaping process. FIG. 11 illustrates the distance map shaping process. FIG. 11A shows an example of an input frame image, FIG. 11B an input distance map before shaping, and FIG. 11C an output distance map after shaping.

The subject contour in the input distance map shown in FIG. 11B may be less accurate than the subject contour in the input frame image shown in FIG. 11A. Reasons include the influence of near/far conflict at subject boundaries during distance calculation, and the fact that the distance map is calculated at a lower resolution than the frame image. Blurring the background image requires highly accurate contour extraction. The subject contour of the distance map therefore needs to be aligned with that of the frame image, and this processing is called the shaping process. The shaped distance map is also highly versatile and can be used in processes other than background blurring that require a distance map.

The shaping process is performed by bilateral filtering. With the frame image (see FIG. 11A) used as the shaping image, the filter result at the target pixel position p (denoted Jp) is expressed by the following equation (4).
Jp = (1 / Kp) Σ I1q · f(|p - q|) · g(|I2p - I2q|) ... (4)
The symbols in equation (4) have the following meanings.
q: peripheral pixel position
Ω: integration target region centered on the target pixel position p
Σ: summation over q ∈ Ω
I1q: distance map signal value at the peripheral pixel position q
f(|p - q|): Gaussian function centered on the target pixel position p
I2p: pixel value of the shaping image at the target pixel position p
I2q: pixel value of the shaping image at the peripheral pixel position q
g(|I2p - I2q|): Gaussian function centered on the pixel value I2p of the shaping image
Kp: normalization coefficient, equal to the sum of the f·g weights
In equation (4), the closer the peripheral pixel position q is to the target pixel position p, the larger the f value. The smaller the difference between I2p at the target pixel position p and I2q at the peripheral pixel position q, that is, the closer the pixel values of the target pixel and the peripheral pixel are in the shaping image, the larger the g weight (smoothing weight) of that peripheral pixel. The output obtained by weighting the signal values I1q of the input distance map with the f·g weights and summing them is the signal value Jp of the shaped output distance map.
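Equation (4) can be illustrated on a one-dimensional profile, similar in spirit to the profiles of FIG. 11. The following is a sketch under assumptions and not the apparatus's implementation: the window radius, the two Gaussian widths and the sample values are arbitrary choices, and the function names are hypothetical.

```python
import math


def gaussian(x, sigma):
    """Unnormalized Gaussian weight."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def shape_distance_profile(distance, guide, radius=3, sigma_space=2.0, sigma_range=10.0):
    """Equation (4) on a 1-D profile.

    distance: distance map values (I1); guide: shaping image values (I2).
    """
    shaped = []
    for p in range(len(distance)):
        weighted_sum = 0.0
        k_p = 0.0  # normalization coefficient Kp: sum of the f*g weights
        for q in range(max(0, p - radius), min(len(distance), p + radius + 1)):
            f = gaussian(p - q, sigma_space)                # spatial weight
            g = gaussian(guide[p] - guide[q], sigma_range)  # shaping-image weight
            weighted_sum += distance[q] * f * g
            k_p += f * g
        shaped.append(weighted_sum / k_p)
    return shaped


if __name__ == "__main__":
    guide = [200] * 5 + [50] * 5        # frame image: edge between index 4 and 5
    distance = [1000] * 7 + [4000] * 3  # distance map: subject value spills past the edge
    print([round(v) for v in shape_distance_profile(distance, guide)])
```

In this toy input, samples on the subject side of the image edge keep the subject distance, because peripheral samples across the edge receive a negligible g weight. The values that spilled past the edge are pulled toward the background distance, so the output now changes at the image edge.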

FIG. 11D shows a profile 1100pf along the position x of the frame image in FIG. 11A. The position at which the profile 1100pf is taken is indicated by the line 1100 in FIG. 11A. The profile 1100pf has a step shape that changes at the position xa. FIG. 11E shows profiles along the position x of the distance maps in FIGS. 11B and 11C. The position at which the profile 1101pf is taken is indicated by the line 1101 in FIG. 11B, and the position at which the profile 1102pf is taken is indicated by the line 1102 in FIG. 11C. The lines 1101 and 1102 are at the same position.

In the input distance map shown in FIG. 11B, the signal values indicating the subject distance extend beyond the contour of the subject image. The position at which the profile 1101pf, shown by the broken line in FIG. 11E, changes is shifted from the position xa at which the profile 1100pf in FIG. 11D changes. When the shaping process by the bilateral filter is executed, the profile 1102pf corresponding to the shaped distance map of FIG. 11C is obtained. The position at which the profile 1102pf changes coincides with the position xa at which the profile 1100pf in FIG. 11D changes, so the profile matches the contour of the subject image. That is, the profile 1102pf has a step shape that changes sharply at the position xa.

In S903 of FIG. 9, the focus subject extraction unit 802 extracts the focus subject. This will be described specifically with reference to FIG. 12. FIG. 12A shows an example of an input frame image, and FIG. 12B shows an example of a shaped distance map. FIG. 12C shows an example of the extraction characteristic, and FIG. 12D shows an example of the focus subject extraction result.

The extraction characteristic of FIG. 12C is applied to the shaped distance map shown in FIG. 12B. The horizontal axis of FIG. 12C represents the subject distance, and the vertical axis represents the output value of the extraction result. The extraction characteristic outputs the maximum value only for distances within a predetermined focus plane distance range and outputs zero or the minimum value for all other distances. FIG. 12D shows the extraction result output by applying the extraction characteristic. Only the main subject that is in focus is extracted.
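The extraction characteristic amounts to a band test on each distance value. A minimal sketch follows. The function name, the 8-bit output values and the range limits used in the usage note are assumptions for illustration.

```python
def extract_focus_subject(distance_map, near_mm, far_mm, max_value=255, min_value=0):
    """Apply a band-pass extraction characteristic to a 2-D distance map.

    Distances inside [near_mm, far_mm] (the focus plane distance range)
    map to max_value; all other distances map to min_value.
    """
    return [
        [max_value if near_mm <= d <= far_mm else min_value for d in row]
        for row in distance_map
    ]
```

For instance, `extract_focus_subject([[2900, 3000, 6000]], 2800, 3200)` marks the first two pixels as focus subject and the third as background.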

In S904 of FIG. 9, the blur processing unit 803 blurs the background image. The blurring process will be described with reference to FIG. 13. FIG. 13A shows an example of the kernel shape of the blur filter, and FIG. 13B shows an example of a focus plane extraction image. FIG. 13C shows an example of the blur filter at a position of interest.

To simulate the round bokeh produced by a lens, the kernel of the blur filter has a substantially circular shape as shown in FIG. 13A, and the filter weights (weighting coefficient values) are constant. In this embodiment, the filter size is 5 × 5. The blur filter is applied to each pixel of the input frame image. In doing so, the blur processing unit 803 refers to the focus plane extraction image and controls the processing so that only the background portion is filtered. FIG. 13B illustrates a case where the position of interest P is to the right of the subject. In this case, the blur filter performs the filter operation on the positions marked with ○ in FIG. 13C. The positions marked with × belong to the image region of the focus subject. If these positions were included in the filter operation, the contour portion would be mixed in and the image quality could deteriorate. For this reason, control is performed so that only the background portion is filtered. The above processing yields an image in which only the background region, excluding the focus subject, has been blurred.
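The masked filtering described above can be sketched as follows. This is an illustration under assumptions and not the apparatus's implementation. The rule used to approximate a circle inside the 5 × 5 window and the handling of image borders (out-of-range taps are skipped) are choices made here, and the function names are hypothetical.

```python
def circular_kernel_offsets(radius=2):
    """Offsets of a roughly circular, constant-weight kernel
    inside a (2*radius+1) x (2*radius+1) window."""
    limit = (radius + 0.5) ** 2
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if dy * dy + dx * dx <= limit]


def blur_background(image, subject_mask, radius=2):
    """Average background pixels only.

    image: 2-D list of luminance values.
    subject_mask: 2-D list of booleans, True for focus subject pixels.
    Subject pixels are neither modified nor used as filter taps.
    """
    height, width = len(image), len(image[0])
    offsets = circular_kernel_offsets(radius)
    output = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if subject_mask[y][x]:
                continue  # focus subject: keep the original pixel
            total, count = 0.0, 0
            for dy, dx in offsets:
                yy, xx = y + dy, x + dx
                inside = 0 <= yy < height and 0 <= xx < width
                if inside and not subject_mask[yy][xx]:
                    total += image[yy][xx]
                    count += 1
            # count >= 1 because the centre tap is always a background pixel
            output[y][x] = total / count
    return output
```

Dividing by the number of taps actually used, rather than by the full kernel size, keeps the brightness of background pixels next to the subject from dropping.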

In S905 of FIG. 9, post-adjustment is executed. In this step, the user checks whether the blur strength matches the user's preference, and the blur strength is adjusted if it does not. This makes it possible to obtain an image blurred to the degree the user desires. The blur strength can be adjusted by changing the number of taps of the filter kernel in FIG. 13A. The post-blurring process executed by the image processing unit 105 is performed in response to a user instruction after shooting. Finally, the image data generated by the image processing unit 105 is recorded on the recording medium by the recording unit 109.

In this embodiment, whether to acquire the additional information for region extraction at the same time is switched dynamically according to the distance difference between the main subject and the background. This reduces the burden that an enormous recording amount would place on the user, while allowing the information for the post-blurring process to be acquired in advance. By recording the information for high-accuracy region extraction only for frames that are likely to require post-processing, image recording that is highly beneficial to the user can be performed while the recording capacity is kept down. In this embodiment, only the captured image is recorded in the normal mode when the distance difference between the main subject and the background is large, but the way of responding to the distance difference is not limited to this. For example, when the distance difference is equal to or greater than the threshold, recording may instead be performed in the high quality mode, because background blurring may be desired in order to further emphasize that distance difference.

In this embodiment, whether to acquire the additional information is determined from the result of comparing the distance difference between the main subject and the background with the threshold. The determination is not limited to this. For example, depth information may be acquired from the F-number and used as one element of the determination. Also, in this embodiment, the distance is calculated from the parallax of the pupil-divided images as the distance map calculation method. The method is not limited to this, and the distance may be acquired using, for example, a contrast AF (autofocus) evaluation value. These points also apply to the embodiments described later.

[Second Embodiment]
Next, a second embodiment of the present invention will be described. The first embodiment assumed that the post-blurring process is performed on a recorded image. The second embodiment assumes, for example, that region-specific gradation correction is performed afterward on a frame image in which the main subject is dark because of backlight. If a main subject darkened by backlight and the remaining background region are processed with the same gradation conversion characteristic, the dark portions of the background become extremely bright, resulting in an unnatural image. The purpose of region-specific gradation correction is to suppress this unnaturalness by correcting the main subject region and the background region in the image with separate gradation characteristics. On the other hand, this processing requires advanced region extraction.

Compared with the first embodiment, the processing in this embodiment differs in the operation of the recording mode control unit 104 and in the operation of the image processing unit 105 for post-processing. The following description focuses on the points that differ from the first embodiment. Configurations that are the same as in the first embodiment are given the reference numerals already used, and their detailed description is omitted. The same manner of omission applies to the embodiments described later.

The operation of the recording mode control unit 104 will be described with reference to FIGS. 14 and 15. FIG. 14 is a processing block diagram of the recording mode control unit 104 in the second embodiment. The differences from the configuration shown in FIG. 2 are as follows.

The main subject Bv value calculation unit 1402 acquires the main subject region information from the main subject region detection unit 201 and calculates the Bv value of the main subject. The Bv value is an exposure value representing the luminance difference of the region of interest relative to its target luminance value. The background Bv value calculation unit 1403 acquires the background region information and calculates the Bv value of the background. The exposure step calculation unit 1404 acquires the Bv values of the main subject and the background and calculates the exposure step between them. The threshold calculation unit 1405 calculates a threshold for the exposure step from the Bv value of the main subject. The recording mode determination unit 1406 acquires the exposure step between the main subject and the background calculated by the exposure step calculation unit 1404 and the threshold calculated by the threshold calculation unit 1405. The recording mode determination unit 1406 then compares the exposure step with the threshold to determine the recording mode. Details of the processing of each unit are described later.

FIG. 15 is a flowchart illustrating the processing of the recording mode control unit 104.
First, in S1501, the main subject region detection unit 201 detects the main subject region in the image. Next, in S1502, the main subject Bv value calculation unit 1402 calculates the Bv value of the main subject. Denoting the Bv value of the main subject as Bv_obj, it is calculated by the following equation (5).
Bv_obj = log2(Y_obj / Y_obj_target) ... (5)
In equation (5), log2 is the base-2 logarithm. Y_obj is the representative luminance value of the main subject and is calculated as the average of the luminance values in the main subject region. Y_obj_target is the target luminance value of the main subject region. The target luminance value is the luminance value regarded as proper exposure and is determined in advance. It may be changed depending on whether the main subject is a person. From equation (5), the darker the main subject, the smaller the Bv value. For example, when the representative luminance value of the main subject is 1/2 of the target luminance value, the Bv value is -1. When the representative luminance value is twice the target luminance value, the Bv value is +1.
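Equation (5) can be written directly in code. A minimal sketch follows, in which the function name and the sample luminance values are hypothetical.

```python
import math


def main_subject_bv(region_luminances, target_luminance):
    """Equation (5): Bv_obj = log2(Y_obj / Y_obj_target).

    Y_obj is taken as the mean luminance of the main subject region.
    """
    y_obj = sum(region_luminances) / len(region_luminances)
    return math.log2(y_obj / target_luminance)


if __name__ == "__main__":
    # Mean luminance 60 against a target of 120: one stop under (Bv = -1).
    print(main_subject_bv([50, 70], 120))
```

The result is in stops, so halving the luminance lowers Bv by one and doubling it raises Bv by one, as in the examples in the text.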

In S1503, the background Bv value calculation unit 1403 calculates the Bv value of the background. Denoting the background Bv value as Bv_back, it is expressed by the following equation (6).
Bv_back = log2(Y_back / Y_back_target) ... (6)
In equation (6), Y_back is the representative luminance value of the background and is calculated as, for example, the average of the luminance values in the background region. Y_back_target is the target luminance value of the background region.

Next, in S1504, the exposure step calculation unit 1404 calculates the exposure step between the main subject region and the background region (denoted delta_Bv) by the following equation (7).
delta_Bv = |Bv_obj - Bv_back| ... (7)
As can be seen from equations (5) to (7), the brighter the background and the darker the main subject, or vice versa, that is, the wider the D range (dynamic range) as seen from the main subject, the larger the exposure step delta_Bv. Conversely, the narrower the D range, the smaller delta_Bv. In this embodiment, the exposure step is used as an evaluation value indicating the D range of the frame.

Next, in S1505, the threshold calculation unit 1405 calculates the threshold for the exposure step from the main subject Bv value. This is described using the threshold calculation example shown in FIG. 16. In the graph of FIG. 16, the horizontal axis represents the main subject Bv value and the vertical axis represents the exposure step threshold. In the example of FIG. 16, the threshold corresponding to Bv_min is TH_min, and the threshold when the main subject Bv value is zero or more is TH_max. The example uses linear interpolation in the section from Bv_min to zero, but interpolation of second or higher order may be used instead.
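The threshold curve of FIG. 16 can be sketched as a clamped linear interpolation. The numeric defaults for Bv_min, TH_min and TH_max below are placeholders, since the text gives no values. Holding the threshold at TH_min for Bv values below Bv_min is also an assumption.

```python
def exposure_step_threshold(bv_obj, bv_min=-3.0, th_min=1.0, th_max=4.0):
    """Exposure step threshold as a function of the main subject Bv value.

    bv_min, th_min and th_max are placeholder values, not taken from the source.
    """
    if bv_obj >= 0.0:
        return th_max  # properly exposed or too bright: high threshold
    if bv_obj <= bv_min:
        return th_min  # assumed clamp below Bv_min
    # Linear interpolation between (bv_min, th_min) and (0, th_max).
    ratio = (bv_obj - bv_min) / (0.0 - bv_min)
    return th_min + ratio * (th_max - th_min)
```

With these placeholder values, a main subject 1.5 stops under the target gets a threshold halfway between TH_min and TH_max.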

As shown in FIG. 16, qualitatively, when the main subject Bv value is negative, that is, when the main subject is darker than proper exposure, gradation correction is more likely to be needed later. The threshold for the exposure step is therefore set small. When the main subject Bv value is near zero, the main subject is close to proper exposure, so gradation correction is less likely to be needed and the threshold is set large. When the main subject Bv value is positive, the main subject is too bright. In this case the threshold is set high (TH_max) so that the frame is excluded from this gradation correction process.

In S1506, the recording mode determination unit 1406 compares the exposure step with the threshold calculated by the threshold calculation unit 1405 and determines the recording mode. If the exposure step is equal to or greater than the threshold, the recording mode determination unit 1406 determines that gradation correction is likely to be needed later, and the process proceeds to S1507, where recording is performed in the high quality mode. In the high quality mode, the frame image, the distance map, and the RAW image data are recorded. The RAW image before image processing is recorded so that gradation correction can be performed before nonlinear processing such as gamma conversion. If the exposure step is less than the threshold, the recording mode determination unit 1406 determines that gradation correction will not be needed later, and the process proceeds to S1508, where recording is performed in the normal mode. The processing of the recording mode control unit 104 is performed at the time of shooting.
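The comparison in S1506 can be sketched as below. The function names and the string labels for the two modes are hypothetical, and the threshold is passed in as a plain number.

```python
def exposure_step(bv_obj, bv_back):
    """Equation (7): delta_Bv = |Bv_obj - Bv_back|."""
    return abs(bv_obj - bv_back)


def choose_recording_mode(bv_obj, bv_back, threshold):
    """S1506: select the recording mode from the exposure step."""
    if exposure_step(bv_obj, bv_back) >= threshold:
        # Gradation correction is likely to be needed later (S1507):
        # record the frame image, the distance map and the RAW image.
        return "high_quality"
    # Gradation correction is judged unnecessary (S1508).
    return "normal"
```

For a backlit scene with Bv_obj = -2 and Bv_back = +1, the exposure step is 3 stops. With a threshold of 2 this selects the high quality mode.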

Next, the region-specific gradation correction process performed in response to a user instruction after recording will be described. This process is performed mainly by the image processing unit 105, but the operation unit 107, which receives user instructions, and the system control unit 106 are also involved. Hereinafter, the region-specific gradation correction process performed after recording is referred to as the "post-correction process". The post-correction process will be described with reference to FIGS. 17 and 18.

FIG. 17 is a processing block diagram of the image processing unit 105 for the post-correction process. The differences from the configuration shown in FIG. 8 are as follows.
The first gradation characteristic calculation unit 1703 acquires the frame image data and calculates the first gradation characteristic, which is the gradation characteristic for the main subject region. The second gradation characteristic calculation unit 1704 acquires the frame image data and calculates the second gradation characteristic, which is the gradation characteristic for the background region. The first gradation correction unit 1705 performs gradation correction on the input frame image using the first gradation characteristic. The second gradation correction unit 1706 performs gradation correction on the input frame image using the second gradation characteristic. The combining unit 1707 acquires the focus subject information from the focus subject extraction unit 802, the first image corrected by the first gradation correction unit 1705, and the second image corrected by the second gradation correction unit 1706, and combines the images. Details of the processing of each unit are described later.

FIG. 18 is a flowchart illustrating the post-correction process.
First, in S1801, the imaging apparatus receives an instruction for the post-correction process from the user. FIG. 19 shows a display example in which, for a frame recorded in the high quality mode and therefore eligible for the post-correction process, the display unit 108 indicates to the user that the post-correction process is possible. Display and input processing are also performed to ask the user whether to perform the post-correction process. If the user chooses to perform gradation correction after shooting during moving image playback, the system control unit 106 receives the instruction from the operation unit 107 and instructs the image processing unit 105 to perform the processing from S1802 onward.

In S1802, the distance map shaping unit 801 performs the distance map shaping process, and in S1803, the focus subject extraction unit 802 extracts the focus subject. In S1804, the first gradation characteristic calculation unit 1703 calculates the first gradation characteristic. In S1805, the second gradation characteristic calculation unit 1704 calculates the second gradation characteristic. The calculation of the first and second gradation characteristics will be described with reference to FIG. 20. FIG. 20A shows an example of an input frame image with the main subject region and the background region in the image. That is, the main subject region detection unit 201 detects the main subject region and the background region separately in the input frame image. Separate gradation correction is applied to the main subject region and to the background region. First, uniform gain processing is applied to the main subject region. Uniform gain is used because, when this function is applied, the main subject is likely to be uniformly dark owing to backlight, shade, or the like. Luminance-dependent gain processing, on the other hand, is applied to the background region. This is because the background region generally contains subjects with various luminances and has a wide D range. Gradation compression is performed so as not to impair the gradation of each region. The gain characteristics are not limited to the above approach and may take any shape.

FIG. 20B illustrates the characteristic of gain (vertical axis) with respect to input luminance (horizontal axis) in the main subject region. The gain takes a constant value (denoted as Gain_obj). To correct the average luminance value Y_obj of the main subject region to the target luminance value Y_obj_target regarded as appropriate exposure, the gain Gain_obj is calculated by the following equation (8).
Gain_obj = Y_obj_target / Y_obj (8)

FIG. 20C shows, as a solid line, the input/output luminance characteristic obtained when processing is performed with the gain characteristic shown in FIG. 20B. This characteristic is the first gradation characteristic. The dotted line indicates the case where the ratio of the input luminance value to the output luminance value is 1:1. The slope of the first gradation characteristic is larger than the slope of the dotted line.

FIG. 20D illustrates the representative luminance values on the luminance histogram of the background region. The horizontal axis represents input luminance, and the vertical axis represents frequency (pixel count). FIG. 20E illustrates the gain characteristic with respect to input luminance in the background region. The gain changes according to the input luminance. Before calculating the gain characteristic, the second gradation characteristic calculation unit 1704 calculates representative luminance values on the dark side and the bright side of the input luminance. In this calculation, the luminance histogram of the background region is acquired as shown in FIG. 20D. The representative luminance value Y_back_low is the dark-side representative luminance value, obtained by counting a predetermined proportion of the pixels upward from the minimum luminance. The representative luminance value Y_back_high is the bright-side representative luminance value, obtained by counting a predetermined proportion of the pixels downward from the maximum luminance. The maximum gain for the background region (denoted as Gain_back) is calculated by the following equation (9) so that the average luminance value Y_back of the background region becomes the target luminance value Y_back_target regarded as appropriate exposure.
Gain_back = Y_back_target / Y_back (9)

In the gain characteristic shown in FIG. 20E, the gain is constant at Gain_back in the section where the input luminance value is Y_back_low or less, and takes its minimum value of 1 in the section where the input luminance value is Y_back_high or more. In the section between Y_back_low and Y_back_high, the gain decreases monotonically with the input luminance value. FIG. 20F shows, as a solid line, the input/output luminance characteristic obtained when processing is performed with the gain characteristic shown in FIG. 20E. This characteristic is the second gradation characteristic. The solid line representing the second gradation characteristic is a polygonal line that bends upward at Y_back_low and, in the section of Y_back_high or more, coincides with the dotted line (the case where the ratio of the input luminance value to the output luminance value is 1:1).
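The two gain characteristics described above can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, the linear interpolation between Y_back_low and Y_back_high is an assumption (the text requires only a monotonic decrease), and clipping the output at a maximum luminance of 255 is likewise assumed.

```python
def uniform_gain(y_obj_mean, y_obj_target):
    """Equation (8): constant gain for the main subject region."""
    return y_obj_target / y_obj_mean


def background_gain(y, gain_back, y_back_low, y_back_high):
    """Gain curve of FIG. 20E for one input luminance value y.

    Assumption: the section between the two representative luminance
    values is interpolated linearly from gain_back down to 1.
    """
    if y <= y_back_low:
        return gain_back
    if y >= y_back_high:
        return 1.0
    t = (y - y_back_low) / (y_back_high - y_back_low)
    return gain_back + t * (1.0 - gain_back)


def apply_gain(y, gain, y_max=255.0):
    """Input/output characteristic; clipping at y_max is an assumption."""
    return min(y * gain, y_max)
```

With Gain_back = 2, Y_back_low = 50 and Y_back_high = 200, an input luminance of 125 lies halfway between the two representative values and receives a gain of 1.5 under the linear assumption.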

In S1806 of FIG. 18, the first gradation correction unit 1705 performs gradation correction processing on the input frame image using the first gradation characteristic. In S1807, the second gradation correction unit 1706 performs gradation correction processing on the input frame image using the second gradation characteristic. These processes convert the luminance values of the input frame image with the gradation conversion characteristics illustrated in FIGS. 20C and 20F, respectively. In the next step, S1808, the synthesis unit 1707 combines the two gradation-corrected images. The processing of the synthesis unit 1707 will be described with reference to FIG. 21.

FIG. 21A illustrates an input frame image, and FIG. 21B illustrates the focus plane image extracted by the focus subject extraction unit 802. In the synthesis processing, the image corrected with the first gradation characteristic is output for the focus plane region, shown as the white region in FIG. 21B. For the non-focus plane region (background region), shown as the black region in FIG. 21B, the image corrected with the second gradation characteristic is output. As a result, the input frame image of FIG. 21A becomes the synthesized image shown in FIG. 21C. FIG. 21C shows an image in which the main subject region, which is the focus plane, and the background region, which is the non-focus plane region, have each been gradation-corrected with a different gradation conversion characteristic. Finally, the data of the image subjected to the region-by-region gradation correction is recorded on the recording medium by the recording unit 109.
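The region-by-region synthesis can be pictured with a small sketch. It assumes that the two gradation-corrected images and the focus plane image are given as equally sized nested lists, with a truthy mask value marking the focus plane region; the function name is hypothetical.

```python
def composite_by_focus_mask(first_corrected, second_corrected, focus_mask):
    """Select the first-characteristic pixel inside the focus plane
    region and the second-characteristic pixel elsewhere."""
    return [
        [a if m else b for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(first_corrected, second_corrected, focus_mask)
    ]
```

A hard per-pixel selection is the simplest reading of the description; a real implementation might blend across the region boundary, but the text does not say so.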

In the present embodiment, whether or not to simultaneously acquire additional information for region extraction is switched dynamically according to the exposure step (brightness difference) between the main subject region and the background region. This reduces the burden that an enormous recording amount would place on the user, while still allowing information for post-correction processing (gradation correction) to be acquired. In the present embodiment, the image and the depth distribution information are recorded in the high quality mode when the exposure step between the main subject region and the background region is larger than a threshold value. However, the present invention is not limited to this. For example, recording in the high quality mode may instead be performed when the exposure step is smaller than the threshold value, on the grounds that each subject is then captured at more appropriate brightness than when the step is larger. In that case, when the exposure step is larger than the threshold value, the image is recorded in the normal mode without recording the depth distribution information.

[Third Embodiment]
Next, a third embodiment of the present invention will be described. The present embodiment is premised on the post-blurring process described in the first embodiment and aims to further reduce the recording capacity, particularly when recording information. In the first embodiment, post-processing information is acquired, based on the difference in distance between the main subject and the background, for frames in which, for example, the background image is determined not to be sufficiently blurred. However, depending on the shooting scene, the main subject and the background may be close to each other throughout. In such a case, post-processing information is acquired for almost all frames, and the recording capacity of the captured images may become enormous. Thus, the present embodiment describes a process for further narrowing down the frames to be recorded in the high quality mode in order to keep the recording capacity appropriate. The determination of whether the recording capacity needs to be reduced and the determination of the recording mode based on the subject score, both described later in this embodiment, may be executed in part or in whole in combination with each of the first and second embodiments.

FIG. 22 is a block diagram illustrating a configuration applicable to the imaging apparatus of the present embodiment. The difference from the configuration shown in FIG. 1 is that a recording information adjustment unit 2210 is added. FIG. 23 is a flowchart showing the processing of the recording information adjustment unit 2210, which will be described with reference to FIG. 23. The following processing is executed after the recording unit 109 has recorded a series of moving image data in the recording modes controlled by the recording mode control unit 104.

In step S2301, the recording information adjustment unit 2210 acquires the recording capacity of the images (denoted as MEM), and in step S2302 it calculates the maximum recording capacity (denoted as MEM_MAX). The maximum recording capacity MEM_MAX is calculated by the following equation (10).
MEM_MAX = MEM_FRAME × NUM_FRAME × k1 × k2 (10)
In equation (10), MEM_FRAME is the recording amount per frame, assuming that neither inter-frame nor intra-frame compression is performed. NUM_FRAME is the total number of frames of the moving image to be processed. k1 is a predetermined compression rate and takes a value of 1 or less. The actual compression rate varies with the shooting scene, but a predetermined value is used here. k2 is the allowable rate of capacity increase due to the recording of the additional information and takes a value of 1 or more.

In the next step, S2303, the recording information adjustment unit 2210 compares the recording capacity MEM with the maximum recording capacity MEM_MAX. If MEM is less than or equal to MEM_MAX, the processing ends without adjusting the recording information. If MEM is larger than MEM_MAX, the recording information adjustment unit 2210 determines that the recording information needs to be adjusted, and the process proceeds to S2304.
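Steps S2302 and S2303 amount to a simple capacity check, sketched below. The function names are hypothetical, and the numbers in the usage note are illustrative rather than values from the specification.

```python
def max_recording_capacity(mem_frame, num_frame, k1, k2):
    """Equation (10): upper limit on the recorded amount.

    mem_frame: uncompressed recording amount per frame
    num_frame: total number of frames in the moving image
    k1: assumed compression rate (<= 1)
    k2: allowed growth due to additional information (>= 1)
    """
    return mem_frame * num_frame * k1 * k2


def needs_adjustment(mem, mem_max):
    """S2303: adjustment is required only when MEM exceeds MEM_MAX."""
    return mem > mem_max
```

For instance, with 1000 units per frame, 100 frames, k1 = 0.5 and k2 = 1.25, the limit is 62500 units; a recording of 70000 units would trigger the narrowing-down process, whereas one of exactly 62500 units would not.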

In the present embodiment, when it is determined that the recording capacity needs to be reduced, the frames to be recorded in the high quality mode are further narrowed down based on the size information and position information of the main subject. The reason is that the frames whose background the user will want to blur after shooting and keep as still images are those in which the main subject is captured well. Specifically, the frames selected by the narrowing-down process are those in which the main subject region lies near the center of the captured image and the main subject appears large.

In S2304 of FIG. 23, the recording information adjustment unit 2210 acquires the size information and position information of the main subject, and in the next step, S2305, it calculates a main subject score based on that size information and position information. Note that the processing from S2304 onward is performed on all recorded frames. The processes of S2304 and S2305 will be described with reference to FIG. 24. FIG. 24A illustrates an input frame image. The coordinates of the center of gravity of the main subject region are denoted as (X, Y), the height of the main subject region as Height, and the width of the main subject region as Width.

In S2304, the main subject region is extracted as a rectangle from the input frame image shown in FIG. 24A, and its width Width and height Height are acquired. The center-of-gravity position coordinates (X, Y) of the extracted main subject region are also acquired. From the acquired information, a normalized size (denoted as Size) corresponding to the area is calculated by the following equation (11).
Size = Width × Height / Size_all (11)

Size_all in equation (11) is a normalization coefficient that makes the result independent of the image size. For example, Size_all is the area of the entire image, in which case Size indicates the proportion of the entire image area occupied by the main subject region. When the coordinates of the image center position are denoted as (Xc, Yc), the normalized distance (denoted as Dist) from (Xc, Yc) to the center-of-gravity position coordinates (X, Y) of the main subject region is calculated by the following equation (12).
Dist = sqrt((X - Xc)^2 + (Y - Yc)^2) / R (12)

R in equation (12) is a normalization coefficient corresponding to the distance from the image center position to the image edge. That is, the normalized distance Dist indicates the ratio of the distance between the coordinates (Xc, Yc) and (X, Y) to the distance from the image center position to the image edge.

Next, in step S2305, the main subject score is calculated. This will be described with reference to FIGS. 24B and 24C. FIG. 24B illustrates the calculation characteristic of the distance score. The horizontal axis represents the normalized distance Dist, and the vertical axis represents the distance score (denoted as Score_Dist). FIG. 24C illustrates the calculation characteristic of the size score. The horizontal axis represents the normalized size Size, and the vertical axis represents the size score (denoted as Score_Size).

In the present embodiment, the distance score Score_Dist and the size score Score_Size are first calculated from the normalized distance information and normalized size information according to the characteristics shown in FIGS. 24B and 24C, respectively. The distance score Score_Dist shown in FIG. 24B decreases monotonically with Dist so that the score is larger the closer the subject image is to the center of the image. FIG. 24B illustrates a characteristic obtained by linearly interpolating between two points. In the range where Dist is smaller than the first threshold D1, the distance score Score_Dist is constant. In the range where Dist is larger than the second threshold D2, the distance score Score_Dist is also constant. When Dist is greater than or equal to the first threshold and less than or equal to the second threshold, the distance score Score_Dist decreases linearly as Dist increases.

The size score Score_Size shown in FIG. 24C increases monotonically with Size so that the score is larger the larger the subject appears in the image. FIG. 24C illustrates a characteristic obtained by linearly interpolating between two points. In the range where Size is smaller than the first threshold S1, the size score Score_Size is constant. In the range where Size is larger than the second threshold S2, the size score Score_Size is also constant. When Size is greater than or equal to the first threshold and less than or equal to the second threshold, the size score Score_Size increases linearly as Size increases.
The characteristics shown in FIGS. 24B and 24C are examples, and the interpolation may be performed with three or more points.

Next, the main subject score (denoted as Score) is calculated from the distance score Score_Dist and the size score Score_Size by the following equation (13).
Score = w_d × Score_Dist + w_s × Score_Size (13)
In equation (13), w_d and w_s are arbitrary weighting coefficients.
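The score calculation of S2304 and S2305 can be sketched as follows. Several details are assumptions made for illustration: the normalized distance is taken to be the Euclidean distance from the image center divided by R, the scores are taken to range between 0 and 1 on the flat sections of FIGS. 24B and 24C, and all function and parameter names are hypothetical.

```python
import math


def normalized_size(width, height, size_all):
    """Equation (11): subject rectangle area relative to size_all."""
    return width * height / size_all


def normalized_dist(x, y, xc, yc, r):
    """Distance of the subject centroid from the image center, divided
    by r. Euclidean distance is an assumption of this sketch."""
    return math.hypot(x - xc, y - yc) / r


def piecewise_linear(v, v1, v2, out1, out2):
    """Constant out1 up to v1, constant out2 from v2, linear in between."""
    if v <= v1:
        return out1
    if v >= v2:
        return out2
    return out1 + (out2 - out1) * (v - v1) / (v2 - v1)


def main_subject_score(dist, size, w_d, w_s, d1, d2, s1, s2):
    """Equation (13) with assumed score ranges of 0 to 1."""
    score_dist = piecewise_linear(dist, d1, d2, 1.0, 0.0)  # decreasing
    score_size = piecewise_linear(size, s1, s2, 0.0, 1.0)  # increasing
    return w_d * score_dist + w_s * score_size
```

A subject centered in the frame (Dist below D1) and larger than S2 receives the maximum score w_d + w_s under these assumptions.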

In S2306 of FIG. 23, the recording information adjustment unit 2210 narrows down the high-quality recording frames. The main subject score Score has been calculated for all frames in S2305. The recording information adjustment unit 2210 compares the main subject score Score with a predetermined threshold (denoted as TH_Score). Frames whose main subject score Score exceeds the threshold TH_Score are recorded in the high quality mode. Frames whose main subject score Score is equal to or less than the threshold are treated as normal mode frames, and their additional information is deleted and therefore not retained. This will be described specifically with reference to FIG. 25. The horizontal axis represents the threshold TH_Score, and the vertical axis represents the recording capacity MEM. As the threshold TH_Score increases, the number of high-quality frames decreases, so the recording capacity MEM decreases. The smallest threshold at which the recording capacity MEM does not exceed the upper limit MEM_MAX calculated in S2302 is defined as TH_Score_min. By narrowing down the high-quality frames using the threshold TH_Score_min, the moving image can be acquired while its recording capacity is kept to MEM_MAX or less.
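One way to find such a threshold is sketched below. It assumes a simple capacity model in which every frame contributes to a fixed base amount and each high-quality frame adds a fixed amount of additional information; the model, the function names, and the parameters are assumptions for illustration and are not taken from the specification.

```python
def choose_threshold(scores, base_mem, extra_mem_per_frame, mem_max):
    """Return the smallest candidate threshold for which the modeled
    recording capacity does not exceed mem_max.

    Frames with score > threshold are kept in the high quality mode.
    """
    candidates = [float("-inf")] + sorted(set(scores))
    for th in candidates:  # ascending, so the first hit is the smallest
        n_high_quality = sum(1 for s in scores if s > th)
        if base_mem + n_high_quality * extra_mem_per_frame <= mem_max:
            return th
    # Even with no high-quality frames the limit is exceeded;
    # nothing more can be removed by this process.
    return candidates[-1]


def high_quality_flags(scores, threshold):
    """True for frames that remain in the high quality mode."""
    return [s > threshold for s in scores]
```

Scanning the candidate thresholds in ascending order keeps as many high-quality frames as the capacity limit allows, which matches the behavior of FIG. 25 where the capacity falls as the threshold rises.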

In the present embodiment, the processing of the recording information adjustment unit 2210 suppresses the increase in the recording capacity of the moving image. In the present embodiment, the position information and size information of the main subject region are used as the index for narrowing down the high-quality frames.
In another embodiment, the degree of scene change may be used as another index for switching the recording mode. The degree of scene change is an index representing how much the scene changes when the image changes between different frames (for example, the current frame and the previous frame). It is calculated by aligning a plurality of time-series images, for example the current frame and the previous frame, and calculating the difference between the images. When there is almost no change in the image between two temporally consecutive frames, the additional information of the high quality mode is thinned out. When there is a large change in the image between two temporally consecutive frames, the additional information for both frames is recorded.
In the present embodiment, the determination based on the subject score and the determination based on the degree of scene change are performed when it is determined that the recording capacity needs to be reduced, but the present invention is not limited to this. For example, the recording mode may be switched using the subject score or the degree of scene change as the method of determining high-quality frames, without detecting or evaluating the recording capacity.
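As a rough illustration of the degree of scene change, the sketch below uses the mean absolute luminance difference between two frames. It assumes the frames have already been aligned and are given as equally sized nested lists; the choice of mean absolute difference, the threshold, and the function names are all assumptions, since the text specifies only that a difference between aligned images is calculated.

```python
def scene_change_degree(prev_frame, cur_frame):
    """Mean absolute difference between two aligned frames."""
    total = 0
    count = 0
    for row_prev, row_cur in zip(prev_frame, cur_frame):
        for p, c in zip(row_prev, row_cur):
            total += abs(p - c)
            count += 1
    return total / count


def keep_additional_info(prev_frame, cur_frame, threshold):
    """Keep the additional information only when the scene changed
    by more than the (hypothetical) threshold."""
    return scene_change_degree(prev_frame, cur_frame) > threshold
```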

[Fourth Embodiment]
Next, a fourth embodiment of the present invention will be described. The present embodiment is premised on the post-blurring process described in the first embodiment and aims to reduce the motion blur of each frame. Control that changes the shutter speed and the frame rate used for recording in the high quality mode according to the movement of the main subject will be described.

For a frame from which a high-quality still image is to be generated by post-processing, it is preferable that the subject shows no motion blur and appears frozen. In exposure control for moving image capture, the shutter speed is controlled so that the motion of the subject image does not look discontinuous when the frame images are viewed in succession. That is, control is performed so that the shutter speed does not become too fast. On the other hand, when one frame of a moving image captured in this way is viewed as a still image, motion blur may be present depending on the moving speed of the subject. The present embodiment therefore aims to suppress motion blur of the subject in the exposure control, and in particular the shutter speed, of frame images acquired in the high quality mode.

FIG. 26 is a block diagram illustrating a configuration applicable to the imaging apparatus of the present embodiment. The difference from the configuration shown in FIG. 1 is that an exposure condition control unit 2606 is provided. FIG. 27 is a flowchart showing the processing of the exposure condition control unit 2606, which will be described with reference to FIG. 27. The following processing is performed after the processing of the recording mode control unit 104, either every frame or at a predetermined frame interval.

First, in S2701, it is determined whether or not the recording mode is the high quality mode. If the recording mode is the normal mode, the process proceeds to S2705, and the next frame, one frame time after the current frame, is also shot with the moving image exposure of the normal mode. On the other hand, if the recording mode is the high quality mode, the process proceeds to S2702, and exposure control matched to the movement of the main subject is performed. In S2702, the exposure condition control unit 2606 calculates the motion vector of the main subject. The motion vector is calculated by known pattern matching processing.

In the next step, S2703, the exposure condition control unit 2606 calculates the shutter speed of the next frame. The shutter speed of the next frame is denoted as Tv, and the magnitude of the motion vector calculated in S2702 is denoted as v (unit: pixels). The frame interval is denoted as T_frame (1/60 second when the frame rate is 60 fps). The shutter speed Tv is calculated from v and T_frame by the following equation (14).
Tv = T_frame / v (14)
Equation (14) means that Tv is calculated as the shutter speed at which the main subject image moves by 1 pixel during the exposure time. In other words, Tv is the shutter speed at which motion blur is contained within 1 pixel. For example, let T_frame be 1/60 second and v be 6 pixels. In this case, the shutter speed corresponding to a movement of 1 pixel of the main subject image is Tv = 1/360 second. In keeping with the aim of this embodiment, a value smaller than the Tv calculated by equation (14) may also be used as the shutter speed.
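Equation (14) and the worked example can be checked with a short sketch. Exact fractions are used so that 1/360 second is reproduced without rounding. The handling of motion of 1 pixel or less (keeping the frame interval as the exposure time) is an assumption, since the text does not cover that case, and the function name is hypothetical.

```python
from fractions import Fraction


def shutter_speed(t_frame, v):
    """Equation (14): exposure time during which the main subject
    moves by one pixel.

    t_frame: frame interval in seconds
    v: motion vector magnitude in pixels per frame
    Assumption: for v <= 1 the exposure time is left at t_frame.
    """
    if v <= 1:
        return t_frame
    return t_frame / v


tv = shutter_speed(Fraction(1, 60), 6)  # Fraction(1, 360)
```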

In S2704, the exposure condition control unit 2606 determines the other exposure conditions and the frame rate. The other exposure conditions are, specifically, the sensitivity and the aperture value. Changing the aperture value changes the depth of field, and continuity with the preceding and following frames is lost. For this reason, in the present embodiment, the exposure is kept constant by changing the sensitivity to compensate for the change in the shutter speed Tv. In addition, the frame rate is changed as the shutter speed Tv becomes faster. For example, when the value of Tv becomes 1/120 second or less, the frame rate is changed from 60 fps to 120 fps. This makes it less likely that the decisive moment of a fast-moving subject is missed. Finally, the exposure condition control unit 2606 feeds the exposure conditions determined in S2704 or S2705 back to the control of the imaging optical system 101 and the imaging unit 102, and performs control so that the next frame is captured under those conditions.
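The decisions of S2704 can be sketched as follows. The sketch assumes that the sensitivity is scaled in inverse proportion to the exposure time so that the exposure stays constant at a fixed aperture; the function names are hypothetical, and the 60 fps, 120 fps and 1/120 second values simply mirror the example in the text.

```python
from fractions import Fraction


def compensate_sensitivity(sensitivity, tv_old, tv_new):
    """Scale the sensitivity by the ratio of exposure times so that
    the exposure stays constant when only the shutter speed changes."""
    return sensitivity * tv_old / tv_new


def select_frame_rate(tv, base_fps=60, fast_fps=120, tv_limit=Fraction(1, 120)):
    """Switch to the higher frame rate once the exposure time is
    tv_limit or shorter."""
    return fast_fps if tv <= tv_limit else base_fps
```

For example, shortening the exposure time from 1/60 second to 1/240 second (two stops) raises a sensitivity of 100 to 400, and the frame rate switches to 120 fps because 1/240 second is shorter than 1/120 second.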

In the present embodiment, the processing of the exposure condition control unit 2606 enables shooting at a shutter speed matched to the speed of the subject (moving object). When the result is viewed as a moving image, the fast shutter speed of the image frames could make the motion of the moving object look discontinuous. To avoid this, processing such as electronically adding blur to the subject image according to the amount of subject motion is performed.

In the first, second, third, and fourth embodiments described above, the recording mode is basically determined and switched for each frame, but the present invention is not limited to this. To reduce the amount of recorded data, the recording mode may instead be switched periodically, for example by recording in the high quality mode once every predetermined number of frames, with the period set manually or automatically.
When the setting is made manually, for example, if the predetermined number of frames is set to five by a user operation via the operation unit 107, one frame in every five is recorded in the high quality mode, that is, the depth distribution information is recorded together with the image. Alternatively, the period of recording in the high quality mode may be determined according to the imaging frame rate set by the user.
When the setting is made automatically, at least one of the determinations used in the embodiments described above, such as the distance difference between the main subject and the background, the exposure step, the subject score, and the degree of scene change, is performed periodically. The period of recording in the high quality mode is then determined, and control is performed so that recording in the high quality mode takes place at that period until the next determination.
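The periodic switching described above can be sketched as follows. This is only an illustration: `RecordingMode` and `mode_for_frame` are hypothetical names, and the text does not say which frame within each period is recorded in the high quality mode, so the sketch assumes it is the first one.

```python
from enum import Enum


class RecordingMode(Enum):
    HIGH_QUALITY = "image and depth distribution information"
    NORMAL = "image only"


def mode_for_frame(frame_index: int, period: int) -> RecordingMode:
    """Pick the recording mode for a frame under periodic switching.

    Assumption: frame_index counts from 0 and the first frame of each
    period is the one recorded in the high quality mode.
    """
    if period < 1:
        raise ValueError("period must be at least 1 frame")
    if frame_index % period == 0:
        return RecordingMode.HIGH_QUALITY
    return RecordingMode.NORMAL


# With the period set to 5, frames 0 and 5 of the first ten frames
# carry depth distribution information.
high_quality_frames = [
    i for i in range(10) if mode_for_frame(i, 5) is RecordingMode.HIGH_QUALITY
]
```

For the automatic case, the same function could be called with a `period` that is recomputed at each periodic determination; how that period is derived from the determination results is not specified in the text.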

＜Patterns of recording format in each embodiment＞
In each of the embodiments described above, any of the following formats may be used to record the plurality of captured images (frames) and the depth distribution information corresponding to some of those frames.
FIG. 28 illustrates the patterns of formats for recording the plurality of captured images (frames) and the depth distribution information corresponding to some of those frames. In the first format, shown in FIG. 28(A), the sequentially captured images 1, 2, and 3 and the distance maps 1 and 3 corresponding to images 1 and 3, respectively, are all recorded as separate files. In this case, it is preferable to record, in the header of each image file, information that associates the image with its distance map (or information indicating that there is no corresponding distance map), and to record information on the corresponding image on the distance map side as well.

Alternatively, as shown in FIG. 28(B), the images may be associated as consecutive images (and may be encoded) to form one moving image file, and the plurality of distance maps may each be recorded as an individual distance map file in synchronization with this moving image file. The time code of the corresponding moving image is recorded in each distance map, so that during moving image editing, including still image extraction, each distance map can be associated with its image and read out and used as necessary.

Alternatively, as shown in FIG. 28(C), the images may be associated as consecutive images (and may be encoded) to form one moving image file, and the plurality of distance maps may be recorded together as one separate file in synchronization with this moving image file. In this case, it suffices that a time code synchronized with the time code of each frame of the moving image is recorded in the corresponding distance map and that the distance maps are recorded as one file. Compared with the form of FIG. 28(B), putting the distance maps in a single file makes them easier to handle as a pair with the moving image file, and a reduction in data amount can also be expected if the distance maps are encoded with reference to one another using a known encoding technique as necessary.
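The time-code association used in FIG. 28(B) and FIG. 28(C) can be sketched as a simple lookup. The sketch is hypothetical throughout: the `DistanceMap` type, the `"HH:MM:SS:FF"` time code string, and the nested-list depth payload are assumptions made for illustration, since the text does not define a concrete file layout.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DistanceMap:
    timecode: str              # time code of the movie frame this map belongs to
    depth: List[List[float]]   # hypothetical per-region depth values


def build_index(maps: List[DistanceMap]) -> Dict[str, DistanceMap]:
    """Index the recorded distance maps by the time code stored in each one."""
    return {m.timecode: m for m in maps}


def distance_map_for_frame(index: Dict[str, DistanceMap],
                           frame_timecode: str) -> Optional[DistanceMap]:
    """Return the distance map for a frame, or None if the frame was
    recorded without depth distribution information."""
    return index.get(frame_timecode)


# Frames 1 and 3 have distance maps; frame 2 does not.
index = build_index([
    DistanceMap("00:00:00:01", [[1.5, 2.0]]),
    DistanceMap("00:00:00:03", [[1.4, 2.1]]),
])
still_depth = distance_map_for_frame(index, "00:00:00:03")
missing = distance_map_for_frame(index, "00:00:00:02")
```

When a still image is extracted during editing, a lookup of this kind tells the editor whether depth-dependent processing is available for the chosen frame.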

Alternatively, as shown in FIG. 28(D), each image and its corresponding distance map may be recorded contiguously, with the distance map placed before or after the image, so that the whole forms one moving image file. In this format, the distance map, which is the depth distribution information corresponding to an image on which recording processing has been performed, is recorded before or after the image data as metadata of that image data. Advantages of this format include that everything can be handled as one file and that access is easy because the distance map corresponding to an image is adjacent to it.
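The record ordering of FIG. 28(D) can be sketched as follows. This is an illustrative model only: the `Record` tuple and the `interleave` function are hypothetical, and actual container details such as headers and encoding are outside what the text specifies.

```python
from typing import List, Tuple

# (kind, frame number); kind is "image" or "distance_map"
Record = Tuple[str, int]


def interleave(frames: List[Tuple[int, bool]], map_first: bool = False) -> List[Record]:
    """Lay out one moving image file with each distance map adjacent to its image.

    frames: (frame_number, has_distance_map) in capture order.
    map_first: place the distance map before its image instead of after it.
    """
    records: List[Record] = []
    for number, has_map in frames:
        image: Record = ("image", number)
        if not has_map:
            records.append(image)
            continue
        depth: Record = ("distance_map", number)
        records.extend([depth, image] if map_first else [image, depth])
    return records


# Images 1, 2, 3 with distance maps for images 1 and 3, as in FIG. 28.
layout = interleave([(1, True), (2, False), (3, True)])
```

Here `layout` is image 1, distance map 1, image 2, image 3, distance map 3, so a reader that locates an image finds its distance map in the neighboring record.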

[Other Embodiments]
The present invention can also be realized by processing in which a program that realizes one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that realizes one or more functions.

As described above, according to the present invention, highly convenient image recording can be performed while keeping the required recording capacity low.

DESCRIPTION OF SYMBOLS
101 ... Imaging optical system
102 ... Imaging unit
104 ... Recording mode control unit
105 ... Image processing unit
106 ... System control unit
109 ... Recording unit
2210 ... Recording information adjustment unit
2606 ... Exposure condition control unit

Claims

An image processing apparatus having recording means that acquires a plurality of image data and records the acquired image data on a recording medium, the image processing apparatus comprising:
acquisition means for acquiring depth distribution information of a subject corresponding to the image data; and
control means for performing recording processing of the plurality of image data by switching between a first mode, in which the recording means records the image data and the depth distribution information corresponding to the image data on the recording medium, and a second mode, in which the recording means records the image data on the recording medium without recording the depth distribution information.

The image processing apparatus according to claim 1, wherein the control means performs the recording processing by switching between the first mode and the second mode during recording of the plurality of image data, and causes the recording means to record the plurality of image data on the recording medium as a moving image.

The image processing apparatus according to claim 2, wherein the recording means records, in one moving image file, the plurality of image data and the depth distribution information corresponding to the image data subjected to recording processing in the first mode.

The image processing apparatus according to claim 3, wherein the recording means records the depth distribution information corresponding to the image data recorded in the first mode in the moving image file, before or after the image data, as metadata of the image data.

The image processing apparatus according to claim 1, wherein the control means acquires the depth distribution information including depth information of a main subject and of a background among a plurality of subjects, and performs recording processing in the first mode when the difference between the depth information of the main subject and the depth information of the background included in the depth distribution information is within a threshold value.

The image processing apparatus according to claim 5, further comprising threshold value calculation means for calculating the threshold value from a focal length of an imaging optical system, the depth information of the main subject included in the depth distribution information, and an allowable defocus amount.

The image processing apparatus according to claim 5 or 6, wherein the control means determines whether or not to perform recording processing in the first mode based on a result of comparing the difference with the threshold value and on an F-number of the imaging optical system.

The image processing apparatus according to claim 1, wherein the control means determines whether or not to perform recording processing in the first mode by calculating an evaluation value indicating a dynamic range of a frame and comparing the evaluation value with a threshold value.

The image processing apparatus according to claim 8, wherein the control means calculates, as the evaluation value, an exposure step between a main subject region related to a main subject among a plurality of subjects and a background region, performs recording processing in the first mode when the exposure step is greater than or equal to the threshold value, and performs recording processing in the second mode when the exposure step is smaller than the threshold value.

The image processing apparatus according to claim 1, further comprising:
extraction means for extracting information on a main subject region related to a main subject among a plurality of subjects from the image data and the depth distribution information recorded in the first mode; and
image processing means for acquiring the information on the main subject region extracted by the extraction means and performing image processing on the image data.

The image processing apparatus according to claim 10, wherein the image processing means determines a background region in the image from the information on the main subject region and performs blurring processing on the image of the background region.

The image processing apparatus according to claim 1, further comprising:
extraction means for extracting information on a main subject region related to a main subject among a plurality of subjects from the image data and the depth distribution information recorded in the first mode; and
image processing means for acquiring the information on the main subject region extracted by the extraction means and performing image processing on the image data,
wherein the image processing means determines a main subject region and a background region in an image from the information on the main subject region, and performs different gradation correction processing on the main subject region and on the background region.

The image processing apparatus according to claim 1, further comprising adjustment means that acquires a recording capacity of the moving image and, when the recording capacity is equal to or greater than a threshold value, adjusts the recording amount by narrowing down the frames to be recorded in the first mode,
wherein the adjustment means performs control to record the image data and the depth distribution information of the narrowed-down frames.

The image processing apparatus according to claim 1, wherein the control means compares a score calculated from at least one of position information and size information of a main subject among a plurality of subjects with a threshold value, performs recording processing in the first mode for a frame in which the score is larger than the threshold value, and performs recording processing in the second mode when the score is equal to or less than the threshold value.

The image processing apparatus according to claim 1, wherein the control means detects a scene change from a difference between images, and performs recording processing in the first mode when the scene change is detected.

The image processing apparatus according to claim 1, wherein the control means performs recording processing in the first mode once every predetermined number of frames in a plurality of time-series images acquired by the acquisition means.

The image processing apparatus according to any one of claims 1 to 16, wherein, when the moving image recorded in the first mode is reproduced, the control means gives notification that the image was recorded in the first mode.

The image processing apparatus according to claim 1, wherein the depth distribution information is one of: an image shift map based on a parallax amount of a plurality of viewpoint images; a defocus map based on a defocus amount for each region; a distance map indicating a relative distance relationship between subjects in the image data; and distance information, acquired by a TOF method, indicating the distance from the imaging apparatus to each subject.

The image processing apparatus according to any one of claims 1 to 18, wherein the acquisition means calculates and acquires an image shift map as the depth distribution information corresponding to the image data, based on a parallax amount of a pair of parallax images corresponding to the acquired image data.

The image processing apparatus according to any one of claims 1 to 18, wherein the acquisition means calculates and acquires a defocus map as the depth distribution information corresponding to the image data, based on a defocus amount for each region of the acquired image data.

The image processing apparatus according to claim 1, wherein the acquisition means calculates and acquires, as the depth distribution information corresponding to the image data, a relative distance relationship between subjects, based on a defocus amount for each region of the acquired image data and on an imaging optical system or an imaging element.

The image processing apparatus according to claim 1, wherein the acquisition means acquires, as the depth distribution information corresponding to the image data, a subject distance from the imaging apparatus to each subject by using a TOF method, which measures the distance to a subject by measuring the delay time from projection of light onto the subject until reception of the reflected light.

The image processing apparatus according to claim 1, wherein the control means performs control to record RAW image data before image processing in the first mode.

The image processing apparatus according to claim 23, wherein the RAW image is an image that has not been subjected to image processing including demosaicing processing, white balance adjustment, color conversion processing, or gamma correction.

An imaging apparatus comprising:
the image processing apparatus according to any one of claims 1 to 24; and
imaging means for imaging a subject.

The imaging apparatus according to claim 25, further comprising exposure condition control means for controlling an exposure condition when the imaging means images the subject and image data is generated,
wherein the exposure condition control means acquires the amount of movement of the subject in the image of a frame in the first mode, and determines, from the amount of movement and the frame interval, one or more of the shutter speed and the frame rate for imaging the frame following that frame.

A method for controlling an image processing apparatus having recording means that acquires a plurality of image data and records the acquired image data on a recording medium, the method comprising:
an acquisition step of acquiring depth distribution information of a subject corresponding to the image data; and
a control step of performing recording processing of the plurality of image data by switching between a first mode, in which the recording means records the image data and the depth distribution information corresponding to the image data on the recording medium, and a second mode, in which the recording means records the image data on the recording medium without recording the depth distribution information.

A program causing a computer of an image processing apparatus to execute each step of the control method according to claim 27.