JP2020077964A

JP2020077964A - Imaging apparatus and control method thereof

Info

Publication number: JP2020077964A
Application number: JP2018209754A
Authority: JP
Inventors: 学梅山; Manabu Umeyama
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-11-07
Filing date: 2018-11-07
Publication date: 2020-05-21
Anticipated expiration: 2038-11-07
Also published as: JP7246894B2

Abstract

PROBLEM TO BE SOLVED: To provide a technology that can acquire dynamic metadata (a feature amount of each scene) without going through post-production. SOLUTION: An imaging apparatus of the present invention includes image capturing means that captures a moving image, determination means that determines a plurality of scenes of the moving image on the basis of a temporal change of a parameter corresponding to a frame of the moving image, acquisition means that acquires a feature amount of each of the plurality of scenes, and generating means that generates information in which the feature amount acquired by the acquisition means is associated with each of the plurality of scenes. SELECTED DRAWING: Figure 1

Description

The present invention relates to an imaging apparatus and a control method thereof.

In recent years, techniques that use metadata including the maximum luminance of each frame or each scene of a moving image (dynamic metadata) to change the display method of the moving image sequentially (for each frame or each scene) have begun to spread. Specifically, techniques that sequentially change the tone map using the dynamic metadata, and thereby sequentially change the HDR (high dynamic range) display in accordance with the upper-limit display luminance, have begun to spread. Patent Document 1 discloses a method of generating metadata for each scene of a moving image (a part of the dynamic metadata) and adding it to the moving image data in the image data editing process of post-production.

International Publication No. 2015/017314

However, with the technique disclosed in Patent Document 1, dynamic metadata cannot be obtained when there is no post-production (image data editing process), and it is difficult to sequentially change the display method of the moving image (such as HDR display using tone mapping).

An object of the present invention is to provide a technique capable of acquiring dynamic metadata or the like (a feature amount of each scene) without going through post-production.

A first aspect of the present invention is an imaging apparatus comprising: imaging means for capturing a moving image; determination means for determining a plurality of scenes of the moving image based on a temporal change of a parameter corresponding to a frame of the moving image; acquisition means for acquiring a feature amount of each of the plurality of scenes; and generation means for generating information in which the feature amount acquired by the acquisition means is associated with each of the plurality of scenes.

A second aspect of the present invention is a control method of an imaging apparatus, comprising: an imaging step of capturing a moving image; a determination step of determining a plurality of scenes of the moving image based on a temporal change of a parameter corresponding to a frame of the moving image; an acquisition step of acquiring a feature amount of each of the plurality of scenes; and a generation step of generating information in which the feature amount acquired in the acquisition step is associated with each of the plurality of scenes.

A third aspect of the present invention is a program for causing a computer to function as each means of the imaging apparatus described above.

According to the present invention, dynamic metadata or the like (a feature amount of each scene) can be generated without going through post-production.

FIG. 1 is a block diagram showing a configuration example of the imaging apparatus according to Example 1.
FIG. 2 is a diagram showing an example of the temporal change of the frame maximum luminance value according to Example 1.
FIG. 3 is a diagram showing an example of the temporal change of various parameters during shooting according to Example 1.
FIG. 4 is a flowchart showing an example of the shooting process according to Example 1.
FIG. 5 is a diagram showing an example of the temporal change of the frame maximum luminance value and the aperture value according to Example 2.
FIG. 6 is a diagram showing an example of the temporal change of the frame maximum luminance value and the aperture value according to Example 2.
FIG. 7 is a block diagram showing a configuration example of the imaging apparatus according to Example 3.
FIG. 8 is a diagram showing an example of a frame image according to Example 3.

<Example 1>
Example 1 of the present invention is described below. FIG. 1 is a block diagram showing a configuration example of the imaging apparatus 100 according to this example. The imaging apparatus 100 includes an imaging optical system 101, an image sensor 102, an imaging control unit 103, a feature amount acquisition unit 104, a scene determination unit 105, a metadata generation unit 106, a metadata addition unit 107, an output unit 108, a storage unit 109, an output IF 110, a CPU 111, a RAM 112, a ROM 113, and an operation unit 114.

The imaging optical system 101 forms an optical image representing a subject on the image sensor 102. The imaging optical system 101 includes, for example, a lens group such as a zoom lens and a focus lens, an aperture adjustment device, and a shutter device.

The image sensor 102 captures a subject image (a moving image representing the subject). Specifically, the image sensor 102 performs photoelectric conversion processing that converts the formed optical image (light incident from the subject through the imaging optical system 101) into an analog electric signal. The image sensor 102 then performs AD conversion processing (analog-to-digital conversion processing) that converts the analog electric signal obtained by the photoelectric conversion processing into a digital electric signal (image data of one frame of the moving image; frame image data). After that, the image sensor 102 outputs the frame image data obtained by the AD conversion processing to the feature amount acquisition unit 104 and the scene determination unit 105. By repeating these processes, the image sensor 102 sequentially outputs a plurality of pieces of frame image data respectively corresponding to a plurality of frames of the moving image.

The imaging control unit 103 controls the imaging conditions of the imaging apparatus 100. In this example, the imaging control unit 103 controls the exposure (exposure conditions) of the imaging apparatus 100 in accordance with user operations on the imaging apparatus 100, the state of the imaging apparatus 100, and the like. For example, the imaging control unit 103 controls the exposure by controlling the aperture, the charge accumulation time of the image sensor, and the like. Specifically, the imaging control unit 103 controls the exposure by controlling the state of the imaging optical system 101, the processing of the image sensor 102, and the like in accordance with the gain value, shutter speed, aperture value, and the like specified by the user. The imaging control unit 103 also controls the focus of the imaging apparatus 100 in accordance with user operations on the imaging apparatus 100, the state of the imaging apparatus 100, and the like. For example, the imaging control unit 103 controls the focus by controlling the drive amount, drive direction, and the like of the focus lens. Specifically, when the AF (autofocus) shooting mode is set, the imaging control unit 103 sets the focus lens to a predetermined position and calculates the shape (distribution) of the contrast of the frame image (the image of one frame of the moving image). The imaging control unit 103 then performs AF control using, among a plurality of positions in the frame image, the position with the highest contrast as the position at which the light flux is brought into focus on the image sensor 102 (the focus position).
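
The selection of the highest-contrast position described above can be sketched roughly as follows. This is an illustrative sketch only, not the actual processing of the imaging control unit 103: the frame image is assumed to be a 2D list of luminance values, the contrast of a position is assumed to be the difference between the maximum and minimum luminance within a square block, and the names `block_contrast` and `highest_contrast_position` and the block size are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[float]]  # rows of luminance values (assumed representation)


def block_contrast(frame: Frame, top: int, left: int, size: int) -> float:
    """Contrast of one block, taken here as max - min luminance (an assumption)."""
    values = [
        frame[r][c]
        for r in range(top, min(top + size, len(frame)))
        for c in range(left, min(left + size, len(frame[0])))
    ]
    return max(values) - min(values)


def highest_contrast_position(frame: Frame, size: int = 2) -> Tuple[int, int]:
    """Return the (row, col) origin of the block with the highest contrast."""
    best_pos, best_contrast = (0, 0), -1.0
    for top in range(0, len(frame), size):
        for left in range(0, len(frame[0]), size):
            contrast = block_contrast(frame, top, left, size)
            if contrast > best_contrast:
                best_pos, best_contrast = (top, left), contrast
    return best_pos
```

In this sketch the first block found wins when several blocks share the same contrast; the source does not specify a tie-breaking rule.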

The feature amount acquisition unit 104 acquires a feature amount (frame feature amount) of the frame image data output from the image sensor 102. Specifically, the feature amount acquisition unit 104 acquires, as frame feature amounts, a feature amount for determining the plurality of scenes of the moving image and a feature amount for acquiring the feature amount of each scene (scene feature amount). The frame feature amount can also be called a "parameter corresponding to a frame". In this example, the feature amount acquisition unit 104 acquires the maximum luminance value of the frame image data (frame maximum luminance value) from that frame image data and outputs the frame maximum luminance value to the scene determination unit 105. The frame maximum luminance value is used both as the feature amount for determining the plurality of scenes and as the feature amount for acquiring the scene feature amount.
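
As a minimal sketch of acquiring the frame maximum luminance value, assume that one frame of image data is already available as a 2D list of per-pixel luminance values. The function name `frame_max_luminance` and this data layout are assumptions for illustration; the source does not specify how luminance is derived from the sensor output.

```python
from typing import List

Frame = List[List[float]]  # rows of per-pixel luminance values (assumed layout)


def frame_max_luminance(frame: Frame) -> float:
    """Return the largest luminance value found in one frame."""
    return max(max(row) for row in frame)
```

Calling this once per frame yields the per-frame parameter whose temporal change is used for scene determination.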

The scene determination unit 105 performs a scene determination process that determines a plurality of scenes of the moving image. The scene determination process can also be called a "scene division process that divides the entire period of the moving image into a plurality of periods". The frame feature amount often changes (greatly) when the scene switches. In this example, therefore, the scene determination unit 105 determines the plurality of scenes based on the temporal change of the frame feature amount (frame maximum luminance value) output from the feature amount acquisition unit 104. The temporal change of the frame feature amount can also be called a "change in the frame feature amount between temporally consecutive frames". The scene determination unit 105 outputs the result of the scene determination process to the metadata generation unit 106. Furthermore, the scene determination unit 105 outputs the frame feature amount (frame maximum luminance value) output from the feature amount acquisition unit 104 to the metadata generation unit 106, and outputs the frame image data output from the image sensor 102 to the metadata addition unit 107. Note that the scene determination unit 105 may apply various kinds of image processing to the frame image data output from the image sensor 102 and output the processed frame image data to the metadata addition unit 107. As the image processing, for example, correction processing that reduces distortion and noise caused by the imaging optical system 101 or the image sensor 102 may be performed, or white balance adjustment, color conversion processing, gamma correction, and the like may be performed.

The metadata generation unit 106 generates information to be added to the moving image data (dynamic metadata) and outputs the dynamic metadata to the metadata addition unit 107. Specifically, the metadata generation unit 106 acquires (determines) the scene feature amount of each scene of the moving image based on the information output from the scene determination unit 105 (the result of the scene determination process and the frame feature amount (frame maximum luminance value) of each frame of the moving image). The metadata generation unit 106 then generates, as the dynamic metadata, information in which the acquired scene feature amount is associated with each scene.

The metadata addition unit 107 generates moving image data consisting of the plurality of pieces of frame image data sequentially output from the scene determination unit 105 (image sensor 102), and adds the dynamic metadata output from the metadata generation unit 106 to the moving image data. The metadata addition unit 107 then outputs the moving image data to which the dynamic metadata has been added to the output unit 108. For example, the moving image data is data in a file format such as MPEG-4 AVC or HEVC (High Efficiency Video Coding), and the metadata addition unit 107 performs encoding processing to obtain moving image data in that file format. The metadata addition unit 107 then adds the dynamic metadata as SEI (Supplemental Enhancement Information) defined in MPEG-4 AVC or HEVC.

The output unit 108 outputs the moving image data output from the metadata addition unit 107 (the moving image data to which the dynamic metadata has been added) to the storage unit 109. Alternatively, the metadata addition unit 107 may leave the dynamic metadata unattached, and the output unit 108 may output the moving image data and the dynamic metadata separately. In that case, the metadata addition unit 107 and the output unit 108 may or may not output the moving image data and the dynamic metadata in association with each other.

The storage unit 109 is a random-access recording medium such as a CF (CompactFlash) card, and stores the moving image data output from the output unit 108 (the moving image data to which the dynamic metadata has been added). The storage unit 109 is removable from the imaging apparatus 100 and can be attached to a device other than the imaging apparatus 100 (such as a personal computer). Note that the storage unit 109 may instead be a built-in recording medium that cannot be removed from the imaging apparatus 100.

The output IF 110 outputs the moving image data stored in the storage unit 109 (the moving image data to which the dynamic metadata has been added) to an external device (not shown). For example, the moving image data is output in a stream format in accordance with a communication protocol conforming to the HDMI (registered trademark) (High-Definition Multimedia Interface) standard. The method of transmitting the moving image data and the dynamic metadata is not particularly limited. For example, parameters defined in SMPTE (Society of Motion Picture & Television Engineers) ST 2094 may be transmitted as the dynamic metadata. Specifically, Scene-MaxCLL (Maximum Content Light Level) defined in HDR10+ may be transmitted as the dynamic metadata.

The CPU 111 is connected to the other blocks of the imaging apparatus 100 via an internal bus (not shown). The CPU 111 controls the processing of the imaging apparatus 100. The RAM 112 is connected to the other blocks of the imaging apparatus 100 via the internal bus (not shown). The RAM 112 is used as a work area for the CPU 111 and as a temporary storage area that temporarily stores various data. The ROM 113 is connected to the other blocks of the imaging apparatus 100 via the internal bus (not shown). Firmware for the processing of the CPU 111, information related to the processing of the CPU 111, and the like are recorded in the ROM 113 in advance.

The operation unit 114 is connected to the CPU 111 via the internal bus (not shown). The operation unit 114 consists of various operation members that serve as an input unit for receiving user operations. The operation unit 114 includes a shooting start button for starting shooting, a switch for switching the focus operation between automatic control and manual control, a focus ring for performing focus adjustment operations, and the like. The operation unit 114 also has a touch panel and a liquid crystal panel (not shown), and makes the displayed function icons act as various function buttons. The function buttons include a shooting start button, a moving image shooting mode selection button, a white balance setting button, an ISO sensitivity setting button, and the like. The moving image shooting modes include a manual exposure shooting mode, an auto exposure shooting mode, an MF (manual focus) shooting mode, an AF (autofocus) shooting mode, a time-lapse shooting mode, a custom mode, and the like.

Next, an example of the shooting process of the imaging apparatus 100 is described. FIG. 2 shows an example of the temporal change of the frame maximum luminance value acquired by the feature amount acquisition unit 104, for the case where the scene determination unit 105 does not perform the scene determination process. FIG. 2 shows an example in which moving image data A is obtained whose shooting start frame (the frame obtained at the start of shooting; the first frame of the moving image) has the number "0" and whose shooting end frame (the frame obtained at the end of shooting; the last frame of the moving image) has the number "N". In this case, the entire period of the moving image represented by the moving image data A is treated as the period of a single scene A0, and both the maximum luminance value of the moving image data A and the maximum luminance value of the scene A0 are the luminance value AL_MAX. In the following, the case where the moving image data A of FIG. 2 is obtained is used as the example for the case where the scene determination unit 105 performs the scene determination process.

FIGS. 3(a) to 3(d) show an example of the temporal change of various parameters (frame maximum luminance value, determined scene periods, scene feature amounts, and the like) during the shooting that produces the moving image data A. In this example, the maximum of the frame maximum luminance values acquired for one scene, that is, the maximum luminance value of the moving image data of that scene (scene maximum luminance value), is acquired as the scene feature amount. In FIGS. 3(a) to 3(d), "Fr_NOW" is the frame number of the frame image currently being shot. "AL_MAX_NOW" is the frame maximum luminance value of the frame Fr_NOW (the frame with frame number Fr_NOW), that is, the maximum luminance value of the frame image currently being shot. In "AnL_MAX", "An" is the scene number, and "AnL_MAX" is the scene maximum luminance value of the scene An (the scene with scene number An). The scene A1 is the scene that starts at frame 0 (the shooting start frame). Frame numbers are counted by the scene determination unit 105. For example, the scene determination unit 105 includes a counter that is incremented each time frame image data is acquired, and uses the value of that counter as the frame number. Scene numbers are counted by the metadata generation unit 106. Details are described later.

FIG. 4 is a flowchart showing an example of the shooting process of the imaging apparatus 100. The shooting process of FIG. 4 includes the scene determination process and a dynamic metadata addition process (a process of adding dynamic metadata to the moving image data). The shooting process of FIG. 4 starts when the CPU 111 detects that the operation unit 114 has received a shooting start operation by the user, and is realized by the CPU 111 controlling each block of the imaging apparatus 100. The shooting start operation is, for example, a user operation of pressing the shooting start button of the operation unit 114 while the imaging apparatus 100 is in a non-shooting state.

In step S401, the feature amount acquisition unit 104 starts acquiring the frame maximum luminance value AL_MAX_NOW of the frame image data output from the image sensor 102. The feature amount acquisition unit 104 outputs the frame maximum luminance value AL_MAX_NOW to the scene determination unit 105. In addition, when the frame maximum luminance value AL_MAX_NOW is higher than the scene maximum luminance value AnL_MAX of the scene An that includes the frame Fr_NOW, the feature amount acquisition unit 104 records the frame maximum luminance value AL_MAX_NOW in the RAM 112 as the scene maximum luminance value AnL_MAX. This processing can also be described as "processing that updates the scene maximum luminance value AnL_MAX recorded in the RAM 112 with the frame maximum luminance value AL_MAX_NOW". FIG. 3(a) shows a state in which a frame maximum luminance value AL_MAX_NOW higher than the scene maximum luminance value A1L_MAX (the scene maximum luminance value of the scene A1) has been detected and the scene maximum luminance value A1L_MAX has been updated. When the frame Fr_NOW is frame 0, the feature amount acquisition unit 104 records the frame maximum luminance value AL_MAX_NOW in the RAM 112 as the scene maximum luminance value A1L_MAX.

In step S402, the scene determination unit 105 determines whether to switch scenes at the frame Fr_NOW. This determination can also be described as a "determination of whether to divide the period of the moving image at the frame Fr_NOW" or a "determination of whether to determine (confirm) a scene". When it is determined that the scene is not to be switched (step S402: No), the process proceeds to step S406. When it is determined that the scene is to be switched (step S402: Yes), the process proceeds to step S403.

In step S403, the scene determination unit 105 determines (confirms) the period up to the frame Fr_NOW-1, the frame immediately preceding the frame Fr_NOW, as the period of a scene.

In this example, the processing of steps S402 and S403 is performed so that the scene does not switch at a time position where the frame maximum luminance value does not change, and switches at a time position where the frame maximum luminance value changes. For example, the processing of steps S402 and S403 is performed so that the scene does not switch at a time position where the frame maximum luminance value changes by an amount smaller than a threshold, and switches at a time position where it changes by an amount larger than the threshold. Specifically, in step S402, the scene determination unit 105 compares the frame maximum luminance value AL_MAX_NOW of the frame Fr_NOW with the frame maximum luminance value AL_MAX_NOW-1 of the frame Fr_NOW-1. The processing of steps S402 and S403 is then performed so that the scene does not switch when the difference (absolute value) between AL_MAX_NOW and AL_MAX_NOW-1 is less than the threshold, and switches when the difference is equal to or greater than the threshold. The threshold is not particularly limited. It may be a predetermined fixed value or a value that the user can change.
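
The threshold comparison of step S402 can be sketched as follows. This is a hedged illustration under stated assumptions, not the firmware of the scene determination unit 105: the per-frame maximum luminance values are assumed to be available as a list, and the names `is_scene_change` and `scene_start_frames` are hypothetical.

```python
from typing import List, Sequence


def is_scene_change(prev_max: float, cur_max: float, threshold: float) -> bool:
    """True when |AL_MAX_NOW - AL_MAX_NOW-1| is equal to or greater than the threshold."""
    return abs(cur_max - prev_max) >= threshold


def scene_start_frames(frame_maxima: Sequence[float], threshold: float) -> List[int]:
    """Return the frame numbers at which a new scene starts (frame 0 included)."""
    if not frame_maxima:
        return []
    starts = [0]  # the first scene starts at the shooting start frame
    for n in range(1, len(frame_maxima)):
        if is_scene_change(frame_maxima[n - 1], frame_maxima[n], threshold):
            starts.append(n)
    return starts
```

For example, with frame maxima `[100, 102, 101, 300, 305, 120]` and a threshold of 50, the sketch reports scene starts at frames 0, 3, and 5, so the first scene spans frames 0 to 2.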

FIG. 3(b) shows a state in which a scene has been determined (confirmed) at the timing of frame Fr_NOW = M+1. In the example of FIG. 3(b), upon acquiring the frame image data of frame number M+1, the scene determination unit 105 determines (confirms) the period from frame 0 to frame M as the period of the scene A1. The scene determination unit 105 then outputs the scene start frame number 0 of the scene A1 (the number of the first frame of the scene) and the scene end frame number M of the scene A1 (the number of the last frame of the scene) to the metadata generation unit 106. Furthermore, the scene determination unit 105 outputs a scene determination signal, which indicates that a scene has been determined, to the feature amount acquisition unit 104.

Upon acquiring the scene determination signal, the feature amount acquisition unit 104 changes the area of the RAM 112 in which the scene maximum luminance value AnL_MAX is recorded. As a result, the scene maximum luminance value of each scene is recorded individually in the RAM 112. In the example of FIG. 3(b), with the scene maximum luminance value A1L_MAX recorded in a predetermined area of the RAM 112, the recording area for the scene maximum luminance value A2L_MAX is selected, and the frame maximum luminance value AL_MAX_NOW is recorded in the RAM 112 as the scene maximum luminance value A2L_MAX. After that, whenever the frame maximum luminance value AL_MAX_NOW is higher than the scene maximum luminance value A2L_MAX, the scene maximum luminance value A2L_MAX recorded in the RAM 112 is updated with the frame maximum luminance value AL_MAX_NOW.
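
The per-scene recording described above can be sketched as follows. This is an illustrative model only: the recording areas of the RAM 112 are modeled as entries of a Python list, the scene determination signal is modeled as a boolean argument, and the class name `SceneMaxStore` and its method are hypothetical.

```python
from typing import List


class SceneMaxStore:
    """Keeps one scene maximum luminance value per scene (one list entry per scene)."""

    def __init__(self) -> None:
        self.areas: List[float] = []  # areas[0] holds A1L_MAX, areas[1] holds A2L_MAX, ...

    def on_frame(self, frame_max: float, scene_determined: bool = False) -> None:
        """Record the frame maximum luminance value of the current frame."""
        if scene_determined or not self.areas:
            # First frame, or a scene was just determined: switch to a new
            # recording area and initialize it with the current frame's value.
            self.areas.append(frame_max)
        elif frame_max > self.areas[-1]:
            # Same scene: update the recorded maximum only when it is exceeded.
            self.areas[-1] = frame_max
```

Because a finished scene's entry is never touched again, each scene maximum stays available for later readout.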

In step S404, the metadata generation unit 106 generates the metadata of one scene (a part of the dynamic metadata) from the scene start frame number and scene end frame number acquired from the scene determination unit 105 and the scene maximum luminance value acquired from the RAM 112. The metadata of one scene includes the scene start frame number, the scene end frame number, the scene number, the scene maximum luminance value, and the like. The scene determination unit 105 includes, for example, a counter that is incremented each time the scene start frame number and the scene end frame number are acquired, and uses the value of the counter as the scene number. The counter may instead be incremented with an update of the scene end frame number as the trigger.

図３（ｂ）の例では、メタデータ生成部１０６は、シーン開始フレーム番号０とシーン終了フレーム番号Ｍの取得に応じて、シーン番号Ａ１を決定し、シーン最大輝度値Ａ１Ｌ_ＭＡＸを取得する。そして、メタデータ生成部１０６は、シーン開始フレーム番号０、シーン終了フレーム番号Ｍ、シーン番号Ａ１、及び、シーン最大輝度値Ａ１Ｌ_ＭＡＸを互いに関連付けたデータを、シーンＡ１のメタデータとして生成する。その後、メタデータ生成部１０６は、シーンＡ１のメタデータをメタデータ付加部１０７へ出力する。メタデータ生成部１０６は、シーン開始フレーム番号とシーン終了フレーム番号を取得するたびに、シーン最大輝度値を読み出すＲＡＭ１１２の領域を切り替える。これにより、メタデータ生成部１０６は、シーン開始フレーム番号とシーン終了フレーム番号から決まるシーンに対応するシーン最大輝度値を、ＲＡＭ１１２から読み出すことができる。 In the example of FIG. 3B, the metadata generation unit 106 determines the scene number A1 according to the acquisition of the scene start frame number 0 and the scene end frame number M, and acquires the scene maximum luminance value A1L _MAX . Then, the metadata generation unit 106 generates data in which the scene start frame number 0, the scene end frame number M, the scene number A1, and the scene maximum brightness value A1L _MAX are associated with each other as the metadata of the scene A1. After that, the metadata generation unit 106 outputs the metadata of the scene A1 to the metadata addition unit 107. The metadata generation unit 106 switches the area of the RAM 112 from which the maximum brightness value of the scene is read every time the scene start frame number and the scene end frame number are acquired. As a result, the metadata generation unit 106 can read the maximum scene brightness value corresponding to the scene determined from the scene start frame number and the scene end frame number from the RAM 112.
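As a rough sketch of how one scene's metadata could be assembled, the example below associates the four items named above in a single record. The names `SceneMetadata` and `MetadataGenerator`, the integer scene numbers, and the list used in place of the RAM 112 are assumptions for illustration; the text does not specify a data layout.

```python
from dataclasses import dataclass


@dataclass
class SceneMetadata:
    scene_number: int
    start_frame: int
    end_frame: int
    max_luminance: float


class MetadataGenerator:
    """Hypothetical stand-in for the metadata generation unit 106."""

    def __init__(self):
        self._counter = 0  # incremented each time a start/end pair is acquired

    def generate(self, start_frame, end_frame, scene_max_values):
        self._counter += 1
        # Switch the read position so the value matches the scene just closed.
        max_luminance = scene_max_values[self._counter - 1]
        return SceneMetadata(self._counter, start_frame, end_frame, max_luminance)


stored_max = [300, 900]            # per-scene maxima as left in memory
generator = MetadataGenerator()
scene_a1 = generator.generate(0, 9, stored_max)
scene_a2 = generator.generate(10, 19, stored_max)
```

Using the counter both as the scene number and as the read index is one simple way to keep the start and end frame numbers paired with the correct maximum luminance value.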

In step S405, the metadata adding unit 107 adds the metadata acquired from the metadata generation unit 106 (the metadata of one scene) to the moving image data composed of the plurality of frame image data sequentially output from the scene determination unit 105 (the moving image data of one scene). In the example of FIG. 3B, the metadata of scene A1 is added to the moving image data of scene A1. The moving image data with the metadata added is recorded in the storage unit 109. Here, the metadata is assumed to be added as SEI of HEVC. The HEVC file generated by the HEVC encoding process is then recorded in the storage unit 109 as the moving image data with the metadata added. As described in detail later, the processing of steps S402 to S406 may be repeated. In the second and subsequent executions of step S405, the current moving image (the moving image of one scene) is appended as a continuation to the already recorded moving image (the moving image recorded in past executions of step S405), thereby updating the recorded moving image.

In step S406, the CPU 111 determines whether the operation unit 114 has received a shooting end operation from the user. The shooting end operation is, for example, a user operation of pressing the shooting end button of the operation unit 114 while the imaging apparatus 100 is shooting. If it is determined that no shooting end operation has been performed (step S406: No), the processing returns to step S402, and steps S402 to S406 are repeated. FIG. 3C shows the state after scene A1 has been determined at frame M+1. In FIG. 3C, "scene A2" is the scene following scene A1. The feature amount acquisition unit 104 records the scene maximum luminance value A2L_MAX of scene A2 in the RAM 112 in the same manner as when shooting scene A1 (as described above). If it is determined that a shooting end operation has been performed (step S406: Yes), the processing proceeds to step S407. When a shooting end operation is performed, the scene determination unit 105 outputs the scene start frame number of the final scene (the last scene of the moving image) and the scene end frame number of the final scene (the number of the shooting end frame, that is, the last frame of the moving image) to the metadata generation unit 106.

In step S407, the metadata generation unit 106 generates the metadata of the final scene (a part of the dynamic metadata) from the scene start frame number and scene end frame number acquired from the scene determination unit 105 and the scene maximum luminance value acquired from the RAM 112.

図３（ｄ）は、撮影終了フレームＮの画像の撮影が終了し、動画データＡの取得が終了した状態を示す。図３（ｄ）の例では、メタデータ生成部１０６は、シーン開始フレーム番号Ｍ＋１とシーン終了フレーム番号Ｎの取得に応じて、シーン番号Ａ２を決定し、シーン最大輝度値Ａ２Ｌ_ＭＡＸを取得する。そして、メタデータ生成部１０６は、シーン開始フレーム番号Ｍ＋１、シーン終了フレーム番号Ｎ、シーン番号Ａ２、及び、シーン最大輝度値Ａ２Ｌ_ＭＡＸを互いに関連付けたデータを、シーンＡ２のメタデータとして生成する。その後、メタデータ生成部１０６は、シーンＡ２のメタデータをメタデータ付加部１０７へ出力する。 FIG. 3D shows a state in which the shooting of the image of the shooting end frame N is finished and the acquisition of the moving image data A is finished. In the example of FIG. 3D, the metadata generation unit 106 determines the scene number A2 according to the acquisition of the scene start frame number M + 1 and the scene end frame number N, and acquires the scene maximum luminance value A2L _MAX . Then, the metadata generation unit 106 generates, as the metadata of the scene A2, data in which the scene start frame number M + 1, the scene end frame number N, the scene number A2, and the scene maximum brightness value A2L _MAX are associated with each other. After that, the metadata generation unit 106 outputs the metadata of the scene A2 to the metadata addition unit 107.

In step S408, the metadata adding unit 107 adds the metadata acquired from the metadata generation unit 106 (the metadata of the final scene) to the moving image data composed of the plurality of frame image data sequentially output from the scene determination unit 105 (the moving image data of the final scene). In the example of FIG. 3D, the metadata of scene A2 is added to the moving image data of scene A2. The moving image data with the metadata added is recorded in the storage unit 109. If step S405 has been executed, the current moving image (the moving image of the final scene) is appended as a continuation to the already recorded moving image (the moving image recorded in step S405), thereby updating the recorded moving image.

以上述べたように、本実施例によれば、撮像装置において、フレーム最大輝度値の時間変化に基づいて複数のシーンが決定され、複数のシーンのそれぞれのシーン最大輝度値が取得される。そして、シーン最大輝度値を複数のシーンのそれぞれに対応付けた情報（動的メタデータ）が生成される。つまり、ポストプロダクション（画像データ編集工程）を介さずに動的メタデータなどを取得することができる。その結果、動的メタデータに基づいて動画の表示方法を順次変更することができる。例えば、トーンマッピングを用いたＨＤＲ（ハイダイナミックレンジ）表示において、動的メタデータに基づいてトーンマップを順次変更することができる。 As described above, according to the present embodiment, the imaging device determines a plurality of scenes based on the temporal change of the frame maximum luminance value, and acquires the scene maximum luminance value of each of the plurality of scenes. Then, information (dynamic metadata) in which the maximum scene brightness value is associated with each of a plurality of scenes is generated. That is, the dynamic metadata and the like can be acquired without going through the post production (image data editing process). As a result, the moving image display method can be sequentially changed based on the dynamic metadata. For example, in HDR (high dynamic range) display using tone mapping, the tone map can be sequentially changed based on the dynamic metadata.

Although an example of repeatedly recording the moving image data of each scene in the storage unit 109 has been described with reference to FIG. 4, the present invention is not limited to this. For example, upon completion of shooting, the metadata adding unit 107 may add the dynamic metadata to the moving image data representing the entire moving image so that metadata is attached to the portion corresponding to each scene, and may record the resulting moving image data in the storage unit 109. In this case, the metadata adding unit 107 temporarily records the acquired moving image data (the plurality of frame image data) and the metadata in the RAM 112.

Although an example of identifying a scene by frame numbers has been described with reference to FIG. 4, the present invention is not limited to this. For example, a scene may be identified by elapsed shooting time or by the time of day of shooting. Specifically, a scene shooting start time may be used in place of the scene start frame number, and a scene shooting end time may be used in place of the scene end frame number. The scene shooting start time is the time from a predetermined timing (such as the timing at which shooting of the moving image started) until shooting of the scene starts, and the scene shooting end time is the time from the predetermined timing until shooting of the scene ends.

Although an example of detecting a scene change from the change in the frame maximum luminance value between consecutive frames has been described with reference to FIG. 4, the method of determining scenes is not limited to this, and the parameter for determining scenes is not limited to the frame maximum luminance value. The parameter for determining scenes may differ from the frame feature amount used to acquire the scene feature amount. For example, the parameter for determining scenes may be the average of the frame maximum luminance value of the corresponding frame (the frame to which the parameter corresponds) and the frame maximum luminance values of one or more frames temporally contiguous with the corresponding frame. Specifically, the scene determination unit 105 may record in the RAM 112 the frame maximum luminance value of frame Fr_NOW and the frame maximum luminance values of one or more frames immediately preceding frame Fr_NOW. The scene determination unit 105 may then determine the plurality of scenes so that the scene switches when the average of those frame maximum luminance values changes by an amount larger than a threshold. The frame maximum luminance values of frames immediately following frame Fr_NOW may also be used. Either the frame maximum luminance values of the frames preceding frame Fr_NOW or those of the frames following frame Fr_NOW may be used, or both may be used. There may be a plurality of parameters (a plurality of types of parameters) for determining scenes.
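The averaging variant can be sketched as below. This is one possible reading under stated assumptions: the function name, the window length, the threshold, and the choice to restart the window after each switch (so that a single step in luminance is reported once) are not taken from the text. Only frames preceding Fr_NOW are averaged.

```python
from collections import deque


def scene_starts_by_average(frame_max_values, window=2, threshold=200.0):
    """Return the frame indexes at which a new scene starts.

    The decision parameter is the mean of the current frame maximum and up to
    (window - 1) preceding frame maxima, compared with its previous value.
    """
    starts = [0] if frame_max_values else []
    recent = deque(maxlen=window)
    previous_average = None
    for index, value in enumerate(frame_max_values):
        recent.append(value)
        average = sum(recent) / len(recent)
        if previous_average is not None and abs(average - previous_average) > threshold:
            starts.append(index)
            # Restart the window so the new scene is averaged on its own frames.
            recent.clear()
            recent.append(value)
            average = value
        previous_average = average
    return starts


step_change = scene_starts_by_average([100, 100, 100, 700, 700, 700])
single_flash = scene_starts_by_average([100, 100, 900, 100, 100], threshold=500.0)
```

With these inputs the sustained step is detected at frame 3, while the one-frame flash, whose frame-to-frame difference of 800 would exceed the threshold of 500, is smoothed out by the average and does not split the scene.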

The frame feature amount is not limited to the frame maximum luminance value, and the scene feature amount is not limited to the scene maximum luminance value. For example, the frame feature amount may be another representative value (average value, minimum value, mode, intermediate value, etc.) of the luminance values of the frame image data, or a histogram. The scene feature amount may be another representative value (average value, minimum value, mode, intermediate value, etc.) of the frame feature amounts acquired for the scene. Instead of deriving the scene feature amount from the frame feature amounts, a representative value (maximum value, average value, minimum value, mode, intermediate value, etc.) of the luminance values of the scene moving image data (the moving image data of the scene) may be acquired directly from the scene moving image data as the scene feature amount. The frame feature amount of one frame and the scene feature amount of one scene may each include a plurality of values (a plurality of types of values).
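To make the alternatives concrete, the sketch below computes several representative values and a histogram for one frame using only the Python standard library. The function name and the dictionary layout are assumptions, and the "intermediate value" is interpreted here as the median, which is also an assumption.

```python
import statistics
from collections import Counter


def frame_features(luminances):
    """Return several candidate frame feature amounts for a flat list of luminance values."""
    return {
        "max": max(luminances),
        "min": min(luminances),
        "mean": statistics.mean(luminances),
        "mode": statistics.mode(luminances),
        "median": statistics.median(luminances),  # assumed meaning of "intermediate value"
        "histogram": Counter(luminances),          # luminance value -> pixel count
    }


features = frame_features([10, 20, 20, 50])
```

Because a frame feature amount may hold a plurality of values, returning them together in one record keeps the choice of scene feature amount (for example, the maximum of the per-frame means) open to the later stage.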

<Example 2>
Example 2 of the present invention is described below. The points that differ from Example 1 (configuration, processing, etc.) are described in detail, and description of the points in common with Example 1 is omitted. Example 1 described a case in which the frame maximum luminance value is used as the parameter for determining scenes. Imaging parameters are often changed when the scene changes. This example therefore describes a case in which an imaging parameter used when capturing the frame images serves as the parameter for determining scenes. Specifically, a case is described in which the aperture value is used as the parameter for determining scenes in the manual exposure shooting mode.

The imaging apparatus according to this example has the same configuration as the imaging apparatus 100 of FIG. 1 (Example 1). However, in this example, the scene determination processing by the scene determination unit 105 differs from that of Example 1. Furthermore, in this example, the feature amount acquisition unit 104 does not output the frame maximum luminance value to the scene determination unit 105. Instead, the imaging control unit 103 outputs the aperture value to the scene determination unit 105. The imaging control unit 103 may also output the gain value, the shutter speed, and the like to the scene determination unit 105.

An example of the processing flow for the scene determination processing is described. First, the operation unit 114 receives a mode change operation (user operation) for changing to the manual exposure shooting mode. In response to the mode change operation, the CPU 111 controls each block of the imaging apparatus 100 and sets the manual exposure shooting mode. Next, the operation unit 114 receives a shooting start operation. In response to the shooting start operation, the CPU 111 controls each block of the imaging apparatus 100 and starts shooting. Next, the operation unit 114 receives an aperture change operation (user operation) for changing the aperture value. The imaging control unit 103 changes the state of the imaging optical system 101 according to the changed aperture value and outputs the changed aperture value to the scene determination unit 105. Next, the scene determination unit 105 decides, according to the change in the acquired aperture value, whether to determine (finalize) a scene. In other words, the scene determination unit 105 decides, according to the change in the acquired aperture value, whether to switch scenes at the current timing.

FIG. 5 shows an example of the temporal changes in the frame maximum luminance value and the aperture value when the scene determination unit 105 does not perform the scene determination processing. FIG. 5 shows a case in which moving image data B is obtained, with the shooting start frame numbered "0" and the shooting end frame numbered "N". In FIG. 5, the aperture value is changed from F2.2 to F2.0 when the image of frame M+1 is captured. However, because the scene determination processing is not performed, the entire period of the moving image represented by moving image data B is treated as the period of a single scene B0, and the maximum luminance value of moving image data B and the maximum luminance value of scene B0 are both the luminance value BL_MAX. The following describes, as an example in which the scene determination unit 105 performs the scene determination processing, a case in which the aperture value is changed in the same way as in FIG. 5 and the moving image data B of FIG. 5 is obtained.

図６は、フレーム最大輝度値と絞り値の時間変化の一例を示し、シーン決定部１０５のシーン決定処理を行う場合の例を示す。本実施例では、シーン決定部１０５は、絞り値が変化する時間位置でシーンが切り替わるように、複数のシーンを決定する。従って、図６に示すように、フレーム０からフレームＭまでの期間がシーンＢ１の期間として決定され、フレームＭ＋１からフレームＮまでの期間がシーンＢ２の期間として決定される。そして、実施例１と同様の処理により、シーンＢ１のシーン最大輝度値として輝度値Ｂ１Ｌ_ＭＡＸが取得され、シーンＢ２のシーン最大輝度値として輝度値Ｂ２Ｌ_ＭＡＸが取得され、シーン最大輝度値Ｂ１Ｌ_ＭＡＸ，Ｂ２Ｌ_ＭＡＸを含む動的メタデータが生成される。 FIG. 6 shows an example of a temporal change in the maximum frame brightness value and the aperture value, and shows an example in the case of performing the scene determination processing of the scene determination unit 105. In this embodiment, the scene determination unit 105 determines a plurality of scenes so that the scenes are switched at the time position where the aperture value changes. Therefore, as shown in FIG. 6, the period from frame 0 to frame M is determined as the period of scene B1, and the period from frame M + 1 to frame N is determined as the period of scene B2. Then, by the same processing as that in the first embodiment, the brightness value B1L _MAX is acquired as the scene maximum brightness value of the scene B1, the brightness value B2L _MAX is acquired as the scene maximum brightness value of the scene B2, and the scene maximum brightness value B1L _MAX , Dynamic metadata including B2L _MAX is generated.

As described above, according to this example, the imaging apparatus determines a plurality of scenes based on the temporal change in an imaging parameter (the aperture value) and generates dynamic metadata in the same way as in Example 1. That is, dynamic metadata and the like can be acquired without going through post-production. As a result, the display method of the moving image can be changed sequentially based on the dynamic metadata.

なお、マニュアル露出撮影モードが設定される例を説明したが、自動で絞り値を変更する自動露出撮影モードが設定されてもよい。自動露出設定モードでは、撮像制御部１０３は、撮像素子１０２から取得したフレーム画像データを参照して、絞り値を自動で（ユーザ操作によらずに）変更する。自動露出設定モードが設定されている場合であっても、マニュアル露出設定モードが設定されている場合と同様に、絞り値の時間変化に基づいて複数のシーンを決定することができる。 Although the example in which the manual exposure shooting mode is set has been described, the automatic exposure shooting mode in which the aperture value is automatically changed may be set. In the automatic exposure setting mode, the imaging control unit 103 refers to the frame image data acquired from the image sensor 102 and automatically changes the aperture value (not by a user operation). Even when the automatic exposure setting mode is set, as in the case where the manual exposure setting mode is set, it is possible to determine a plurality of scenes based on the temporal change of the aperture value.

An example of switching scenes at the time position where the aperture value changes (even slightly) has been described, but scenes may instead be switched at the time position where the aperture value changes by an amount larger than a threshold. For example, control may be performed so that the scene is not switched when the aperture value changes by less than one stop (such as 1/3 stop) and is switched when the aperture value changes by one stop or more.
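A minimal sketch of the stop-based threshold follows. It assumes the usual photographic relation that a change from f-number N1 to N2 corresponds to 2·log2(N2/N1) stops, and it compares each frame only with the immediately preceding frame; the function names and the `min_stops` parameter are illustrative assumptions.

```python
import math


def stops_between(f_old, f_new):
    """Magnitude of an aperture change in stops (one stop doubles or halves the light)."""
    return abs(2.0 * math.log2(f_new / f_old))


def scene_starts_by_aperture(f_numbers, min_stops=0.0):
    """Return frame indexes where a new scene starts.

    With min_stops=0.0 any change of the aperture value switches the scene;
    with min_stops=1.0 only changes of one stop or more do.
    """
    starts = [0] if f_numbers else []
    for index in range(1, len(f_numbers)):
        change = stops_between(f_numbers[index - 1], f_numbers[index])
        if change > 0.0 and change >= min_stops:
            starts.append(index)
    return starts


apertures = [2.2, 2.2, 2.0, 2.0, 4.0]
any_change = scene_starts_by_aperture(apertures)
one_stop_or_more = scene_starts_by_aperture(apertures, min_stops=1.0)
```

The change from F2.2 to F2.0 used in FIG. 5 is roughly 0.28 stop, so it switches the scene only under the "any change" rule. Note that nominal f-numbers are rounded (F2.8 is nominally one stop from F2.0 but evaluates to about 0.97 stop), so a practical threshold would likely need some tolerance.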

シーンを決定するための撮像パラメータは絞り値に限られない。例えば、ＩＳＯ感度、シャッタースピード、フォーカス位置、焦点距離、ホワイトバランス、露出値などの時間変化に基づいて複数のシーンが決定されてもよい。露出値は、ＩＳＯ感度、シャッタースピード、及び、絞り値から算出できる。シーンを決定するための撮像パラメータとして、１種類の撮像パラメータが使用されてもよいし、複数種類の撮像パラメータが使用されてもよい。 The imaging parameter for determining the scene is not limited to the aperture value. For example, a plurality of scenes may be determined based on time changes such as ISO sensitivity, shutter speed, focus position, focal length, white balance, and exposure value. The exposure value can be calculated from the ISO sensitivity, the shutter speed, and the aperture value. As the imaging parameter for determining the scene, one type of imaging parameter may be used, or a plurality of types of imaging parameters may be used.
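The text does not give a formula for the exposure value. One common convention, used here purely as an assumption, refers the value to ISO 100: EV = log2(N²/t) − log2(S/100), where N is the f-number, t the shutter time in seconds, and S the ISO sensitivity. The function name is hypothetical.

```python
import math


def exposure_value(f_number, shutter_seconds, iso=100):
    """Exposure value referred to ISO 100 (one common convention, assumed here)."""
    return math.log2(f_number ** 2 / shutter_seconds) - math.log2(iso / 100)


ev_base = exposure_value(2.0, 1 / 4)            # F2.0, 1/4 s, ISO 100
ev_high_iso = exposure_value(2.0, 1 / 4, 200)   # same settings at ISO 200
```

Because the exposure value folds three imaging parameters into one number, a scene determination based on it reacts to the combined change rather than to each parameter separately; for example, opening the aperture by one stop while halving the shutter time leaves it unchanged.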

When there are a plurality of parameters for determining scenes, the scene determination unit 105 may determine the plurality of scenes using, from among those parameters, the parameter that corresponds to the shooting mode currently set. For example, in a shooting mode in which the ISO sensitivity is set automatically, the plurality of scenes may be determined based on the temporal change in the ISO sensitivity and not on the temporal changes in the other parameters. In a shooting mode in which the white balance is set automatically, the plurality of scenes may be determined based on the temporal change in the white balance and not on the temporal changes in the other parameters. Similarly, the aperture value may be used in the aperture priority mode, and the shutter speed may be used in the shutter speed priority mode. In the manual mode, in which all imaging parameters are set manually, the plurality of scenes may be determined in consideration of all the imaging parameters. At least one of the plurality of parameters may be designated by the user as the parameter for determining scenes.
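The mode-dependent choice of parameters can be pictured as a lookup table. The sketch below is illustrative only: the mode names, the parameter keys, and the rule that any difference in a watched parameter counts as a change are assumptions.

```python
# Hypothetical mapping from shooting mode to the parameters watched for scene changes.
MODE_PARAMETERS = {
    "auto_iso": ("iso",),
    "auto_white_balance": ("white_balance",),
    "aperture_priority": ("aperture",),
    "shutter_priority": ("shutter_speed",),
    "manual": ("iso", "white_balance", "aperture", "shutter_speed"),
}


def scene_switch_needed(mode, previous, current):
    """Return True if any parameter watched in this mode differs between two frames."""
    watched = MODE_PARAMETERS[mode]
    return any(previous[name] != current[name] for name in watched)


before = {"iso": 100, "white_balance": 5600, "aperture": 2.2, "shutter_speed": 1 / 60}
after = dict(before, aperture=2.0)

in_aperture_priority = scene_switch_needed("aperture_priority", before, after)
in_shutter_priority = scene_switch_needed("shutter_priority", before, after)
```

A user-designated parameter, as mentioned at the end of the paragraph, would amount to adding or overriding an entry in the table.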

Control may be performed so that the scene determination processing is not performed (a plurality of scenes is not determined) when a specific shooting mode is set. For example, control may be performed so that the scene is not switched when the focus changes in the AF shooting mode, because the same scene is likely still being shot, whereas the scene is switched when the focus changes in the MF shooting mode, because the user is deliberately composing the image. Execution or non-execution of the scene determination processing may also be designated by the user regardless of the shooting mode or the like.

シーンを決定するためのパラメータ、シーン決定処理の実行／非実行の切り替え方法、シーン決定処理を実行する（または、しない）撮影モードなどは特に限定されない。 The parameters for determining the scene, the method for switching execution / non-execution of the scene determination processing, the shooting mode for performing (or not performing) the scene determination processing, and the like are not particularly limited.

<Example 3>
Example 3 of the present invention is described below. The points that differ from Example 1 (configuration, processing, etc.) are described in detail, and description of the points in common with Example 1 is omitted. This example describes a case in which scenes are determined and dynamic metadata is generated in consideration of the in-focus area (the area within the depth of field) of the frame image. The shooting mode is not particularly limited, but this example describes the MF shooting mode.

FIG. 7 is a block diagram showing a configuration example of the imaging apparatus 700 according to this example. In FIG. 7, blocks identical to those in FIG. 1 (Example 1) are given the same reference numerals as in FIG. 1. The imaging apparatus 700 has a feature amount acquisition unit 704 in place of the feature amount acquisition unit 104 of Example 1, and a scene determination unit 705 in place of the scene determination unit 105 of Example 1. The imaging apparatus 700 further has a depth of field calculation unit 715.

The feature amount acquisition unit 704 has the same functions as the feature amount acquisition unit 104 of Example 1. However, the feature amount acquisition unit 704 acquires, as the frame maximum luminance value, the maximum luminance value of the image data corresponding to the in-focus area (the area within the depth of field) of the frame image rather than the maximum luminance value of the entire frame image data. The in-focus area (the area within the depth of field) is notified by the depth of field calculation unit 715. As in Example 1, the frame maximum luminance value (the maximum luminance value of the image data corresponding to the in-focus area) is used both as the feature amount for determining the plurality of scenes and as the feature amount for acquiring the scene feature amount. Alternatively, the maximum luminance value of the image data corresponding to the in-focus area may be acquired as the feature amount for determining the plurality of scenes, and the maximum luminance value of the entire frame image data may be acquired as the feature amount for acquiring the scene feature amount. The reverse is also possible.

The scene determination unit 705 has the same functions as the scene determination unit 105 of Example 1. However, when the frame maximum luminance value changes by an amount larger than the threshold, the scene determination unit 705 decides whether to switch scenes in consideration of the temporal change in the in-focus area (the area within the depth of field). The in-focus area (the area within the depth of field) is notified by the depth of field calculation unit 715.

被写界深度算出部７１５は、絞り値、フォーカス値（フォーカス位置）、及び、ズーム値（焦点距離）を、撮像制御部１０３から取得する。換言すれば、撮像制御部１０３は、絞り値、フォーカス値、及び、ズーム値を、被写界深度算出部７１５へ出力する。被写界深度算出部７１５は、絞り値、フォーカス値、及び、ズーム値から被写界深度を算出する。そして、被写界深度算出部７１５は、合焦領域として、フレーム画像の被写界深度内領域を、特徴量取得部７０４とシーン決定部７０５へ通知する。 The depth of field calculation unit 715 acquires the aperture value, the focus value (focus position), and the zoom value (focal length) from the imaging control unit 103. In other words, the imaging control unit 103 outputs the aperture value, the focus value, and the zoom value to the depth of field calculation unit 715. The depth of field calculation unit 715 calculates the depth of field from the aperture value, the focus value, and the zoom value. Then, the depth-of-field calculation unit 715 notifies the feature amount acquisition unit 704 and the scene determination unit 705 of the in-depth-of-field region of the frame image as the focus region.

An example of the processing flow for the scene determination processing is described. First, the operation unit 114 receives a mode change operation (user operation) for changing to the MF shooting mode. In response to the mode change operation, the CPU 111 controls each block of the imaging apparatus 700 and sets the MF shooting mode. Next, the imaging control unit 103 controls the state of the imaging optical system 101, the processing of the image sensor 102, and the like according to the aperture value, the shutter speed, the focus value, the zoom value, and the like. The imaging control unit 103 also outputs the aperture value, the focus value, and the zoom value to the depth of field calculation unit 715.

Then, the depth-of-field calculation unit 715 calculates the depth of field from the aperture value, the focus value, and the zoom value acquired from the imaging control unit 103. For example, the depth of field is calculated using the following Expression 1.
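
Expression 1 is not reproduced in this text. As a rough illustration only, the following sketch computes near and far limits of the depth of field with the common textbook approximation based on the hyperfocal distance, which may differ from Expression 1. The function name, the millimetre units, and the circle-of-confusion value `coc_mm` are assumptions made for this example; the aperture value is taken as the f-number, the focus value as the subject distance, and the zoom value as the focal length.

```python
import math


def depth_of_field(focal_length_mm, f_number, focus_distance_mm, coc_mm=0.03):
    """Return (near_limit_mm, far_limit_mm) of the depth of field.

    Textbook approximation, valid when the focus distance is much larger
    than the focal length. Not the patent's Expression 1.
    coc_mm (circle of confusion) is an assumed, sensor-dependent constant.
    """
    # Approximate hyperfocal distance H = f^2 / (N * c).
    hyperfocal = focal_length_mm ** 2 / (f_number * coc_mm)
    # Near limit: H * s / (H + s).
    near = hyperfocal * focus_distance_mm / (hyperfocal + focus_distance_mm)
    # Far limit: H * s / (H - s); infinite once s reaches H.
    if focus_distance_mm >= hyperfocal:
        far = math.inf
    else:
        far = hyperfocal * focus_distance_mm / (hyperfocal - focus_distance_mm)
    return near, far


if __name__ == "__main__":
    # 50 mm lens focused at 3 m: stopping down from f/2.8 to f/8 deepens the field.
    print(depth_of_field(50, 2.8, 3000))
    print(depth_of_field(50, 8.0, 3000))
```

A larger f-number or a shorter focal length lowers the hyperfocal distance and so widens the interval between the two limits, which corresponds to the "deep depth of field" case discussed for FIGS. 8A and 8D.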

Next, based on the frame image data acquired from the image sensor 102 and the calculated depth of field, the depth-of-field calculation unit 715 detects the area of the frame image (the image represented by the frame image data) that lies within the depth of field. The depth-of-field calculation unit 715 then notifies the feature amount acquisition unit 704 and the scene determination unit 705 of this area (the focus area). The method of detecting the focus area is not particularly limited. For example, edge areas having a predetermined spatial frequency band may be detected (edge detection), and an image area in which the density of detected edge areas is larger than a predetermined threshold may be determined to be the focus area.
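
The edge-density example above can be sketched roughly as follows. This is a simplified illustration, not the detection method of the embodiment: a plain neighbouring-pixel difference stands in for a band-limited edge detector, the calculated depth of field is not used, and the function name, the block size, and both thresholds are assumptions chosen for the example.

```python
def focus_mask(gray, edge_threshold=30, block=4, density_threshold=0.25):
    """Return a grid of booleans, one per block: True where the block is
    treated as in focus.

    gray is a list of rows of luminance values. All thresholds are
    illustrative assumptions, not values from the source.
    """
    height, width = len(gray), len(gray[0])
    # Mark a pixel as an edge when the difference to its right or lower
    # neighbour is large. The last row and column stay False.
    edges = [[False] * width for _ in range(height)]
    for y in range(height - 1):
        for x in range(width - 1):
            dx = abs(gray[y][x + 1] - gray[y][x])
            dy = abs(gray[y + 1][x] - gray[y][x])
            edges[y][x] = max(dx, dy) > edge_threshold
    # A block counts as in focus when its edge density exceeds the threshold.
    mask = []
    for by in range(0, height, block):
        row = []
        for bx in range(0, width, block):
            cells = [edges[y][x]
                     for y in range(by, min(by + block, height))
                     for x in range(bx, min(bx + block, width))]
            row.append(sum(cells) / len(cells) > density_threshold)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Left half: sharp checkerboard texture. Right half: flat (blurred) grey.
    image = [[(255 if (x + y) % 2 else 0) if x < 4 else 128 for x in range(8)]
             for y in range(8)]
    print(focus_mask(image))
```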

Next, the operation unit 114 receives a shooting start operation. In response to the shooting start operation, the CPU 111 controls each block of the imaging apparatus 100 and starts shooting. The feature amount acquisition unit 704 then acquires, as the frame maximum luminance value, the maximum luminance value of the image data corresponding to the area within the depth of field notified by the depth-of-field calculation unit 715, and outputs the frame maximum luminance value to the scene determination unit 705. Next, the scene determination unit 705 determines a plurality of scenes using the frame maximum luminance value acquired from the feature amount acquisition unit 704 and the area within the depth of field notified by the depth-of-field calculation unit 715.

FIGS. 8A to 8D show examples of frame images. In each of FIGS. 8A to 8D, a focus value that brings the subject 800 into focus is set. In FIGS. 8A and 8D, a deep depth of field is set, and the area 801 (the entire frame image) is the focus area. In FIGS. 8B and 8C, a shallow depth of field is set; the area 802 (a part of the frame image), which is narrower than the area 801, is the focus area, and the area 803, obtained by excluding the area 802 from the area 801, is the out-of-focus area.

In this embodiment, a plurality of scenes are determined based on the temporal change in the frame maximum luminance value of the focus area. Therefore, even if a high-luminance area 804 appears within the out-of-focus area 803 as in FIG. 8C and the maximum luminance of the frame as a whole increases by more than the threshold, the scene is not determined (finalized); that is, the scene is not switched. As a result, in display based on the dynamic metadata, a luminance change outside the focus area is prevented from changing the appearance of the subject in the focus area (the subject most likely to be looked at).
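
As a minimal sketch of this behaviour, the following function takes the frame maximum luminance only over pixels flagged as in focus, so a bright highlight in the out-of-focus area does not raise the value. The function name, the list-of-rows image layout, and the per-pixel boolean mask are assumptions made for illustration.

```python
def frame_max_luminance(luma, in_focus):
    """Return the maximum luminance over in-focus pixels, or None if
    no pixel is in focus.

    luma     : list of rows of luminance values
    in_focus : list of rows of booleans with the same shape (assumed layout)
    """
    values = [value
              for luma_row, mask_row in zip(luma, in_focus)
              for value, flag in zip(luma_row, mask_row)
              if flag]
    return max(values) if values else None


if __name__ == "__main__":
    # A 1000-level highlight sits in the out-of-focus right-hand column.
    luma = [[200, 180, 1000],
            [190, 210, 150]]
    in_focus = [[True, True, False],
                [True, True, False]]
    print(frame_max_luminance(luma, in_focus))  # highlight is ignored
```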

In this embodiment, when the state changes from a shallow depth of field (narrow focus area; FIG. 8C) to a deep depth of field (wide focus area; FIG. 8D), the scene is determined (finalized), that is, switched, in response to a large change in the frame maximum luminance value. This allows the subject 800 to be displayed in a way that reflects the influence of the high-luminance area 804. Conversely, when the state changes from a deep depth of field (wide focus area; FIG. 8D) to a shallow depth of field (narrow focus area; FIG. 8C), the scene is not determined (the scene is not switched), regardless of the temporal change in the frame maximum luminance value. This allows display that does not change the appearance of the subject 800.
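
The switching rule described in this paragraph can be sketched as below. This is an illustrative reading of the rule under stated assumptions, not the implementation of the scene determination unit 705: the function name is hypothetical, the focus area is represented by a single size per frame (for example, a pixel count), and "narrowed" is taken to mean that this size decreased from the previous frame.

```python
def decide_scene_starts(max_luma, focus_area, threshold):
    """Return the frame indices at which a new scene starts.

    max_luma[i]   : frame maximum luminance of frame i (in-focus area only)
    focus_area[i] : size of the in-focus area of frame i (assumed measure)
    threshold     : luminance change above which a switch is considered
    """
    starts = [0] if max_luma else []
    for i in range(1, len(max_luma)):
        big_change = abs(max_luma[i] - max_luma[i - 1]) > threshold
        narrowed = focus_area[i] < focus_area[i - 1]
        # A narrowing focus area suppresses the switch regardless of the
        # luminance change; otherwise a large change starts a new scene.
        if big_change and not narrowed:
            starts.append(i)
    return starts


if __name__ == "__main__":
    # Frames loosely modelled on FIGS. 8A to 8D: the highlight only enters
    # the in-focus maximum once the focus area widens in the last frame.
    print(decide_scene_starts([200, 200, 200, 1000], [100, 40, 40, 100], 100))
    # Deep to shallow: large luminance drop, but the scene is kept.
    print(decide_scene_starts([1000, 200], [100, 40], 100))
```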

As described above, according to this embodiment, taking the focus area into account makes it possible to acquire more suitable dynamic metadata and to determine the plurality of scenes more suitably.

Each block of the first to third embodiments (FIGS. 1 and 7) may or may not be individual hardware. The functions of two or more blocks may be implemented by common hardware. Each of a plurality of functions of one block may be implemented by individual hardware. Two or more functions of one block may be implemented by common hardware. Each block also may or may not be implemented by hardware. For example, the apparatus may include a processor and a memory in which a control program is stored. The functions of at least some of the blocks of the apparatus may then be implemented by the processor reading the control program from the memory and executing it.

The first to third embodiments (including the modifications described above) are merely examples. Configurations obtained by appropriately modifying or changing the configurations of the first to third embodiments within the scope of the gist of the present invention are also included in the present invention, as are configurations obtained by appropriately combining the configurations of the first to third embodiments.

<Other Embodiments>
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

100, 700: imaging apparatus; 102: image sensor; 104, 704: feature amount acquisition unit
105, 705: scene determination unit; 106: metadata generation unit

Claims

An imaging apparatus comprising:
image capturing means for capturing a moving image;
determination means for determining a plurality of scenes of the moving image based on a temporal change of a parameter corresponding to a frame of the moving image;
acquisition means for acquiring a feature amount of each of the plurality of scenes; and
generation means for generating information in which the feature amount acquired by the acquisition means is associated with each of the plurality of scenes.

The imaging apparatus according to claim 1, further comprising an output unit that outputs the moving image data and the information.

The image pickup apparatus according to claim 2, wherein the output unit outputs the moving image data and the information in association with each other.

The image pickup apparatus according to claim 1, wherein the feature amount includes a maximum brightness value of moving image data of a scene corresponding to the feature amount.

The image pickup apparatus according to claim 1, wherein the parameter includes a maximum brightness value of image data of the frame.

The determination unit determines the plurality of scenes such that the scene does not switch at a time position where the parameter does not change, and the scene switches at a time position where the parameter changes. The imaging device according to any one of items.

The imaging apparatus according to claim 6, wherein the determination means determines the plurality of scenes such that the scene does not switch at a time position where the parameter changes by an amount smaller than a threshold, and the scene switches at a time position where the parameter changes by an amount larger than the threshold.

The image pickup apparatus according to claim 1, wherein the parameter includes an image pickup parameter when the image of the frame is picked up.

The image pickup apparatus according to claim 8, wherein the image pickup parameter includes at least one of ISO sensitivity, shutter speed, aperture value, focus position, focal length, white balance, and exposure value.

The imaging apparatus according to claim 1, wherein
the acquisition means acquires, for each of a plurality of frames of the moving image, the maximum brightness value of the image data corresponding to the focus area of the image of the frame, and
the feature amount of a scene of the moving image includes the maximum of the two or more maximum brightness values acquired for that scene.

The imaging apparatus according to any one of claims 1 to 10, wherein the determination means determines the plurality of scenes such that the scene does not switch, regardless of the temporal change of the parameter, at a time position where the focus area of the image of the frame narrows.

The imaging apparatus according to any one of claims 1 to 11, wherein
there are a plurality of parameters corresponding to the frame, and the determination means determines the plurality of scenes using, among the plurality of parameters, a parameter that depends on the shooting mode that is set.

The imaging apparatus according to claim 1, wherein the determination means does not determine the plurality of scenes when a specific shooting mode is set.

The imaging apparatus according to any one of claims 1 to 13, wherein the parameter includes an average of the maximum brightness value of the image data of a corresponding frame, which is the frame corresponding to the parameter, and the maximum brightness values of one or more frames temporally consecutive with the corresponding frame.

The image pickup apparatus according to claim 1, wherein the feature amount is dynamic metadata defined by SMPTE ST 2094.

A method for controlling an imaging apparatus, comprising:
an image capturing step of capturing a moving image;
a determination step of determining a plurality of scenes of the moving image based on a temporal change of a parameter corresponding to a frame of the moving image;
an acquisition step of acquiring a feature amount of each of the plurality of scenes; and
a generation step of generating information in which the feature amount acquired in the acquisition step is associated with each of the plurality of scenes.

A program for causing a computer to function as each unit of the imaging device according to claim 1.