JP2021090129A

JP2021090129A - Image processing device, imaging apparatus, image processing method and program

Info

Publication number: JP2021090129A
Application number: JP2019218950A
Authority: JP
Inventors: 遼太鈴木; Ryota Suzuki; 成記望月; Shigeki Mochizuki
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-12-03
Filing date: 2019-12-03
Publication date: 2021-06-10
Anticipated expiration: 2039-12-03
Also published as: JP7446797B2

Abstract

To restore an image with good image quality from an encoded image regardless of the imaging conditions. SOLUTION: An image processing device includes determining means that determines an inference parameter and acquiring means that acquires an imaging condition of the image. The inference parameter is used to obtain an image in which degradation due to encoding is restored, by performing inference based on the inference parameter on a decoded image obtained by decoding an encoded image. The determining means determines the inference parameter according to the imaging condition acquired by the acquiring means. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing device, an imaging device, an image processing method, and a program.

Recent imaging devices can record the imaging data as acquired by the imaging sensor (RAW images that have not been developed). Because a RAW image has not undergone development processing, it is recorded without loss of color information and with its rich gradation preserved, which allows the image to be edited with a high degree of freedom. However, a RAW image contains an enormous amount of data and therefore consumes the free space of the recording medium. It is therefore desirable to compression-encode RAW images so that they can be recorded with a smaller amount of data.

Meanwhile, deep learning technology using neural networks is applied in a wide range of fields. In the field of image processing, deep learning is used to improve image quality. Encoding a RAW image degrades it. By applying deep learning to the degraded RAW image, the degraded image can be restored. This makes it possible to ensure the image quality of the RAW image while avoiding pressure on the free space of the recording medium.

As a related technique, the technique of Patent Document 1 has been proposed. The technique of Patent Document 1 aims to improve noise removal performance by adjusting at least one internal parameter of an intermediate layer of a neural network at processing time, after learning.

Patent Document 1: Japanese Patent Publication No. 2018-206382

The technique of Patent Document 1 decides how to adjust the internal parameters based only on the amount of noise and does not consider imaging conditions other than noise. In general, in an imaging device, noise varies with the imaging sensitivity and brightness varies with the exposure. The depth of field also varies with the aperture and the focal length. Consider a case in which a noise-removed image is obtained by a neural network that applies inference parameters learned from training images made by adding noise to images captured at proper exposure. Because these inference parameters are optimized for the proper-exposure condition, it is difficult to achieve high noise removal performance when the image to be inferred is underexposed or overexposed. It is therefore desirable to use learned parameters that correspond to various imaging conditions, rather than learned parameters adjusted only by the amount of noise. The same applies to the restoration of image quality degradation using a neural network.

An object of the present invention is to restore an encoded image to an image with good image quality regardless of the imaging conditions.

To achieve the above object, an image processing device of the present invention comprises: determination means for determining an inference parameter in order to acquire an image in which degradation due to encoding is restored by executing inference based on the inference parameter on a decoded image obtained by decoding an encoded image; and acquisition means for acquiring an imaging condition of the image, wherein the determination means determines the inference parameter according to the imaging condition acquired by the acquisition means.

According to the present invention, when an encoded image is restored, it can be restored to an image with good image quality regardless of the imaging conditions.

FIG. 1 is a block diagram showing a configuration example of an image processing device.
FIG. 2 is a diagram illustrating a learning method of a neural network.
FIG. 3 is a flowchart showing the flow of a method of determining an inference parameter in the first embodiment.
FIG. 4 is a flowchart showing the flow of a method of determining an inference parameter in the second embodiment.
FIG. 5 is a flowchart showing the flow of a method of determining an inference parameter in the third embodiment.
FIG. 6 is a flowchart showing the flow of a method of determining an inference parameter in the fourth embodiment.
FIG. 7 is a diagram illustrating a method of embedding PSNR in the metadata of encoded data.
FIG. 8 is a flowchart showing the flow of a method of determining an inference parameter in the fifth embodiment.
FIG. 9 is a diagram showing an example of a file format of encoded data.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. The configurations described in the following embodiments are merely examples, and the scope of the present invention is not limited by them.

<First Embodiment>
FIG. 1 is a block diagram showing a configuration example of the image processing device 100. The image processing device 100 includes a RAW decoding unit 101, a metadata acquisition unit 102, a deterioration restoration processing unit 103, and an inference parameter determination unit 104. The image processing device 100 is described here as being built into an imaging device (hereinafter, a camera) having an imaging unit (imaging means), but it may be applied to any other device (for example, an information processing device or a decoding device). The image processing device 100 may have a processor and a memory. In this case, the processing of each embodiment may be realized by the processor executing a program stored in the memory. The image processing device 100 may also be realized by a predetermined programmable circuit. The inference processing by the deterioration restoration processing unit 103, described later, may be executed by, for example, a graphics processing unit.

In FIG. 1, the encoded data, the decoded RAW image, and the deterioration-restored RAW image are stored in storage units 105A, 105B, and 105C, respectively. The storage units 105A, 105B, and 105C may be a single integrated storage unit or separate storage units. Each storage unit may be realized by, for example, the memory described above, a buffer, or a predetermined storage device.

The RAW decoding unit 101 acquires the encoded data and decodes it to generate a decoded RAW image. The encoded data is, for example, data obtained by encoding a RAW image (captured image) acquired by an imaging device. The RAW decoding unit 101 outputs the decoded RAW image to the deterioration restoration processing unit 103. The decoded RAW image corresponds to the decoded image. The decoded RAW image generated by the RAW decoding unit 101 is a RAW image degraded by encoding, and its image quality is reduced.

The metadata acquisition unit 102 is a first acquisition means that acquires metadata (meta information) from the encoded data and outputs the acquired metadata to the inference parameter determination unit 104. The metadata acquisition unit 102 may also acquire information about the imaging unit that captured the image. In this case, the metadata acquisition unit 102 also functions as a second acquisition means. FIG. 9 is a diagram showing an example of a file format of the encoded data. As shown in FIG. 9, the encoded data consists of a header part 901, a metadata part 902, and a payload part 903. The header part 901 contains, among other things, an identification code indicating that the encoded data is RAW-format data. The metadata part 902 contains parameters representing imaging conditions (shooting information) such as the exposure compensation value, ISO sensitivity, aperture value, and focal length. The metadata part 902 may further contain imaging conditions such as the number of AF in-focus points at the time of imaging and information specific to the camera used for imaging, such as sensor information. The payload part 903 contains the compressed data of the encoded image.
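The three-part layout of FIG. 9 can be modeled as a small data structure. The following Python sketch is illustrative only: the class and field names (`EncodedRaw`, `ImagingConditions`, `exposure_compensation`, and so on) are assumptions introduced here, and the byte-level file format is not specified in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImagingConditions:
    """Hypothetical contents of metadata part 902."""
    exposure_compensation: float             # in EV; 0 means proper exposure
    iso: int                                 # ISO sensitivity
    aperture: float                          # f-number
    focal_length_mm: float
    sensor_size_mm: Optional[tuple] = None   # (width, height); camera-specific info
    sensor_pixels: Optional[int] = None      # total pixel count; camera-specific info


@dataclass
class EncodedRaw:
    """Hypothetical in-memory view of the encoded data of FIG. 9."""
    header: bytes                 # part 901: identification code etc.
    metadata: ImagingConditions   # part 902: imaging conditions
    payload: bytes                # part 903: compressed image data


def get_metadata(encoded: EncodedRaw) -> ImagingConditions:
    """Role of metadata acquisition unit 102: extract the metadata for unit 104."""
    return encoded.metadata
```

Keeping the imaging conditions in one typed record means that each embodiment can read only the field it needs (exposure compensation in the first embodiment, ISO value and sensor information in the second).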

The deterioration restoration processing unit 103 shown in FIG. 1 is composed of a neural network. The deterioration restoration processing unit 103 functions as inference means that performs inference processing for image restoration on the decoded RAW image, using the inference parameters output by the inference parameter determination unit 104. By performing inference processing with the neural network on the decoded RAW image, the deterioration restoration processing unit 103 generates a deterioration-restored RAW image in which the degradation caused by encoding the RAW image is restored.

The neural network of each embodiment is assumed to be a CNN (Convolutional Neural Network). A CNN is a neural network that has convolutional layers and pooling layers, with a fully connected layer connected on the output side. In the fully connected layer, the inference parameters of each embodiment correspond to the weights and biases of the edges connecting the nodes of adjacent layers; in the convolutional layers, they correspond to the weights and biases of the kernels (filters). Hereinafter, the parameters updated by learning (machine learning) of the neural network are collectively referred to as inference parameters. The inference parameters are learned inference parameters.

The inference parameter determination unit 104 is determination means that determines the inference parameters to be used in the deterioration restoration processing unit 103 based on the metadata (meta information) acquired by the metadata acquisition unit 102. The inference parameter determination unit 104 has a storage area 104A, in which inference parameters are stored for each imaging condition.

Next, the learning method of the neural network will be described. With a camera, even for the same subject, the resulting image varies greatly in brightness, amount of noise, blur, and so on, depending on the imaging conditions. For this reason, even if images are input to a neural network to which the same inference parameters are applied, it is difficult to always obtain a consistent image restoration effect. For example, suppose that inference parameters are generated by learning with training images captured under the single condition of proper exposure. In this case, if the image to be inferred was acquired under the same imaging condition as at learning time, a high image restoration effect is obtained at proper exposure. On the other hand, in an overexposed image (an image captured with more than the proper exposure) the subject is brighter than in a properly exposed image, and in an underexposed image (an image captured with less than the proper exposure) the subject is darker. Because the properties of the image change greatly in terms of brightness, the restoration effect decreases when the same inference parameters are applied.

One conceivable approach is to mix training images captured under the above three imaging conditions (proper exposure, overexposure, and underexposure) and learn a single set of inference parameters. In this case, the restoration effect for proper exposure is lower than when the inference parameters are generated from training images captured under the single condition of proper exposure. Therefore, to maximize the image restoration effect, it is necessary to generate multiple sets of inference parameters, one per imaging condition, and to apply the set that matches the imaging condition under which the RAW image to be restored was captured.

FIG. 2 is a diagram illustrating the learning method of the neural network. In this embodiment, the exposure compensation value is used as the imaging condition. The inference parameters for the exposure compensation value are learned separately for three cases: proper exposure, underexposure, and overexposure. The RAW encoding unit 201 acquires an uncompressed RAW image and compression-encodes it to generate encoded data. The RAW encoding unit 201 outputs the generated encoded data to the RAW decoding unit 101. As shown in FIG. 2, the storage unit 105D stores RAW images that have not been compression-encoded (uncompressed RAW images). The storage unit 105D is described as being integrated with the storage units 105A, 105B, and 105C, but it may be provided separately.

The image quality comparison unit 202 compares the decoded RAW image, whose image quality has been degraded, with the uncompressed RAW image, whose image quality has not been degraded, and calculates the MSE (Mean Squared Error). The image quality comparison unit 202 outputs the calculated MSE to the inference parameter update unit 203. The MSE is an index value representing the performance of the neural network; the smaller the MSE, the more faithfully the decoded RAW image reproduces the uncompressed RAW image. The inference parameter update unit 203 updates the inference parameters according to the MSE output by the image quality comparison unit 202. The inference parameter update unit 203 has a storage area 203A that stores inference parameters for each imaging condition. According to the imaging condition acquired by the metadata acquisition unit 102, the inference parameter update unit 203 individually updates the corresponding inference parameters stored in the storage area 203A. The inference parameter update unit 203 outputs the updated inference parameters to the deterioration restoration processing unit 103 according to the imaging condition. The deterioration restoration processing unit 103 performs restoration processing on the decoded RAW image degraded by encoding, using the inference parameters updated at each learning iteration. The inference parameter update unit 203 updates the inference parameters by stochastic gradient descent or a similar method so that the MSE decreases with each learning iteration.
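The MSE evaluation and the per-condition parameter update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: images are flattened to lists of pixel values, the dictionary `storage_203a` is a hypothetical stand-in for storage area 203A, and the gradients are assumed to be supplied by some other mechanism (for example backpropagation), which is not shown.

```python
def mse(decoded, reference):
    """Mean squared error between two equally sized sequences of pixel values."""
    if len(decoded) != len(reference):
        raise ValueError("images must have the same number of pixels")
    return sum((d - r) ** 2 for d, r in zip(decoded, reference)) / len(decoded)


def sgd_step(params, grads, learning_rate=0.01):
    """One gradient-descent update: move each parameter against its gradient."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Hypothetical stand-in for storage area 203A: one parameter set per imaging condition.
storage_203a = {
    "under": [0.5, -0.2],
    "proper": [0.5, -0.2],
    "over": [0.5, -0.2],
}


def update_for_condition(condition, grads, learning_rate=0.01):
    """Update only the parameter set matching the training image's imaging condition."""
    storage_203a[condition] = sgd_step(storage_203a[condition], grads, learning_rate)
    return storage_203a[condition]
```

Because each training image updates only the parameter set for its own condition, the three sets drift apart over training and each one specializes in its own exposure range.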

The inference parameter update unit 203 functions as learning means that repeats learning individually for each of the three imaging conditions: proper exposure, underexposure, and overexposure. As a result, the inference parameters for each imaging condition stored in the storage area 203A are optimized. In the example above, the MSE is used to compare the image quality of the decoded RAW image and the uncompressed RAW image, but any method other than the MSE may be used to compare the two images. For example, a method of evaluating a loss function may be applied. The index value used in the image quality comparison may be any index that indicates image quality and is not limited to the MSE. Likewise, any method that autonomously adjusts the error can be used to update the inference parameters. For example, error backpropagation may be applied.

Next, the method by which the inference parameter determination unit 104 determines the inference parameters will be described. FIG. 3 is a flowchart showing the flow of the method of determining the inference parameters in the first embodiment. The inference parameter determination unit 104 acquires the exposure compensation value that the metadata acquisition unit 102 obtained from the encoded data (S301). The inference parameter determination unit 104 determines whether the acquired exposure compensation value is less than 0 (S302). If the exposure compensation value is less than 0, the determination in S302 is Yes. In this case, the inference parameter determination unit 104 selects the learned inference parameters for underexposure (first inference parameters) as the inference parameters used in the restoration processing performed by the deterioration restoration processing unit 103 (S303).

If the exposure compensation value is 0 or more, the determination in S302 is No. In this case, the inference parameter determination unit 104 determines whether the exposure compensation value is 0 (S304). If the exposure compensation value is 0, the determination in S304 is Yes, and the inference parameter determination unit 104 selects the learned inference parameters for proper exposure (second inference parameters) as the inference parameters used in the restoration processing performed by the deterioration restoration processing unit 103 (S305). If the exposure compensation value is not 0, the determination in S304 is No. In this case, the exposure compensation value is greater than 0, so the inference parameter determination unit 104 selects the learned inference parameters for overexposure (third inference parameters) as the inference parameters used in the restoration processing (S306). The inference parameter determination unit 104 outputs the inference parameters determined in S303, S305, or S306 to the deterioration restoration processing unit 103 (S307).
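The branching of steps S302 to S306 reduces to a three-way comparison against zero. The sketch below assumes the learned parameter sets are held in a dictionary keyed by the hypothetical labels `"under"`, `"proper"`, and `"over"`; the function name and the keys are illustrative and do not come from the source.

```python
def select_by_exposure(exposure_compensation, params):
    """Pick the learned parameter set for the given exposure compensation value.

    `params` is assumed to map "under", "proper", and "over" to the first,
    second, and third inference parameters, respectively.
    """
    if exposure_compensation < 0:      # S302: Yes
        return params["under"]         # S303: underexposure parameters
    if exposure_compensation == 0:     # S304: Yes
        return params["proper"]        # S305: proper-exposure parameters
    return params["over"]              # S306: overexposure parameters


# Usage: the returned set is what S307 hands to the restoration network.
learned = {"under": "P1", "proper": "P2", "over": "P3"}
chosen = select_by_exposure(-0.7, learned)
```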

Thus, the inference parameter determination unit 104 learns and generates three sets of inference parameters: one for underexposure, one for proper exposure, and one for overexposure. Each learned set of inference parameters is stored in the storage area 203A. According to the exposure compensation value (metadata of the encoded data) acquired by the metadata acquisition unit 102, the inference parameter determination unit 104 determines which of the three sets is used for the restoration processing. The inference parameter determination unit 104 applies the inference parameters determined according to the exposure compensation value to the neural network of the deterioration restoration processing unit 103. As a result, the restoration processing can be performed with a neural network to which appropriate inference parameters have been applied according to the imaging condition indicated by the metadata of the encoded data. Therefore, even when the brightness at the time of imaging differs, the degradation due to encoding can be restored with high accuracy. In other words, when an encoded image is restored, it can be restored to an image with good image quality regardless of the imaging condition (brightness).

The first embodiment has described an example in which three sets of inference parameters (for underexposure, proper exposure, and overexposure) are generated by learning. That is, the three sets are learned, and the restoration processing is performed with a neural network to which one of them is applied. The number of sets of inference parameters is not limited to three; it may be two, or four or more. Although the deterioration restoration processing unit 103 has been described as being composed of a neural network, it may be any learning model that uses inference parameters learned in advance for each imaging condition. Furthermore, the deterioration restoration processing unit 103 may perform not only image quality restoration on RAW images degraded by compression encoding but also, for example, super-resolution processing on RAW images whose quality was degraded by reducing the image size. These points also apply to each of the following embodiments.

In the present embodiment, the inference parameters applied to the neural network of the deterioration restoration processing unit 103 are determined by the inference parameter update unit 203. This eliminates the need for a designer to adjust the inference parameters of the neural network manually. The designer also does not need to understand the computation performed at each node of the neural network. In particular, as the configuration of a neural network becomes more sophisticated and complex, the number of nodes to be understood inevitably increases. In the present embodiment, even if the configuration of the neural network becomes more sophisticated and complex, the inference parameters are applied to the neural network by the inference parameter update unit 203, so the manual workload on the designer does not increase.

<Second Embodiment>
Next, the second embodiment will be described. The image processing device 100 of the second embodiment is an example that restores encoding degradation while taking into account noise caused by the imaging conditions. In general, in a RAW image captured at high sensitivity, noise components (high-frequency components) are amplified along with the pixel values, so the spatial frequency increases. The spatial frequency distribution differs greatly between high-sensitivity and low-sensitivity images. Therefore, even if a high-sensitivity image and a low-sensitivity image are input to a neural network to which the same inference parameters are applied, it is difficult to obtain a high restoration effect. To improve the image restoration effect, it is desirable to learn individually using training images that match the imaging conditions affecting noise, and thereby optimize the inference parameters. The image processing device of the second embodiment restores encoding degradation by applying inference parameters optimized according to the ISO value, which represents the imaging sensitivity.

The configuration of the image processing device 100 and the learning method of the neural network in the second embodiment are the same as in the first embodiment, so their description is omitted. The image processing device 100 of the second embodiment uses the ISO value as the imaging condition. In the second embodiment, learning is performed separately for two conditions, high sensitivity (ISO value higher than a predetermined threshold) and low sensitivity (ISO value equal to or lower than the predetermined threshold), and inference parameters are generated for each. The predetermined threshold in each embodiment can be set arbitrarily.
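The two-way split by ISO value can be sketched as a single comparison. The function name and the dictionary keys `"high"` and `"low"` are assumptions for illustration; the threshold itself is left as an argument because the text states that it can be set arbitrarily.

```python
def select_by_iso(iso, threshold, params):
    """Pick the learned parameter set for the given ISO value.

    An ISO value strictly above the threshold counts as high sensitivity;
    an ISO value equal to or below the threshold counts as low sensitivity.
    """
    return params["high"] if iso > threshold else params["low"]
```

Note the boundary case: an ISO value exactly equal to the threshold falls into the low-sensitivity group, matching the "equal to or lower than" wording above.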

The relationship between the ISO value, which represents the imaging sensitivity, and noise is as follows. As the ISO value increases, noise components are amplified; however, less noise is generated as the sensor size increases and as the number of pixels decreases. This is because, as the spacing between the pixels of the image sensor (hereinafter, the pixel pitch) becomes wider, interference between their electrical signals decreases, which reduces the likelihood of noise. For example, an ISO value at which noise is inconspicuous on a full-frame sensor may produce conspicuous noise on a 1-inch sensor. This is because the pixel pitch of the 1-inch sensor is narrower, so its usable sensitivity range is lower than that of a full-frame sensor. Therefore, when the sensor size of the training images differs from the sensor size of the RAW image to be restored, the ISO threshold used to judge the amount of noise must be scaled according to the pixel pitch.

FIG. 4 is a flowchart showing the flow of the method of determining the inference parameter in the second embodiment. In the second embodiment, the learning images used for learning are assumed to have been captured with a full-size sensor, and the RAW image to be restored is assumed to have been captured with a 1-inch sensor. The inference parameter determination unit 104 acquires the ISO value and the sensor information that the metadata acquisition unit 102 obtained from the coded data (S401). The sensor information is assumed to include the sensor size and the number of sensor pixels. The inference parameter determination unit 104 calculates the pixel pitch of the RAW image to be restored (S402). The pixel pitch is expressed as the square root of the sensor size (area) divided by the number of sensor pixels. For example, when the number of sensor pixels is 12 million, the sensor size of the 1-inch sensor is "13.2 [mm] × 8.8 [mm]", so the pixel pitch is 3.11 [μm]. If the metadata already contains information on the pixel pitch, the processing of S402 can be omitted.
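The pixel pitch calculation of S402 can be sketched as follows. This is only an illustration under the definition given above; the function name `pixel_pitch_um` and its arguments are hypothetical and do not appear in the original description.

```python
import math


def pixel_pitch_um(sensor_width_mm: float, sensor_height_mm: float, pixel_count: int) -> float:
    """Approximate pixel pitch in micrometres as sqrt(sensor area / pixel count)."""
    if pixel_count <= 0:
        raise ValueError("pixel_count must be positive")
    area_mm2 = sensor_width_mm * sensor_height_mm
    pitch_mm = math.sqrt(area_mm2 / pixel_count)
    return pitch_mm * 1000.0  # convert mm to micrometres


# 1-inch sensor (13.2 mm x 8.8 mm) with 12 million pixels, as in the text
print(f"{pixel_pitch_um(13.2, 8.8, 12_000_000):.2f}")  # 3.11
```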

The inference parameter determination unit 104 updates the threshold T based on the pixel pitch calculated in S402 (S403). The threshold T is the ISO threshold used in S404, described later, to judge the amount of noise, and it corresponds to the normal sensitivity of the full-size sensor used at the time of learning. The inference parameter determination unit 104 updates the threshold T to suit the sensor of the RAW image to be restored. Letting T1 be the threshold before the update and T2 the threshold after the update, the relationship between T2 and T1 is expressed by the following equation.
"T2 = T1 × (pixel pitch of RAW image to be restored / pixel pitch of learning image)^2"

The learning images in the second embodiment are images captured with a full-size sensor. Assuming that the number of sensor pixels is 12 million, the sensor size is "36 [mm] × 24 [mm]", so the pixel pitch of the learning images is 8.49 [μm]. Since the pixel pitch of the RAW image to be restored is 3.11 [μm], the updated threshold T2 is T1 × 0.13. In this way, the normal sensitivity of the full-size sensor is scaled to the normal sensitivity of the 1-inch sensor, and the threshold T is updated.
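The threshold update of S403 can be checked numerically with a short sketch. The function name `scale_iso_threshold` is hypothetical, and the value 6400 used for T1 is an arbitrary example, not a value taken from the description.

```python
def scale_iso_threshold(t1: float, target_pitch_um: float, learning_pitch_um: float) -> float:
    """Scale the ISO threshold by the squared pixel pitch ratio (T2 = T1 * ratio^2)."""
    if learning_pitch_um <= 0:
        raise ValueError("learning_pitch_um must be positive")
    return t1 * (target_pitch_um / learning_pitch_um) ** 2


# Scaling factor for a 1-inch sensor (3.11 um) against a full-size sensor (8.49 um)
factor = scale_iso_threshold(1.0, 3.11, 8.49)
print(f"{factor:.2f}")  # 0.13

# Hypothetical example: a full-size threshold of ISO 6400 mapped to the 1-inch sensor
t2 = scale_iso_threshold(6400.0, 3.11, 8.49)
```

Because the ratio is squared, the threshold scales with the area of a single pixel rather than with its side length.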

The inference parameter determination unit 104 compares the ISO value acquired in S401 with the threshold T updated in S403 and determines whether the ISO value is smaller than the threshold T (predetermined value) (S404). If the ISO value is smaller than the threshold T, the determination in S404 is Yes. In this case, the inference parameter determination unit 104 determines the learned low-sensitivity inference parameter (fourth inference parameter) as the inference parameter to be used in the restoration process performed by the deterioration restoration processing unit 103 (S405). If the ISO value is equal to or greater than the threshold T (equal to or greater than the predetermined value), the determination in S404 is No. In this case, the inference parameter determination unit 104 determines the high-sensitivity inference parameter (fifth inference parameter) as the inference parameter to be used in the restoration process performed by the deterioration restoration processing unit 103 (S406). The inference parameter determination unit 104 then outputs the inference parameter determined in either S405 or S406 to the deterioration restoration processing unit 103 (S407).
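The branch in S404 to S406 amounts to a single comparison. A minimal sketch follows, assuming the threshold has already been scaled in S403; the function name and the string labels standing in for the two learned parameter sets are hypothetical.

```python
def select_iso_parameter(iso_value: float, threshold: float) -> str:
    """Return a label for the inference parameter chosen in S404 to S406."""
    if iso_value < threshold:
        return "low_sensitivity"   # S404 Yes -> S405 (fourth inference parameter)
    return "high_sensitivity"      # S404 No  -> S406 (fifth inference parameter)


print(select_iso_parameter(400, 800))  # low_sensitivity
print(select_iso_parameter(800, 800))  # high_sensitivity
```

An ISO value exactly equal to the threshold falls on the high-sensitivity side, matching the "equal to or greater than" wording of S404.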

As described above, in the second embodiment, two inference parameters, a low-sensitivity inference parameter and a high-sensitivity inference parameter, are generated by learning. Each learned inference parameter is stored in the storage area 203A. The inference parameter determination unit 104 selects the inference parameter to be used for the restoration process from the two inference parameters according to the ISO value acquired by the metadata acquisition unit 102, and applies the selected inference parameter to the neural network of the deterioration restoration processing unit 103. As a result, the restoration process can be performed using a neural network to which an appropriate inference parameter has been applied according to the ISO value indicated by the metadata of the coded data. Therefore, even when the amount of noise at the time of imaging differs, the deterioration due to coding can be restored with high accuracy. That is, when a coded image is restored, an image of good quality can be obtained regardless of the imaging condition (ISO sensitivity). The number of inference parameters applied to the deterioration restoration processing unit 103 may be three or more. The second embodiment may be applied alone or together with the first embodiment.

<Third Embodiment>
Next, the third embodiment will be described. The image processing device 100 of the third embodiment restores coding deterioration while taking into account the blur produced outside the depth of field. In general, for the same subject distance, a longer focal length and a smaller aperture value give a shallower depth of field, so blur is more likely to occur. An image with blur and an image without blur have different spatial frequency distributions even when the subject is the same. Therefore, if an image with blur and an image without blur are both input to a neural network to which the same inference parameter is applied, it is difficult to obtain a high restoration effect. To improve the restoration effect, it is desirable to perform learning separately using learning images that match the imaging conditions affecting blur, and to optimize the inference parameters. The image processing device 100 of the third embodiment applies to the neural network an inference parameter optimized based on the aperture value, the focal length, and the number of AF in-focus points, and restores the coding deterioration.

The configuration of the image processing device 100 of the third embodiment and the learning method of the neural network are the same as in the first and second embodiments, so their description is omitted. The image processing device 100 of the third embodiment calculates a blur evaluation value based on the aperture value and the focal length. Then, based on the calculated blur evaluation value and the number of AF in-focus points, the image processing device 100 performs learning separately for two conditions, large blur area and small blur area, and generates the inference parameters.

Here, the blur evaluation value will be described. The blur evaluation value indicates whether the imaging conditions are ones under which blur is likely to occur. A larger blur evaluation value indicates imaging conditions under which blur is more likely, and a smaller blur evaluation value indicates imaging conditions under which blur is less likely. Letting B be the blur evaluation value, D the focal length, and F the aperture value, the blur evaluation value can be calculated as "B = D / F". This is a simplified formula that exploits the property that blur is more likely to occur as the focal length becomes longer and the aperture value becomes smaller.
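As a small illustration of "B = D / F", the sketch below compares two lens settings. The function name and the sample focal lengths and aperture values are hypothetical and chosen only to show the direction of the trend.

```python
def blur_evaluation(focal_length_mm: float, aperture_value: float) -> float:
    """Simplified blur evaluation value B = D / F."""
    if aperture_value <= 0:
        raise ValueError("aperture_value must be positive")
    return focal_length_mm / aperture_value


telephoto_wide_open = blur_evaluation(200.0, 2.8)   # long focal length, small aperture value
wide_angle_stopped = blur_evaluation(24.0, 11.0)    # short focal length, large aperture value

# A larger B indicates conditions under which blur is more likely
print(telephoto_wide_open > wide_angle_stopped)  # True
```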

However, the blur evaluation value described above does not accurately give the area in which blur occurs. Even if the aperture value and the focal length are set so that blur is likely, when the entire frame is in focus, applying the inference parameter for a large blur area to the neural network may reduce the restoration effect. Therefore, in actual shooting, it is desirable to estimate the blur area more accurately by also using other imaging conditions. For this reason, the third embodiment uses the number of AF in-focus points. The number of AF in-focus points is the number of points at which the subject was in focus when the image was captured with autofocus. When the number of in-focus points is greater than a predetermined threshold, the in-focus area is considered wide and the blur area narrow. Conversely, when the number of in-focus points is equal to or less than the predetermined threshold, the in-focus area is considered narrow and the blur area wide. Therefore, by using the number of AF in-focus points included in the imaging conditions together with the blur evaluation value, the area in which blur occurs can be estimated more accurately.

FIG. 5 is a flowchart showing the flow of the method of determining the inference parameter in the third embodiment. The inference parameter determination unit 104 acquires, from the metadata acquisition unit 102, the aperture value, the focal length, and the number of AF in-focus points obtained from the coded data (S501). The inference parameter determination unit 104 calculates the blur evaluation value based on the aperture value and the focal length acquired in S501 (S502). The inference parameter determination unit 104 compares the blur evaluation value calculated in S502 with a predetermined threshold T1 and determines whether the blur evaluation value is smaller than the predetermined threshold T1 (S503).

If the blur evaluation value is smaller than the predetermined threshold T1, the determination in S503 is Yes. In this case, the inference parameter determination unit 104 determines the learned inference parameter for a small blur area as the inference parameter to be used in the restoration process performed by the deterioration restoration processing unit 103. The learned inference parameter for a small blur area corresponds to a sixth inference parameter, which is used when the area in which blur occurs is smaller than a predetermined area. If the blur evaluation value is equal to or greater than the threshold T1, the determination in S503 is No. In this case, the inference parameter determination unit 104 compares the number of AF in-focus points with a predetermined threshold T2 and determines whether the number of AF in-focus points is greater than the predetermined threshold T2 (S505).

If the number of AF in-focus points is greater than the predetermined threshold T2, the determination in S505 is Yes. In this case, the blur area is considered narrow, so the flow proceeds to S504. On the other hand, if the number of AF in-focus points is equal to or less than the predetermined threshold T2, the determination in S505 is No. In this case, the blur area is considered wide. Accordingly, the inference parameter determination unit 104 determines the learned inference parameter for a large blur area as the inference parameter to be used in the restoration process performed by the deterioration restoration processing unit 103. The learned inference parameter for a large blur area corresponds to a seventh inference parameter, which is used when the area in which blur occurs is equal to or larger than the predetermined area.
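The two-stage decision of FIG. 5 can be sketched as below. The function name, the string labels, and the threshold values in the usage lines are hypothetical; the description leaves T1 and T2 as predetermined values.

```python
def select_blur_parameter(
    focal_length_mm: float,
    aperture_value: float,
    af_in_focus_points: int,
    t1: float,
    t2: int,
) -> str:
    """Return a label for the inference parameter chosen by the flow of FIG. 5."""
    blur_value = focal_length_mm / aperture_value  # S502: B = D / F
    if blur_value < t1:                            # S503 Yes: blur unlikely
        return "small_blur_area"
    if af_in_focus_points > t2:                    # S505 Yes: wide in-focus area
        return "small_blur_area"
    return "large_blur_area"                       # S505 No: narrow in-focus area


# Hypothetical thresholds T1 = 20 and T2 = 9
print(select_blur_parameter(24.0, 8.0, 1, t1=20.0, t2=9))    # small_blur_area
print(select_blur_parameter(200.0, 2.8, 30, t1=20.0, t2=9))  # small_blur_area
print(select_blur_parameter(200.0, 2.8, 3, t1=20.0, t2=9))   # large_blur_area
```

The large-blur parameter is chosen only when both checks point to blur, which reflects the concern that the blur evaluation value alone can overestimate the blurred area.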

Thus, in the third embodiment, two inference parameters are generated: an inference parameter for a large blur area and an inference parameter for a small blur area. The inference parameter determination unit 104 determines the inference parameter to be used for restoring the coding deterioration according to the aperture value, the focal length, and the number of AF in-focus points, and applies the determined inference parameter to the neural network. This makes it possible to restore the coding deterioration with high accuracy even when the blur area differs. That is, when a coded image is restored, an image of good quality can be obtained regardless of the imaging condition (blur area).

The number of inference parameters applied to the deterioration restoration processing unit 103 may be three or more. The third embodiment may be applied alone or together with both the first and second embodiments. The third embodiment may also be applied together with either the first embodiment or the second embodiment.

<Fourth Embodiment>
Next, the fourth embodiment will be described. The image processing device 100 of the fourth embodiment restores coding deterioration while taking into account combinations of a plurality of imaging conditions. If inference parameters were generated to cover every combination of the camera's imaging conditions, the learning time needed to generate them would become very long. The image processing device 100 of the fourth embodiment therefore does not perform learning for every combination of imaging conditions, but narrows down the combinations, performs learning on those, and generates the inference parameters. In this way, the coding deterioration can still be restored with high accuracy. The imaging conditions used in the fourth embodiment are the exposure compensation value, the ISO value, the aperture value, and the focal length. These are the imaging conditions described in the first, second, and third embodiments. The image processing device 100 of the fourth embodiment has the same configuration as the image processing device 100 of the first, second, and third embodiments, so its description is omitted. The learning method of the neural network is also the same, so its description is omitted.

As described above, imaging conditions can be combined in many ways, so generating inference parameters to cover every combination would require an enormous amount of learning time. However, some of these combinations are redundant. Consider the condition of high sensitivity combined with a small blur area. In an image captured at high sensitivity, random noise is added over the entire frame. In this case, much of the information in the frame is buried in random noise, so almost no image information related to blur remains.

In other words, the high-sensitivity imaging condition has a relatively greater influence on image quality than the small-blur-area imaging condition. It is therefore unnecessary to generate a separate inference parameter for the combination of high sensitivity and small blur area; generating the high-sensitivity inference parameter of the second embodiment is sufficient. In addition, the small-blur-area imaging condition has a low influence on image quality and does not drastically change the spatial frequency within the frame. It is therefore also unnecessary to generate the inference parameter for a small blur area of the third embodiment. As described above, excluding imaging conditions that have a low influence on image quality (imaging conditions whose influence on image quality is lower than a predetermined degree) eliminates the need to generate unnecessary inference parameters.

The image processing device 100 of the fourth embodiment generates inference parameters in consideration of the degree to which each imaging condition influences image quality. Imaging conditions with a high influence on image quality are difficult to restore, so the image processing device 100 generates an optimized inference parameter for each of them using learning images captured under that condition. Imaging conditions with a low influence on image quality, on the other hand, are easy to restore, so the image processing device 100 generates a general-purpose inference parameter using learning images in which a plurality of such low-influence imaging conditions are mixed. The generated inference parameters are three dedicated inference parameters, namely the underexposure inference parameter, the high-sensitivity inference parameter, and the large-blur-area inference parameter, plus the general-purpose inference parameter. The general-purpose inference parameter is generated using, as learning images, images that satisfy none of the conditions of underexposure, high sensitivity, and large blur area.

The three imaging conditions above, underexposure, high sensitivity, and large blur area, each influence image quality to a different degree. The high-sensitivity condition affects the entire frame and raises both brightness and spatial frequency. The underexposure condition affects the entire frame and lowers brightness. The large-blur-area condition affects local regions of the frame and lowers spatial frequency; it is local because blur is unlikely to occur over the entire frame unless focusing fails at the time of imaging. Thus, in terms of brightness, spatial frequency, and affected area, the influence on image quality is greatest for high sensitivity, followed by underexposure, and then large blur area. In the fourth embodiment, therefore, when the inference parameter is determined, the imaging conditions are checked in descending order of their influence on image quality, and the first one that applies is selected.

FIG. 6 is a flowchart showing the flow of the method of determining the inference parameter in the fourth embodiment. Description of the parts that overlap with the first to third embodiments is omitted. The inference parameter determination unit 104 acquires the exposure compensation value, the ISO value, the aperture value, the focal length, and the number of AF in-focus points that the metadata acquisition unit 102 obtained from the coded data (S601). The inference parameter determination unit 104 calculates the sensor pixel pitch and the blur evaluation value (S602). The calculation of the sensor pixel pitch corresponds to S402 of the second embodiment, and the calculation of the blur evaluation value corresponds to S502 of the third embodiment. The inference parameter determination unit 104 updates the threshold T0 (S603). The threshold T0 corresponds to the threshold T of the second embodiment, and the processing of S603 corresponds to S403 of the second embodiment.

The inference parameter determination unit 104 determines whether the ISO value of the RAW image to be restored is smaller than the threshold T0 updated in S603 (S604). S604 corresponds to S404 of the second embodiment. If the ISO value is equal to or greater than the threshold T0, the determination in S604 is No. In this case, the inference parameter determination unit 104 determines the high-sensitivity inference parameter as the inference parameter to be used in the restoration process performed by the deterioration restoration processing unit 103 (S605). S605 corresponds to S406 of the second embodiment. If the ISO value is smaller than the threshold T0, the determination in S604 is Yes. In this case, the inference parameter determination unit 104 determines whether the exposure compensation value acquired in S601 is smaller than "0" (S606). The processing of S606 corresponds to S302 of the first embodiment.

If the exposure compensation value is smaller than "0", the determination in S606 is Yes. In this case, the inference parameter determination unit 104 determines the underexposure inference parameter as the inference parameter to be used in the restoration process performed by the deterioration restoration processing unit 103 (S607). S607 corresponds to S303 of the first embodiment. If the exposure compensation value is "0" or greater, the determination in S606 is No. In this case, the inference parameter determination unit 104 calculates the blur evaluation value based on the aperture value and the focal length acquired in S601 (S608). The inference parameter determination unit 104 determines whether the blur evaluation value calculated in S608 is smaller than the predetermined threshold T1 (S609). S609 corresponds to S503 of the third embodiment. If the blur evaluation value is equal to or greater than the threshold T1, the determination in S609 is No. In this case, the inference parameter determination unit 104 determines whether the number of AF in-focus points acquired in S601 is greater than the predetermined threshold T2 (S610). S610 corresponds to S505 of the third embodiment.

If the number of AF in-focus points is equal to or less than the predetermined threshold T2, the determination in S610 is No. In this case, the inference parameter determination unit 104 determines the large-blur-area inference parameter as the inference parameter to be used in the restoration process performed by the deterioration restoration processing unit 103 (S611). S611 corresponds to S506 of the third embodiment. If the blur evaluation value is smaller than the predetermined threshold T1, or if the number of AF in-focus points is greater than the predetermined threshold T2, the inference parameter determination unit 104 determines the general-purpose inference parameter as the inference parameter to be used in the restoration process performed by the deterioration restoration processing unit 103 (S612). The inference parameter determination unit 104 outputs the inference parameter determined in S605, S607, S611, or S612 (S613).
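The priority order of FIG. 6 (high sensitivity, then underexposure, then large blur area, otherwise general purpose) can be sketched as a chain of early returns. The `ShootingMetadata` class, the function name, the string labels, and the threshold values in the usage lines are all hypothetical, and the ISO threshold is assumed to have been scaled already as in S603.

```python
from dataclasses import dataclass


@dataclass
class ShootingMetadata:
    iso_value: float
    exposure_compensation: float
    aperture_value: float
    focal_length_mm: float
    af_in_focus_points: int


def select_parameter(meta: ShootingMetadata, t0: float, t1: float, t2: int) -> str:
    """Return a label for the inference parameter chosen by the flow of FIG. 6."""
    if meta.iso_value >= t0:                      # S604 No -> S605
        return "high_sensitivity"
    if meta.exposure_compensation < 0:            # S606 Yes -> S607
        return "underexposure"
    blur_value = meta.focal_length_mm / meta.aperture_value  # S608: B = D / F
    if blur_value >= t1 and meta.af_in_focus_points <= t2:   # S609 No and S610 No -> S611
        return "large_blur_area"
    return "general_purpose"                      # S612


# Hypothetical thresholds: T0 = 800, T1 = 20, T2 = 9
noisy_and_dark = ShootingMetadata(3200, -1.0, 2.8, 200.0, 3)
print(select_parameter(noisy_and_dark, t0=800, t1=20.0, t2=9))  # high_sensitivity
```

Because the checks return early, an image that satisfies several conditions at once receives only the parameter for the condition with the highest influence on image quality, so no parameter is needed for the combinations.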

As described above, in the fourth embodiment, for imaging conditions with a high influence on image quality, the inference parameter determination unit 104 decides to apply to the neural network a dedicated inference parameter optimized for that imaging condition. For imaging conditions with a low influence on image quality, it decides to apply the general-purpose inference parameter to the neural network. The inference parameter determination unit 104 gives priority to imaging conditions in descending order of their influence on image quality (imaging conditions whose influence on image quality is equal to or higher than a predetermined degree) when deciding which inference parameter to apply to the neural network. This suppresses degradation of the image restoration effect and removes the need to perform learning and generate an inference parameter for every combination of imaging conditions. The learning time for the inference parameters can therefore be reduced significantly. In the fourth embodiment, imaging conditions other than the four conditions of exposure compensation value, ISO value, aperture value, and focal length may be added.

<Fifth Embodiment>
In the first to fourth embodiments described above, coding deterioration is restored by applying an inference parameter according to the imaging conditions indicated by the metadata included in the coded data. In the fifth embodiment, the PSNR (peak signal-to-noise ratio) of the RAW image to be restored is additionally recorded as metadata of the coded data, and the coding deterioration is restored by applying an inference parameter optimized according to the PSNR. The PSNR is explained here. In the coding field, an index called PSNR is used as a measure of image reproducibility. A larger PSNR indicates higher image reproducibility, and a smaller PSNR indicates lower image reproducibility. That is, an image with a low PSNR has likely lost data in the coding process, with a large drop in image quality. Therefore, to enhance the restoration effect, it is desirable to perform learning separately using learning images that match the PSNR, which indicates the degree of image quality deterioration, and to optimize the inference parameters.

The configuration of the image processing device 100 and the neural network learning method in the fifth embodiment are the same as in the first to fourth embodiments, so their description is omitted. The image processing device 100 of the fifth embodiment uses the PSNR as imaging information. When the PSNR is lower than a predetermined threshold value, the device restores the image by applying an inference parameter generated using learning images with a low PSNR. When the PSNR is equal to or higher than the predetermined threshold value, image reproducibility is considered high and image quality deterioration small. In this case, the image processing device 100 of the fifth embodiment does not perform image restoration using an inference parameter. FIG. 7 is a diagram illustrating a method of embedding the PSNR in the metadata of the encoded data.

Each unit shown in FIG. 7 is included in the image processing device 100. The storage unit 704A stores an uncompressed RAW image. The storage unit 704B stores the encoded data. The storage unit 704C stores the decoded RAW image. Each storage unit shown in FIG. 7 may be configured integrally with the storage units described in the first embodiment, or may be provided separately. The RAW coding unit 701 compresses and encodes the uncompressed RAW image to generate the encoded data. The RAW coding unit 701 corresponds to the RAW coding unit 201 of the first embodiment. The RAW decoding unit 702 decodes the encoded data to generate the decoded RAW image. The RAW decoding unit 702 corresponds to the RAW decoding unit 101 of the first embodiment. The PSNR calculation unit 703 acquires the decoded RAW image and the uncompressed RAW image and calculates the PSNR. Where MAX is the maximum pixel value that the original image can take and MSE is the mean squared error between the image under evaluation and the original image, the PSNR is represented by the following "Equation 1".
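Equation 1 itself is not reproduced in this text. With MAX and MSE as defined above, the standard definition of PSNR in decibels, which is presumably what Equation 1 expresses, can be written as follows. The symbols N (number of pixels), x_i (pixel values of the original image), and y_i (pixel values of the image under evaluation) are introduced here for illustration only.

```latex
\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{\mathrm{MAX}^{2}}{\mathrm{MSE}}\right),
\qquad
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - y_i\right)^{2}
```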

The calculated PSNR is recorded in the metadata unit 902 of the encoded data.
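The calculation and recording steps can be sketched in Python as below. This is a minimal sketch under stated assumptions: images are flat sequences of integer pixel values, the metadata is a plain dictionary, and the key name "psnr", the function names, and the default 14-bit depth are hypothetical choices not taken from the source. The PSNR formula used is the standard definition in decibels.

```python
import math


def compute_psnr(original, decoded, max_value):
    """Standard PSNR in dB between two equally sized sequences of pixel values."""
    if len(original) != len(decoded) or not original:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error between the original and the decoded image.
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no coding loss
    return 10.0 * math.log10(max_value ** 2 / mse)


def record_psnr(metadata, original, decoded, bit_depth=14):
    """Return a copy of the metadata with a 'psnr' entry added (hypothetical key)."""
    updated = dict(metadata)
    updated["psnr"] = compute_psnr(original, decoded, (1 << bit_depth) - 1)
    return updated
```

Note that the PSNR can only be computed at encoding time, when the uncompressed RAW image is still available, which is why it has to travel with the encoded data as metadata.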

FIG. 8 is a flowchart showing the flow of the method of determining the inference parameter in the fifth embodiment. The inference parameter determination unit 104 acquires, from the metadata acquisition unit 102, the PSNR obtained from the encoded data (S801). The inference parameter determination unit 104 determines whether the PSNR acquired in S801 is lower than a predetermined threshold value T3 (S802). If the PSNR is lower than the predetermined threshold value T3, the determination in S802 is Yes. In this case, the inference parameter determination unit 104 determines the low-PSNR inference parameter as the inference parameter to be used in the restoration processing performed by the deterioration restoration processing unit 103 (S803).

The low-PSNR inference parameter is an inference parameter generated by performing learning individually with learning images that have a low PSNR. If the PSNR is equal to or higher than the predetermined threshold value T3, the determination in S802 is No. In this case, because the PSNR is high, the low-PSNR inference parameter is not applied to the neural network of the deterioration restoration processing unit 103.

In other words, the inference parameter determination unit 104 controls whether an inference parameter is applied to the neural network of the deterioration restoration processing unit 103. This makes it possible to enhance the restoration effect for coding deterioration even when the degree of coding deterioration varies. Any number of PSNR-specific inference parameter patterns may be provided. Although an example using PSNR as the index value for quantifying image quality has been described, other index values may be used.
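The flow of S801 to S803 can be sketched as follows. This is an illustrative sketch only: the threshold value, the metadata key "psnr", the parameter-set name "low_psnr", and the representation of the neural network as a callable looked up by name are assumptions introduced here.

```python
T3_DB = 40.0  # hypothetical threshold in dB; the text gives no value for T3


def decide_inference_parameter(metadata, threshold=T3_DB):
    """Return the name of the parameter set to apply, or None to skip restoration."""
    psnr = metadata["psnr"]  # S801: PSNR read from the metadata
    if psnr < threshold:     # S802: is the PSNR lower than T3?
        return "low_psnr"    # S803: use the low-PSNR inference parameter
    return None              # PSNR is high enough: no inference parameter applied


def restore(decoded_image, metadata, networks, threshold=T3_DB):
    """Run restoration only when a parameter set was selected."""
    name = decide_inference_parameter(metadata, threshold)
    if name is None:
        return decoded_image  # left unchanged
    return networks[name](decoded_image)
```

Returning None rather than a default parameter set mirrors the point made in the text: the determination unit decides not only which parameter to use but whether restoration runs at all.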

Although preferred embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and various modifications and changes can be made within the scope of its gist. In the embodiments described above, the deterioration restoration processing unit 103 is included in the image processing device 100, but the processing of the deterioration restoration processing unit 103 may instead be executed by an external device. In that case, the decoded RAW image decoded by the RAW decoding unit 101 and the inference parameter determined by the inference parameter determination unit 104 are transmitted to an external device such as a cloud server via a communication unit (not shown) of the image processing device 100. The external device then executes the processing of the deterioration restoration processing unit 103 using the decoded RAW image and the inference parameter, and transmits the RAW image with the deterioration restored to the image processing device 100. Further, the image processing device 100 of the present embodiment acquires the encoded data and obtains the decoded RAW image by performing decoding processing with the RAW decoding unit 101. However, the image processing device 100 may omit the RAW decoding unit 101, acquire the decoded RAW image and its metadata from an external device, and perform the restoration processing with the deterioration restoration processing unit 103 and the inference parameter determination unit 104 to obtain the deterioration-restored RAW image.

The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. The present invention can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

100 Image processing device
101 RAW decoding unit
102 Metadata acquisition unit
103 Deterioration restoration processing unit
104 Inference parameter determination unit
201 RAW coding unit
202 Image quality comparison unit
203 Inference parameter update unit
703 PSNR calculation unit

Claims

An image processing apparatus comprising:
determination means for determining an inference parameter used to obtain an image in which deterioration due to encoding is restored, by performing inference based on the inference parameter on a decoded image obtained by decoding an encoded image; and
acquisition means for acquiring an imaging condition of the image,
wherein the determination means determines the inference parameter according to the imaging condition acquired by the acquisition means.

The image processing apparatus according to claim 1, wherein the determination means determines the inference parameter based on the result of comparing the imaging condition with the threshold value.

The image processing apparatus according to claim 2, further comprising second acquisition means for acquiring information about the imaging means that captured the image,
wherein the determination means determines the threshold value according to the information about the imaging means.

The image processing apparatus according to any one of claims 1 to 3, further comprising inference means for inferring an image in which deterioration due to encoding is restored, based on the inference parameter determined by the determination means and the decoded image.

The image processing apparatus according to any one of claims 1 to 4, further comprising decoding means for decoding the encoded image.

The image processing apparatus according to any one of claims 1 to 5, wherein the imaging condition includes an exposure compensation value,
a plurality of inference parameters include a first inference parameter for underexposure, a second inference parameter for proper exposure, and a third inference parameter for overexposure, and
the determination means determines any one of the first inference parameter, the second inference parameter, and the third inference parameter according to the exposure compensation value.

The image processing apparatus according to any one of claims 1 to 6, wherein the imaging condition includes an ISO value,
a plurality of inference parameters include a fourth inference parameter used when the ISO value is smaller than a predetermined value and a fifth inference parameter used when the ISO value is equal to or greater than the predetermined value, and
the determination means determines one of the fourth inference parameter and the fifth inference parameter according to the ISO value included in the imaging condition.

The image processing apparatus according to claim 7, wherein the predetermined value is updated based on a pixel pitch calculated from the number of pixels of the image sensor and the size of the image sensor.

The image processing apparatus according to any one of claims 1 to 8, wherein the imaging condition includes an aperture value, a focal length, and the number of AF focal points,
a plurality of inference parameters include a sixth inference parameter used when the area where blur occurs is smaller than a predetermined area and a seventh inference parameter used when the area where blur occurs is equal to or larger than the predetermined area, and
the determination means determines one of the sixth inference parameter and the seventh inference parameter according to the aperture value, the focal length, and the number of AF focal points included in the imaging condition.

The image processing apparatus according to any one of claims 1 to 9, wherein the determination means determines corresponding inference parameters in order from an imaging condition having a high degree of influence on image quality.

The image processing apparatus according to claim 10, wherein inference parameters for imaging conditions whose degree of influence on image quality is lower than a predetermined degree are not individually generated.

The image processing apparatus according to claim 11, wherein the inference parameter for an imaging condition whose degree of influence on the image quality is lower than the predetermined degree is learned using learning images in which a plurality of imaging conditions are mixed.

The image processing apparatus according to claim 12, wherein the determination means determines, according to the imaging condition, one of: an inference parameter for underexposure; an inference parameter used when the ISO value is equal to or greater than a predetermined value; an inference parameter used when the area where blur occurs is equal to or larger than a predetermined area; and the inference parameter learned using the learning images in which the plurality of imaging conditions are mixed.

The image processing apparatus according to claim 13, wherein the determination means preferentially determines the inference parameter in the following order: the inference parameter used when the ISO value is equal to or greater than the predetermined value, the inference parameter for underexposure, and the inference parameter used when the area where blur occurs is equal to or larger than the predetermined area.

The image processing apparatus according to any one of claims 1 to 14, wherein the determination means determines whether or not to apply the inference parameter to the learning model according to an index value for quantifying the image quality.

The image processing apparatus according to any one of claims 1 to 15, wherein the inference is performed by a neural network, and
the inference parameter is a weight and a bias obtained by learning of the neural network.

The image processing apparatus according to any one of claims 1 to 16, wherein the encoded image is a RAW image.

An imaging device comprising:
an imaging unit; and
the image processing apparatus according to any one of claims 1 to 17.

An image processing method comprising:
a determination step of determining an inference parameter used to obtain an image in which deterioration due to encoding is restored, by performing inference based on the inference parameter on a decoded image obtained by decoding an encoded image; and
an acquisition step of acquiring an imaging condition of the image,
wherein, in the determination step, the inference parameter is determined according to the acquired imaging condition.

A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 17.