JP2024077434A - Image processing device, image processing method, program, and storage medium

Info

Publication number: JP2024077434A
Application number: JP2022189529A
Authority: JP
Inventors: 法史樫山; 徹小池; 竣介川原
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2022-11-28
Filing date: 2022-11-28
Publication date: 2024-06-07

Abstract

[Problem] To provide an image processing device capable of suppressing the decrease in computational accuracy caused by quantization in deep learning. [Solution] The image processing device includes: an acquisition means for acquiring an evaluation value representing a feature of an output image obtained by inputting a training image, used for training a neural network, into the neural network; a generation means for generating, based on the evaluation value, an input image that represents the feature of the output image in simplified form; and a reduction means for inputting the generated input image into the neural network and reducing the number of bits of the parameters that constitute the layers of the neural network. [Selected Figure] FIG. 2

Description

本発明は、ニューラルネットワークの層を構成するパラメータのビット数を削減する技術に関するものである。 The present invention relates to a technology for reducing the number of bits in the parameters that make up the layers of a neural network.

近年、ディープラーニングを用いた画像処理技術が数多く提案されている。一般的にディープラーニングの推論に利用されているニューラルネットワークの演算には多大な時間がかかる。そのため、ニューラルネットワークの演算におけるメモリ使用量の削減や推論速度の高速化のために、ニューラルネットワークの軽量化が求められている。 In recent years, many image processing technologies using deep learning have been proposed. Generally, the calculations of neural networks used for deep learning inference take a long time. Therefore, there is a demand for lightweight neural networks to reduce memory usage in neural network calculations and increase inference speed.

ニューラルネットワークの軽量化の手法の一つとして、ニューラルネットワークの層を構成するパラメータのビット数を削減する手法が提案されている。ここではその手法を量子化と呼ぶ。特に、ニューラルネットワークが組み込まれる組み込み機器（カメラ、自動車など）では、ニューラルネットワークに用いることができる計算リソースが制限されるので、ニューラルネットワークの軽量化のアプローチとして量子化が多く採用されている。 One method that has been proposed for making neural networks lighter is to reduce the number of bits in the parameters that make up the layers of the neural network. Here, this method is called quantization. In particular, in embedded devices (cameras, automobiles, etc.) in which neural networks are incorporated, the computing resources available for the neural network are limited, so quantization is often used as an approach to making neural networks lighter.
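As a rough illustration of what reducing the number of bits of a parameter means, the following sketch maps floating-point weights to signed 8-bit codes with a single shared scale. This is a generic symmetric scheme chosen for illustration, not the procedure of this publication; the function names, the 8-bit target, and the sample weights are all assumptions.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit codes using one shared scale (symmetric scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [code * scale for code in codes]


weights = [0.50, -1.27, 0.031, 0.0]  # hypothetical FLOAT32 weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The gap between `weights` and `restored` is the kind of accuracy loss that the description sets out to suppress.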

また、特許文献１には、量子化によるニューラルネットワークの演算精度の低下の影響度を導出する方法が開示されている。 Patent document 1 also discloses a method for deriving the degree of impact of a decrease in the computational accuracy of a neural network due to quantization.

Japanese Patent Laid-Open No. 2018-142049 (JP 2018-142049 A)

しかしながら、上記の従来技術では、量子化による演算精度の低下のばらつきを確認することはできるものの、演算精度の低下のばらつきを抑制することができない。そのため、ノイズ除去や超解像などの画像回復を目的とするディープラーニングにおいて、量子化による演算精度の低下のばらつきが大きくなるという問題があった。 However, with the above conventional technology, although it is possible to confirm the variation in the decrease in calculation accuracy due to quantization, it is not possible to suppress the variation in the decrease in calculation accuracy. Therefore, in deep learning aimed at image restoration such as noise removal and super-resolution, there is a problem in that the variation in the decrease in calculation accuracy due to quantization becomes large.

本発明は上述した課題に鑑みてなされたものであり、その目的は、ディープラーニングにおける量子化による演算精度の低下を抑制できる画像処理装置を提供することである。 The present invention has been made in consideration of the above-mentioned problems, and its purpose is to provide an image processing device that can suppress the decrease in calculation accuracy due to quantization in deep learning.

本発明に係わる画像処理装置は、ニューラルネットワークの学習に使用される訓練画像を前記ニューラルネットワークに入力して得られた出力画像の特徴を表す評価値を取得する取得手段と、前記評価値に基づいて、前記出力画像の特徴を簡易的に表わす入力画像を生成する生成手段と、生成された前記入力画像を前記ニューラルネットワークに入力して、前記ニューラルネットワークの層を構成するパラメータのビット数を削減する削減手段と、を備えることを特徴とする。 The image processing device according to the present invention is characterized by comprising: an acquisition means for acquiring an evaluation value representing the characteristics of an output image obtained by inputting a training image used for learning the neural network into the neural network; a generation means for generating an input image that simply represents the characteristics of the output image based on the evaluation value; and a reduction means for inputting the generated input image into the neural network and reducing the number of bits of the parameters that constitute the layers of the neural network.

本発明によれば、ディープラーニングにおける量子化による演算精度の低下を抑制することが可能となる。 The present invention makes it possible to suppress the decrease in computational accuracy caused by quantization in deep learning.

FIG. 1 is a diagram showing the configuration of an image processing device according to a first embodiment.
FIG. 2 is a flowchart showing the quantization process in the first embodiment.
FIG. 3 is a flowchart showing the initialization processing of the neural network in the first embodiment.
FIG. 4 is a conceptual diagram showing the structure of the neural network in the first embodiment.
FIG. 5 is a flowchart showing the processing of the neural network in the first embodiment.
FIG. 6 is a diagram explaining the creation of images separated into color channels from a RAW image.
FIG. 7 is a conceptual diagram explaining the convolution operation.
FIG. 8 is a diagram explaining the creation of a RAW image from images separated into color channels.
FIG. 9 is a conceptual diagram explaining the evaluation values relating to pixels.
FIG. 10 is a flowchart showing the quantization processing in the first embodiment.
FIG. 11 is a flowchart showing the quantization process in a second embodiment.

以下、添付図面を参照して実施形態を詳しく説明する。なお、以下の実施形態は特許請求の範囲に係る発明を限定するものではない。実施形態には複数の特徴が記載されているが、これらの複数の特徴の全てが発明に必須のものとは限らず、また、複数の特徴は任意に組み合わせられてもよい。さらに、添付図面においては、同一若しくは同様の構成に同一の参照番号を付し、重複した説明は省略する。 The following embodiments are described in detail with reference to the attached drawings. Note that the following embodiments do not limit the invention according to the claims. Although the embodiments describe multiple features, not all of these multiple features are necessarily essential to the invention, and multiple features may be combined in any manner. Furthermore, in the attached drawings, the same reference numbers are used for the same or similar configurations, and duplicate explanations are omitted.

<First Embodiment>
FIG. 1 is a diagram showing the configuration of an image processing device 100 that performs neural network learning and quantization according to a first embodiment of the present invention.

図１において、画像処理装置１００は、ストレージ駆動部１０３、ＲＯＭ１０４、画像処理部１０５、ＲＡＭ１０６、ＣＰＵ１０７、ＧＰＵ１０８を備え、それらが内部バス１０９で接続されて構成されている。画像処理装置１００を構成する各構成要素は、内部バス１０９を介して互いにデータのやりとりを行うことができる。 In FIG. 1, the image processing device 100 includes a storage drive unit 103, a ROM 104, an image processing unit 105, a RAM 106, a CPU 107, and a GPU 108, which are connected via an internal bus 109. Each component of the image processing device 100 can exchange data with each other via the internal bus 109.

ストレージ装置１０１は、膨大な画像データを学習用画像として記憶させるため、及び学習時に作成したネットワークパラメータを記憶させるために用いられる。また、量子化に用いる画像を記憶させるため、量子化時に作成したネットワークパラメータを記憶させるためなどに用いてもよい。画像処理装置１００は、ストレージ接続部１０２、ストレージ駆動部１０３を介して、ストレージ装置１０１とデータのやり取りを行う。 The storage device 101 is used to store huge amounts of image data as learning images and to store network parameters created during learning. It may also be used to store images to be used for quantization, to store network parameters created during quantization, and so on. The image processing device 100 exchanges data with the storage device 101 via the storage connection unit 102 and storage drive unit 103.

画像処理部１０５は、ストレージ装置１０１から読み出された画像に対して、様々な画像処理を行う。また、画像処理部１０５の処理は、ＣＰＵ１０７により制御されることで実行される。例えば、ニューラルネットワークの学習時の訓練画像や量子化時の入力画像を生成する処理を行う。具体的には、ノイズ除去を目的としたニューラルネットワークであれば、ノイズ付加処理などを行い、超解像を目的としたニューラルネットワークであれば、サイズ縮小などの劣化処理を行う。また、画像の輝度、色に関する補正を行ったり、画像のヒストグラム算出などを行うこともできる。 The image processing unit 105 performs various image processing on the image read from the storage device 101. The processing of the image processing unit 105 is also executed under the control of the CPU 107. For example, the image processing unit 105 performs processing to generate training images for learning a neural network and input images for quantization. Specifically, if the neural network is intended for noise removal, it performs noise addition processing, and if the neural network is intended for super-resolution, it performs degradation processing such as size reduction. It can also correct the brightness and color of the image and calculate the histogram of the image.

画像処理装置１００の各機能を制御するためのユニットとして、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）１０７が配置されており、主に画像処理や量子化を行うために使用される。ＣＰＵ１０７を駆動するために、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）１０４及びＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）１０６が接続されている。 A CPU (Central Processing Unit) 107 is arranged as a unit for controlling each function of the image processing device 100, and is mainly used for image processing and quantization. A ROM (Read Only Memory) 104 and a RAM (Random Access Memory) 106 are connected to drive the CPU 107.

The RAM 106 is a volatile memory element that temporarily stores image data and calculation results, which can be read out when needed. In recent years, DDR4-SDRAM (Double Data Rate 4 Synchronous Dynamic RAM) and the like are often used.

ＲＯＭ１０４は不揮発性の素子であり、ＣＰＵ１０７を動作させるためのプログラムや、各種調整パラメータなどが記憶されている。ＲＯＭ１０４から読み出されたプログラムは、揮発性のＲＡＭ１０６に展開されて実行される。 ROM 104 is a non-volatile element that stores programs for operating CPU 107, various adjustment parameters, etc. Programs read from ROM 104 are deployed to volatile RAM 106 and executed.

ＧＰＵ１０８は、ディープラーニングによるニューラルネットワークの処理を行うために使用される。学習時には膨大な計算を並列して処理することが必要とされるため、ＣＰＵに比べて並列処理能力の高いＧＰＵを用いることが好適である。また、量子化を行うために使用してもよい。 The GPU 108 is used to process neural networks using deep learning. Since learning requires processing a huge amount of calculations in parallel, it is preferable to use a GPU, which has higher parallel processing capabilities than a CPU. It may also be used to perform quantization.

次に、図２～図１０を参照して、画像処理装置１００で行う量子化工程について説明する。 Next, the quantization process performed by the image processing device 100 will be described with reference to Figures 2 to 10.

図２は、本実施形態における量子化工程を示すフローチャートである。図３は、図２のステップＳ２０１をより詳細に説明した、ニューラルネットワークの初期化の処理を示すフローチャートである。また、図４は、ニューラルネットワークの構造を示す概念図である。図５は、図２のステップＳ２０４をより詳細に説明した、ニューラルネットワークの処理を示すフローチャートである。 Figure 2 is a flowchart showing the quantization process in this embodiment. Figure 3 is a flowchart showing the process of initializing the neural network, explaining step S201 in Figure 2 in more detail. Figure 4 is a conceptual diagram showing the structure of the neural network. Figure 5 is a flowchart showing the process of the neural network, explaining step S204 in Figure 2 in more detail.

図２において、まずステップＳ２０１では、ＣＰＵ１０７は、ニューラルネットワークの構造に関するデータとパラメータを初期化し、ＧＰＵ１０８に展開する。構造に関するデータとパラメータについては後述する。なお、構造に関するデータとパラメータは、ストレージ装置１０１に保存したり、ＲＡＭ１０６に展開してもよい。ステップＳ２０１で行われるニューラルネットワークの処理に関して、図３のフローチャートを用いて説明する。 In FIG. 2, first, in step S201, the CPU 107 initializes data and parameters related to the structure of the neural network, and loads them in the GPU 108. The data and parameters related to the structure will be described later. The data and parameters related to the structure may be stored in the storage device 101, or loaded in the RAM 106. The neural network processing performed in step S201 will be described using the flowchart in FIG. 3.

まず、ステップＳ３０１では、ＣＰＵ１０７は、ニューラルネットワークの構造に関するデータを作成する。本実施形態では、ＣＮＮ（ＣｏｎｖｏｌｕｔｉｏｎａｌＮｅｕｒａｌＮｅｔｗｏｒｋ）を用いる場合を例に、図４を用いて説明する。 First, in step S301, the CPU 107 creates data regarding the structure of a neural network. In this embodiment, an example in which a CNN (Convolutional Neural Network) is used will be described with reference to FIG. 4.

図４において、ニューラルネットワーク６０１は、複数のニューロン６０３を含む入力層６０５、隠れ層６０６、出力層６０７の階層構造を有し、各層のニューロンのノードを結合して構成されている。ニューラルネットワーク６０１では、入力ノード６０２からデータが入力されると、複数のニューロン６０３で計算が行われ、計算結果が出力ノード６０４に出力される。各ニューロンで計算される計算式の例を式（１）および式（２）に示す。 In FIG. 4, neural network 601 has a hierarchical structure of input layer 605 containing multiple neurons 603, hidden layer 606, and output layer 607, and is configured by connecting neuron nodes in each layer. When data is input from input node 602 in neural network 601, calculations are performed in multiple neurons 603, and the calculation results are output to output node 604. Examples of calculation formulas calculated by each neuron are shown in formula (1) and formula (2).

Y = f(Z) ... (1)
Z = W × X + b ... (2)
Here, Y in equation (1) is the output data of the neuron, and f(Z) is the activation function. In equation (2), Z is the input data of the activation function in equation (1), W is the weight, X is the input data of the neuron, and b is the bias.
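Equations (1) and (2) can be sketched for a single neuron as follows. The sketch assumes that W × X denotes a weighted sum over several inputs and uses a Leaky ReLU with coefficient 0.01 as the activation function; the function names and numeric values are illustrative assumptions.

```python
def leaky_relu(z, a=0.01):
    """Activation f(Z); the coefficient a = 0.01 is an assumed value."""
    return max(z, z * a)


def neuron_output(weights, inputs, bias, activation=leaky_relu):
    """Compute Y = f(Z) with Z = W x X + b for one neuron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias  # equation (2)
    return activation(z)                                    # equation (1)


# Z = 0.5*2.0 + (-1.0)*1.0 + 0.5 = 0.5, which passes through unchanged.
y_pos = neuron_output([0.5, -1.0], [2.0, 1.0], 0.5)
# Z = 0.5*2.0 + (-1.0)*4.0 + 0.0 = -3.0, which is scaled by a.
y_neg = neuron_output([0.5, -1.0], [2.0, 4.0], 0.0)
```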

Here, the data on the structure of the neural network 601 includes the operations performed at each layer of the CNN (such as convolution, pooling, and upsampling), the activation function applied at each layer, the number of layers, the number of neurons 603, the connection information between nodes, and the like. In this embodiment, the number of neurons 603 in a layer is referred to as the number of channels.

なお、本実施形態はあくまで一例であり、ニューラルネットワークはＣＮＮに限定されるものではない。例えば、ＧＡＮ（ＧｅｎｅｒａｔｉｖｅＡｄｖｅｒｓａｒｉａｌＮｅｔｗｏｒｋ）などを用いてもよいし、スキップコネクションなどを有してもよいし、ＲＮＮ（ＲｅｃｕｒｒｅｎｔＮｅｕｒａｌＮｅｔｗｏｒｋ）などのように再帰型であってもよい。また、図４は概念図であるため、隠れ層やチャネルを少なく表現しているが、これに限定されるものではない。例えば、入力層６０５、隠れ層６０６、出力層６０７はこれに限らずに、各層のチャネルを変えてもよいし、隠れ層６０６の数を増やしてもよい。 Note that this embodiment is merely an example, and the neural network is not limited to CNN. For example, a Generative Adversarial Network (GAN) may be used, a skip connection may be included, or a recurrent type such as a Recurrent Neural Network (RNN) may be used. Also, since FIG. 4 is a conceptual diagram, the number of hidden layers and channels is small, but this is not limiting. For example, the input layer 605, hidden layer 606, and output layer 607 are not limited to these, and the channels of each layer may be changed, or the number of hidden layers 606 may be increased.

次に、ステップＳ３０２では、ＣＰＵ１０７は、ステップＳ３０１でＧＰＵ１０８に展開されたニューラルネットワークのパラメータを初期化する。ここで、ニューラルネットワークのパラメータは、式（１）、式（２）で用いられる重みと、バイアスと、活性化関数のパラメータである。例えば、ニューラルネットワークのパラメータの初期化では、Ｈｅ初期化の手法を用いて、ＦＬＯＡＴ３２のビット精度でパラメータの初期値を設定する。 Next, in step S302, the CPU 107 initializes the parameters of the neural network deployed in the GPU 108 in step S301. Here, the parameters of the neural network are the weights, biases, and activation function parameters used in equations (1) and (2). For example, in initializing the parameters of the neural network, the He initialization method is used to set the initial values of the parameters with FLOAT32 bit accuracy.

なお、本実施形態は、あくまで一例であり、ニューラルネットワークパラメータの初期化の手法は、Ｈｅ初期化に限られず、それ以外の手法を用いてもよい。また、ニューラルネットワークのパラメータのビット精度は、ＦＬＯＡＴ３２に限らず、ＦＬＯＡＴ６４やＦＬＯＡＴ１６などのデータ形式であってもよい。 Note that this embodiment is merely an example, and the method for initializing the neural network parameters is not limited to He initialization, and other methods may be used. Furthermore, the bit precision of the neural network parameters is not limited to FLOAT32, and may be a data format such as FLOAT64 or FLOAT16.
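A minimal sketch of He initialization with single-precision storage is given below. It assumes the normal-distribution variant with standard deviation sqrt(2 / fan_in) and zero-initialized biases, neither of which is specified in the text; the layer shape, the seed, and the function name are hypothetical.

```python
import math
import random
from array import array


def he_normal(fan_in, count, seed=0):
    """Draw `count` weights from N(0, 2 / fan_in) and store them as 32-bit floats."""
    rng = random.Random(seed)
    std = math.sqrt(2.0 / fan_in)
    # Type code "f" stores each value in single precision (4 bytes).
    return array("f", (rng.gauss(0.0, std) for _ in range(count)))


fan_in = 3 * 3 * 4                   # 3x3 filter over 4 input channels (illustrative)
weights = he_normal(fan_in, count=10000)
biases = array("f", [0.0] * 16)      # zero biases are an assumption

# The empirical spread should be close to sqrt(2 / fan_in).
sample_std = math.sqrt(sum(w * w for w in weights) / len(weights))
```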

ステップＳ３０３では、ＣＰＵ１０７は、ステップＳ３０１で作成したニューラルネットワークの構造に関するデータと、ステップＳ３０２で設定したニューラルネットワークのパラメータとを、ＧＰＵ１０８に展開する。 In step S303, the CPU 107 loads data regarding the structure of the neural network created in step S301 and the parameters of the neural network set in step S302 onto the GPU 108.

ここで、ニューラルネットワークの構造に関するデータとパラメータは、ＧＰＵ１０８に限らず、ストレージ装置１０１やＲＯＭ１０４やＲＡＭ１０６に一時的に保存してもよい。また、ニューラルネットワークの構造に関するデータとパラメータは、ＰｒｏｔｏｃｏｌＢｕｆｆｅｒｓのデータ形式のファイルを用いて同一ファイルに保存してもよいし、個別に保存してもよい。また、保存するファイルのデータ形式は、ＰｒｏｔｏｃｏｌＢｕｆｆｅｒｓに限定されず、ＨＤＦ（ＨｉｅｒａｒｃｈｉｃａｌＤａｔａＦｏｒｍａｔ）など他のデータ形式でもよい。 Here, the data and parameters related to the structure of the neural network may be temporarily stored not only in the GPU 108 but also in the storage device 101, the ROM 104, or the RAM 106. The data and parameters related to the structure of the neural network may be stored in the same file using a file in the Protocol Buffers data format, or may be stored separately. The data format of the saved file is not limited to Protocol Buffers, and may be another data format such as HDF (Hierarchical Data Format).

Returning to FIG. 2, in step S202, the CPU 107 acquires a learning image from the storage device 101 and loads it into the RAM 106. Here, a learning image is the image from which the training image (the input image when the neural network is trained) and the correct answer image are derived. In this embodiment, the description assumes that the learning image is a Bayer-array RAW image with R, Gr, Gb, and B color channels and 14-bit pixel values ranging from 0 to 16383. The process of converting a Bayer-array RAW image into a color image is called debayering. The learning image is not limited to a Bayer-array RAW image; it may be a debayered image in a color space such as RGB, XYZ, L*a*b*, or HSV, or an image with luminance and color-difference signals such as YUV, YCbCr, or YPbPr. The learning images may be acquired and loaded into the RAM 106 one at a time, or several may be acquired and loaded together.

ステップＳ２０３では、ＣＰＵ１０７は、画像処理部１０５を用いて、ＲＡＭ１０６に展開された学習用画像から正解画像と訓練画像を作成し、ＲＡＭ１０６に展開する。正解画像とは学習時にニューラルネットワークの出力画像の期待値となる画像である。訓練画像とは学習時にニューラルネットワークに入力される画像である。例えば、ノイズ除去を目的とするニューラルネットワークの場合、ノイズの非常に少ない画像を正解画像とする。本実施形態では、正解画像は学習用画像に対して前処理を行うことで作成する。前処理の例として、ガンマ補正やホワイトバランス補正などの画像の補正処理を行う。 In step S203, the CPU 107 uses the image processing unit 105 to create a correct answer image and a training image from the learning image expanded in the RAM 106, and expands them in the RAM 106. The correct answer image is an image that is the expected value of the output image of the neural network during learning. The training image is an image that is input to the neural network during learning. For example, in the case of a neural network intended for noise removal, an image with very little noise is considered to be the correct answer image. In this embodiment, the correct answer image is created by performing preprocessing on the learning image. Examples of preprocessing include image correction processing such as gamma correction and white balance correction.

Next, the training image is the image input to the neural network during learning. For example, for a neural network intended for noise removal, an image containing more noise than the correct answer image is used as the training image. In this embodiment, the training image is created by applying degradation processing and preprocessing to the learning image. As an example of degradation processing, a noise pattern is added to the image. As preprocessing for the training image, correction processing such as gamma correction and white balance correction is performed when the neural network is expected to receive images that have already been corrected. To give the neural network a correction function in addition to its image restoration function, preprocessing may instead be applied only to the correct answer image. The preprocessing used to create the correct answer image may also differ from that used to create the training image. Alternatively, the learning image may be used as the correct answer image without preprocessing, or the learning image with only degradation processing applied may be used as the training image.

なお、前処理として、ガンマ補正やホワイトバランス補正を例に挙げたが、前処理はこれらに限定されるものではない。また、劣化処理はノイズパターンの付与に限らず、ニューラルネットワークが実現する機能の目的に応じて、画像の縮小処理やぼかし処理などを行ってもよい。 Note that, although gamma correction and white balance correction have been given as examples of pre-processing, pre-processing is not limited to these. Furthermore, degradation processing is not limited to adding noise patterns, and image reduction processing or blurring processing may also be performed depending on the purpose of the function realized by the neural network.
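The creation of a correct answer image and a training image from one learning image can be sketched as follows. The sketch assumes additive Gaussian noise as the degradation and a simple power-law gamma correction as the preprocessing applied to both images; the noise level, the gamma value, and all function names are illustrative assumptions.

```python
import random

MAX_VALUE = 16383  # 14-bit RAW range stated in the description


def gamma_correct(pixels, gamma=2.2):
    """Preprocessing: power-law gamma correction (gamma = 2.2 is assumed)."""
    return [MAX_VALUE * (p / MAX_VALUE) ** (1.0 / gamma) for p in pixels]


def add_gaussian_noise(pixels, sigma, rng):
    """Degradation: add Gaussian noise and clamp to the valid pixel range."""
    return [min(MAX_VALUE, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def make_pair(learning_pixels, sigma=50.0, seed=0):
    """Return (correct answer image, training image) from one learning image."""
    rng = random.Random(seed)
    correct = gamma_correct(learning_pixels)
    training = gamma_correct(add_gaussian_noise(learning_pixels, sigma, rng))
    return correct, training


learning = [0, 1000, 8000, 16383]  # a tiny flattened learning image
correct, training = make_pair(learning)
```

Applying the degradation before the correction keeps the noise in the RAW domain, which is one possible ordering; the description does not fix the order.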

本実施形態では、ステップＳ２０２で取得した学習用画像に対して、ステップＳ２０３で正解画像と訓練画像を作成する手法を例として挙げたが、あらかじめ学習用画像に前処理を行った正解画像と劣化処理と前処理を行った訓練画像をストレージ装置１０１に保存しておいてもよい。また、同一の撮像装置、画角、被写体において、劣化が生じないあるいは劣化が軽微な条件の画像と、劣化が大きく生じる条件の画像とを撮影し、それぞれを正解画像と訓練画像としてもよい。 In this embodiment, a method of creating a correct image and a training image in step S203 from a learning image acquired in step S202 has been given as an example, but a correct image obtained by preprocessing the learning image and a training image obtained by degrading and preprocessing may be stored in the storage device 101. Also, with the same imaging device, angle of view, and subject, an image under conditions where no degradation or only slight degradation occurs and an image under conditions where significant degradation occurs may be captured, and these may be used as the correct image and the training image, respectively.

ステップＳ２０４では、ＣＰＵ１０７は、ステップＳ２０３でＲＡＭ１０６に展開された訓練画像を、ステップＳ２０１またはステップＳ２０９でＧＰＵ１０８に展開されたニューラルネットワークに入力する。そして、さらにニューラルネットワークの処理を行い、出力画像をＲＡＭ１０６に展開する。ここで、訓練画像をニューラルネットワークの入力画像と定義する。ステップＳ２０４で行われるニューラルネットワークの処理に関して、図５のフローチャートを用いて説明する。 In step S204, the CPU 107 inputs the training image expanded in the RAM 106 in step S203 to the neural network expanded in the GPU 108 in step S201 or step S209. Then, the CPU 107 performs further neural network processing and expands the output image in the RAM 106. Here, the training image is defined as an input image for the neural network. The neural network processing performed in step S204 will be described using the flowchart in FIG. 5.

まず、ステップＳ４０１では、ＣＰＵ１０７は、ステップＳ２０１またはステップＳ２０９で保存されたニューラルネットワークの構造に関するデータとパラメータを取得し、ニューラルネットワークをＧＰＵ１０８に展開する。ここで、ステップＳ２０１またはステップＳ２０９で既にニューラルネットワークをＧＰＵ１０８に展開している場合、この処理は省略する。 First, in step S401, the CPU 107 acquires the data and parameters related to the neural network structure stored in step S201 or step S209, and deploys the neural network on the GPU 108. Here, if the neural network has already been deployed on the GPU 108 in step S201 or step S209, this process is omitted.

本実施形態では、図４のように入力画像６０８をニューラルネットワーク６０１の入力ノード６０２に入力し、ニューラルネットワーク６０１で演算した結果をニューラルネットワーク６０１の出力ノード６０４から出力画像６０９として出力する例を説明する。また、本実施形態では、ニューラルネットワークはＵ－Ｎｅｔである場合を例に説明する。ここで、Ｕ－Ｎｅｔは、畳み込み演算、プーリング演算、アップサンプリング演算、および活性化関数の演算などを行うＣＮＮの一種である。なお、Ｕ－Ｎｅｔで行う演算については、後述する。 In this embodiment, an example will be described in which an input image 608 is input to an input node 602 of a neural network 601 as shown in FIG. 4, and the result of calculation in the neural network 601 is output as an output image 609 from an output node 604 of the neural network 601. Also, in this embodiment, an example will be described in which the neural network is a U-Net. Here, a U-Net is a type of CNN that performs convolution calculations, pooling calculations, upsampling calculations, activation function calculations, and the like. The calculations performed by a U-Net will be described later.

ステップＳ４０２では、ＣＰＵ１０７は、ステップＳ２０３でＲＡＭ１０６に展開されたニューラルネットワークの入力画像をステップＳ４０１でＧＰＵ１０８に展開されたニューラルネットワークに入力する。本実施形態では、図６のようにベイヤー配列のＲＡＷ画像であるニューラルネットワークの入力画像からＲ、Ｇｒ、Ｇｂ、Ｂのカラーチャネルに分離した画像を、図４のニューラルネットワーク６０１に入力する入力画像６０８とする。ここで、図４を例に説明すると、入力画像６０８のデータを入力ノード６０２に入力し、さらに、入力層６０５のニューロンに入力する。 In step S402, the CPU 107 inputs the input image of the neural network expanded in the RAM 106 in step S203 to the neural network expanded in the GPU 108 in step S401. In this embodiment, an image separated into R, Gr, Gb, and B color channels from the input image of the neural network, which is a RAW image with a Bayer array as shown in FIG. 6, is used as the input image 608 to be input to the neural network 601 in FIG. 4. Here, taking FIG. 4 as an example, the data of the input image 608 is input to the input node 602, and further input to the neurons of the input layer 605.
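The separation of a Bayer-array RAW image into four color-channel planes can be sketched as below. The sketch assumes an RGGB layout (R at even row and even column, Gr at even row and odd column, Gb at odd row and even column, B at odd row and odd column); the actual layout of FIG. 6 is not reproduced in this text, and the function name is hypothetical.

```python
def split_bayer(raw):
    """Split an H x W Bayer mosaic (H and W even) into four (H/2) x (W/2) planes."""
    return {
        "R":  [row[0::2] for row in raw[0::2]],  # even rows, even columns
        "Gr": [row[1::2] for row in raw[0::2]],  # even rows, odd columns
        "Gb": [row[0::2] for row in raw[1::2]],  # odd rows, even columns
        "B":  [row[1::2] for row in raw[1::2]],  # odd rows, odd columns
    }


raw = [
    [10, 20, 11, 21],
    [30, 40, 31, 41],
    [12, 22, 13, 23],
    [32, 42, 33, 43],
]
planes = split_bayer(raw)
```

Each plane has half the height and width of the mosaic, so the network input has four channels.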

ステップＳ４０３では、ＧＰＵ１０８は、入力層６０５に入力されたデータに対してニューラルネットワーク６０１の各ニューロンで演算を行い、出力層６０７から出力データを出力する。 In step S403, the GPU 108 performs calculations on the data input to the input layer 605 in each neuron of the neural network 601, and outputs output data from the output layer 607.

ここで、ニューラルネットワーク６０１の演算の例として、縦３画素×横３画素のフィルタを用いた畳み込み演算について図７を用いて説明する。畳み込み演算とは、図７において、まず、入力データ８０１の縦３画素×横３画素のフィルタ８０２の領域の９画素に対して、重み８０３を用いた積和演算を行い、バイアス８０４を加算する計算のことを指すものとする。畳み込み演算の例を式（３）に示す。 As an example of the computation of the neural network 601, a convolution computation using a filter of 3 pixels vertical by 3 pixels horizontal will be described with reference to FIG. 7. In FIG. 7, the convolution computation refers to a calculation in which a product-sum computation is first performed on 9 pixels in an area of a filter 802 of 3 pixels vertical by 3 pixels horizontal of the input data 801 using weights 803, and then a bias 804 is added. An example of the convolution computation is shown in equation (3).
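Equation (3) is not reproduced in this text. A reconstruction consistent with the symbol definitions that follow, assuming the filter indices i and j run from -1 to 1 around the weight W(0, 0), would be:

```latex
Z(x, y) = \sum_{i=-1}^{1} \sum_{j=-1}^{1} W(i, j)\, X(x + i,\, y + j) + b \qquad (3)
```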

ここで、式（３）のＺ（ｘ，ｙ）は座標位置（ｘ，ｙ）における畳み込み演算結果８０５、Ｘは入力データ８０１の画素値、Ｗは重み８０３、ｂはバイアス８０４である。 Here, Z(x, y) in formula (3) is the convolution operation result 805 at the coordinate position (x, y), X is the pixel value of the input data 801, W is the weight 803, and b is the bias 804.

また、図７を例に説明すると、まず、入力データ８０１の注目画素Ｘ（ｘ，ｙ）を中心とした縦３画素×横３画素の９画素に対して、Ｗ（０，０）を中心とした縦３画素×横３画素の重み８０３をもつフィルタ８０２を用いて積和演算を行う。次に、積和演算結果にさらに、バイアス８０４を加えた結果を入力データ８０１の注目画素Ｘ（ｘ，ｙ）に対する畳み込み演算結果８０５とする。 Using Figure 7 as an example, first, a product-sum operation is performed on 9 pixels (3 pixels vertical x 3 pixels horizontal) centered on the pixel of interest X(x,y) of the input data 801 using a filter 802 with weights 803 (3 pixels vertical x 3 pixels horizontal) centered on W(0,0). Next, a bias 804 is further added to the product-sum operation result, and the result is used as the convolution operation result 805 for the pixel of interest X(x,y) of the input data 801.

次に、注目画素Ｘ（ｘ，ｙ）に対する畳み込み演算結果８０５を活性化関数８０６に入力して、ニューロンの出力データを出力する。また、活性化関数８０６には、ＬｅａｋｙＲｅＬＵ関数を用いる。ＬｅａｋｙＲｅＬＵ関数の式を式（４）に示す。 Next, the convolution operation result 805 for the pixel of interest X(x, y) is input to the activation function 806, and the neuron output data is output. In addition, the Leaky ReLU function is used for the activation function 806. The formula for the Leaky ReLU function is shown in formula (4).

f(Z(x, y)) = max(Z(x, y), Z(x, y) × a) ... (4)
Here, f(Z(x, y)) in equation (4) is the result of the activation function at coordinates (x, y), Z(x, y) is the convolution operation result 805 of equation (3), a is the coefficient of the Leaky ReLU function, and max is a function that outputs the largest of its arguments. The computation of equations (3) and (4) is then performed for every pixel of the input data 801, and the result is the output data 807.
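The 3 × 3 convolution followed by the Leaky ReLU of equation (4) can be sketched for a single-channel image as follows. Zero padding at the image border, the coefficient a = 0.01, and the function names are assumptions; the description does not state how borders are handled.

```python
def leaky_relu(z, a=0.01):
    """Equation (4): f(Z) = max(Z, Z * a)."""
    return max(z, z * a)


def conv3x3_leaky(image, weights, bias, a=0.01):
    """Apply a 3x3 filter plus bias, then Leaky ReLU, at every pixel."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            z = bias
            for j in (-1, 0, 1):          # vertical offset
                for i in (-1, 0, 1):      # horizontal offset
                    yy, xx = y + j, x + i
                    # Pixels outside the image contribute zero (assumed padding).
                    if 0 <= yy < height and 0 <= xx < width:
                        z += weights[j + 1][i + 1] * image[yy][xx]
            out[y][x] = leaky_relu(z, a)
    return out


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
box = [[1.0] * 3 for _ in range(3)]              # all-ones filter for illustration
result_pos = conv3x3_leaky(image, box, bias=0.0)
result_neg = conv3x3_leaky(image, box, bias=-50.0)
```

With the all-ones filter, the center pixel sums all nine inputs, so a large negative bias shows the scaled negative branch of the activation.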

以上のように、ニューラルネットワーク６０１の各ニューロンで畳み込み演算、プーリング演算、アップサンプリング演算、および活性化関数の演算などの演算が行われ、出力層６０７から出力データが出力される。 As described above, each neuron in the neural network 601 performs operations such as convolution, pooling, upsampling, and activation function calculations, and output data is output from the output layer 607.

なお、本実施形態では、Ｕ－Ｎｅｔを例として、ニューラルネットワークの演算について説明したが、これに限らず、他の演算を行ってもよい。また、畳み込み演算のフィルタ８０２は縦３画素、横３画素とは異なる大きさとしてもよい。また、活性化関数８０６はＬｅａｋｙＲｅＬＵ関数以外のものでもよい。 In this embodiment, the neural network calculations have been described using U-Net as an example, but other calculations may be performed without being limited to this. Also, the filter 802 for the convolution calculation may have a size other than 3 pixels vertically and 3 pixels horizontally. Also, the activation function 806 may be something other than the Leaky ReLU function.

Next, in step S404, the CPU 107 acquires the output data of the computation performed by the neural network 601 and loads the output image of the neural network into the RAM 106. In this embodiment, as shown in FIG. 8, an example is described in which the neural network 601 of FIG. 4 outputs the output image 609, which is separated into the R, Gr, Gb, and B color channels, and the output image 609 is then converted back into a Bayer-array RAW image.

ここで、図４を例に説明すると、出力層６０７のニューロンの出力データは出力ノード６０４に出力され、さらに、出力画像６０９が出力される。そして、出力画像６０９をベイヤー配列のＲＡＷ画像に戻した画像を、ニューラルネットワークの出力画像としてＲＡＭ１０６に展開する。なお、本実施形態では、ニューラルネットワークへ入力する訓練画像を、ベイヤー配列のＲＡＷ画像をＲ、Ｇｒ、Ｇｂ、Ｂのカラーチャネルに分離した画像にしている。しかし、本実施形態はあくまで一例であり、カラーチャネルに分離しなくてもよいし、複数枚まとめて入力してもよい。さらに、本実施形態では、ニューラルネットワーク６０１の出力は出力層６０７のニューロンからの出力のみとしているが、これに加えて、隠れ層６０６のニューロンの出力を特徴マップとして出力し、ＲＡＭ１０６に展開してもよい。 Here, taking FIG. 4 as an example, the output data of the neurons in the output layer 607 is output to the output node 604, and further, the output image 609 is output. Then, the image obtained by converting the output image 609 back into a RAW image in a Bayer array is expanded in the RAM 106 as the output image of the neural network. Note that in this embodiment, the training image input to the neural network is an image in which the RAW image in the Bayer array is separated into the color channels R, Gr, Gb, and B. However, this embodiment is merely an example, and it is not necessary to separate into color channels, and multiple images may be input together. Furthermore, in this embodiment, the output of the neural network 601 is only the output from the neurons in the output layer 607, but in addition to this, the output of the neurons in the hidden layer 606 may be output as a feature map and expanded in the RAM 106.
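Converting the four output planes back into a Bayer-array RAW image can be sketched as below. As with the channel separation, an RGGB layout is assumed because FIG. 8 is not reproduced in this text; the function name and sample values are hypothetical.

```python
def merge_bayer(planes):
    """Interleave R, Gr, Gb, and B planes into one Bayer mosaic (RGGB assumed)."""
    half_h = len(planes["R"])
    half_w = len(planes["R"][0])
    raw = [[0] * (2 * half_w) for _ in range(2 * half_h)]
    for y in range(half_h):
        for x in range(half_w):
            raw[2 * y][2 * x] = planes["R"][y][x]
            raw[2 * y][2 * x + 1] = planes["Gr"][y][x]
            raw[2 * y + 1][2 * x] = planes["Gb"][y][x]
            raw[2 * y + 1][2 * x + 1] = planes["B"][y][x]
    return raw


planes = {
    "R":  [[10, 11], [12, 13]],
    "Gr": [[20, 21], [22, 23]],
    "Gb": [[30, 31], [32, 33]],
    "B":  [[40, 41], [42, 43]],
}
restored = merge_bayer(planes)
```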

図２の説明に戻り、ステップＳ２０５では、画像処理部１０５は、ステップＳ２０４でＲＡＭ１０６に展開された出力画像に画像処理を行い、画像処理の行われた出力画像をＲＡＭ１０６に展開する。例えば、出力画像に行う画像処理としては、ニューラルネットワークの出力画像に対して補正処理を行った画像を使用することを前提とする場合に、ガンマ補正やホワイトバランス補正などの補正処理を行う。ここで、補正処理としてガンマ補正やホワイトバランス補正を例に挙げたが、補正処理はこれらに限定されるものではない。 Returning to the explanation of FIG. 2, in step S205, the image processing unit 105 performs image processing on the output image expanded in the RAM 106 in step S204, and expands the processed output image in the RAM 106. For example, the image processing performed on the output image includes correction processing such as gamma correction and white balance correction, assuming that an image that has been subjected to correction processing on the output image of a neural network is to be used. Here, gamma correction and white balance correction are given as examples of correction processing, but the correction processing is not limited to these.

ステップＳ２０６では、ＣＰＵ１０７は、ステップＳ２０５でＲＡＭ１０６に展開された出力画像と、ステップＳ２０３でＲＡＭ１０６に展開された正解画像の誤差を算出する。本実施形態では、出力画像と出力画像に対応する正解画像の画素値を損失関数に入力し、損失関数の出力結果を誤差とする。損失関数の例として平均二乗誤差を求める式を式（５）に示す。 In step S206, the CPU 107 calculates the error between the output image expanded in the RAM 106 in step S205 and the correct image expanded in the RAM 106 in step S203. In this embodiment, the pixel values of the output image and the correct image corresponding to the output image are input to a loss function, and the output result of the loss function is regarded as the error. As an example of a loss function, an equation for calculating the mean squared error is shown in equation (5).

ここで、式（５）のＭは平均二乗誤差、ｍは画像の水平の画素数、ｎは画像の垂直の画素数、Ｙ（ｘ，ｙ）は出力画像の注目画素の画素値、Ｊ（ｘ，ｙ）は正解画像の注目画素の画素値である。なお、損失関数は式（５）に限らない。また、損失関数は平均絶対誤差や交差エントロピー誤差などの他の関数を用いてもよい。 In equation (5), M is the mean square error, m is the number of horizontal pixels in the image, n is the number of vertical pixels in the image, Y(x, y) is the pixel value of the pixel of interest in the output image, and J(x, y) is the pixel value of the pixel of interest in the correct image. Note that the loss function is not limited to equation (5). Other functions such as mean absolute error and cross entropy error may also be used as the loss function.
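Equation (5) itself is not reproduced in this text. A minimal sketch of a standard mean squared error that is consistent with the symbol definitions above (m horizontal pixels, n vertical pixels, output image Y, correct image J) might look like the following. The function name and the list-of-rows image representation are assumptions for illustration.

```python
def mean_squared_error(output, target):
    """Mean squared error between two equally sized images (lists of rows).

    Assumed to correspond to equation (5): the squared pixel differences
    are summed over the whole image and divided by m * n.
    """
    n = len(output)       # number of vertical pixels
    m = len(output[0])    # number of horizontal pixels
    total = 0.0
    for y in range(n):
        for x in range(m):
            diff = output[y][x] - target[y][x]
            total += diff * diff
    return total / (m * n)
```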

ステップＳ２０７では、ＣＰＵ１０７は、ステップＳ２０４でＲＡＭ１０６に展開されている出力画像を、画素に関する評価値の算出に使用するか否かを判定する。例えば、ステップＳ２０６で算出された誤差を判定基準として、誤差が一定の値より小さい場合はステップＳ２０８に進み、画素に関する評価値を算出することを選択する。一方で誤差が一定の値より大きい場合はステップＳ２０９に進み、画素に関する評価値を算出しない。なお、判定基準はこれに限られるものではなく、学習回数やＳＳＩＭ（ＳｔｒｕｃｔｕｒａｌＳｉｍｉｌａｒｉｔｙＩｎｄｅｘＭｅａｓｕｒｅ）などの評価値を用いてもよい。 In step S207, the CPU 107 determines whether or not to use the output image expanded in the RAM 106 in step S204 to calculate an evaluation value for a pixel. For example, using the error calculated in step S206 as a criterion for determination, if the error is smaller than a certain value, the process proceeds to step S208, and a selection is made to calculate an evaluation value for the pixel. On the other hand, if the error is larger than the certain value, the process proceeds to step S209, and an evaluation value for the pixel is not calculated. Note that the criterion for determination is not limited to this, and evaluation values such as the number of learnings or SSIM (Structural Similarity Index Measure) may also be used.

ステップＳ２０８では、ＣＰＵ１０７は、画像処理部１０５を用いて、ステップＳ２０４でＲＡＭ１０６に展開された出力画像の画素に関する評価値を算出する。画素に関する評価値とは出力画像の画素値から算出した統計分布である。本実施形態では、画素に関する評価値を、ＲＡＷ画像の画素値の出現回数から平均ヒストグラムを逐次算出して得る場合を例に説明する。また、ステップＳ２０４で出力画像を１枚取得した場合を例に説明する。 In step S208, the CPU 107 uses the image processing unit 105 to calculate evaluation values for the pixels of the output image expanded in the RAM 106 in step S204. The evaluation values for the pixels are statistical distributions calculated from the pixel values of the output image. In this embodiment, an example will be described in which the evaluation values for the pixels are obtained by sequentially calculating an average histogram from the number of occurrences of pixel values in the RAW image. Also, an example will be described in which one output image is obtained in step S204.

まず、出力画像の１４ビットの取りうる値である０～１６３８３の画素値ごとに出現回数をカウントする。次に、ＲＡＭ１０６に展開されているこれまでの計算に使用した出力画像の枚数と現状算出されている平均ヒストグラムを取得する。ここで、出力画像が１枚目の場合は、出力画像の枚数と平均ヒストグラムの全ての画素値の出現回数は０とする。次に、０～１６３８３の画素値ごとに、式（６）を用いて逐次計算することで平均ヒストグラムを算出する。逐次計算の式の例を式（６）に示す。 First, the number of occurrences is counted for each pixel value from 0 to 16383, which are the possible 14-bit values of the output image. Next, the number of output images used in the calculations up to now and the currently calculated average histogram are obtained, which are expanded in RAM 106. Here, if the output image is the first one, the number of output images and the number of occurrences of all pixel values in the average histogram are set to 0. Next, the average histogram is calculated by sequentially calculating each pixel value from 0 to 16383 using equation (6). An example of the equation for sequential calculation is shown in equation (6).

$$X_{n+1} = \frac{n \times X_n + x_{n+1}}{n + 1} \quad \cdots (6)$$
ここで、式（６）のｎは計算時までに平均値算出に使用した合計画像枚数、$X_n$はｎ番目の平均値、$X_{n+1}$はｎ＋１番目の平均値、$x_{n+1}$はｎ＋１番目に使用した画像における画素値の出現回数を表す。 Here, n in equation (6) is the total number of images used to calculate the average up to the time of calculation, $X_n$ is the n-th average value, $X_{n+1}$ is the (n+1)-th average value, and $x_{n+1}$ is the number of occurrences of the pixel value in the (n+1)-th image used.

最後に、これまでの計算に使用した出力画像の枚数と現状算出されている平均ヒストグラムをＲＡＭ１０６に展開する。ここで、画素に関する評価値は平均ヒストグラムに限定せず、ディベイヤー処理を行った画像から画像の定量的な評価値である明度、彩度、コントラスト、シャープネスを算出し、出力画像の各画像の評価値から統計分布を算出してもよい。また、画素に関する評価値は、ＲＡＷ画像の全ての画素値から算出してもよいし、図６のようにＲＡＷ画像をＲ、Ｇｒ、Ｇｂ、Ｂのカラーチャネルに分離してカラーチャネルごとに算出してもよい。また、ステップＳ２０４にてＲＡＭ１０６に展開されている特徴マップから画素に関する評価値を算出してもよい。 Finally, the number of output images used in the calculations up to this point and the currently calculated average histogram are loaded into RAM 106. Here, the evaluation value for a pixel is not limited to the average histogram, and quantitative evaluation values of the image such as brightness, saturation, contrast, and sharpness may be calculated from the image that has been de-Bayered, and a statistical distribution may be calculated from the evaluation values of each image in the output images. Furthermore, the evaluation value for a pixel may be calculated from all pixel values of the RAW image, or may be calculated for each color channel by separating the RAW image into the R, Gr, Gb, and B color channels as shown in FIG. 6. Furthermore, the evaluation value for a pixel may be calculated from the feature map loaded into RAM 106 in step S204.
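The sequential averaging of equation (6) can be sketched as below. The function names are hypothetical, and a flat list of pixel values stands in for the output image. The default of 16384 bins follows the 14-bit range given in the text.

```python
def count_histogram(pixels, num_bins=16384):
    """Count how often each pixel value (0 .. num_bins - 1) occurs."""
    hist = [0] * num_bins
    for value in pixels:
        hist[value] += 1
    return hist


def update_average_histogram(avg_hist, n, new_hist):
    """Apply X_{n+1} = (n * X_n + x_{n+1}) / (n + 1) to every bin.

    avg_hist: average histogram over the n images processed so far
    new_hist: occurrence counts of the (n + 1)-th image
    Returns the updated average histogram and the new image count.
    """
    updated = [(n * a + x) / (n + 1) for a, x in zip(avg_hist, new_hist)]
    return updated, n + 1
```

Starting from an all-zero histogram with n = 0, the first update simply returns the counts of the first image, which matches the initialization described in the text. Only the running average and the image count need to be kept between updates, not the individual histograms.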

ステップＳ２０９では、ＣＰＵ１０７は、ステップＳ２０６で算出された誤差に基づいて、ＧＰＵ１０８に展開されているニューラルネットワークのパラメータを更新する。例えば、確率的勾配降下法などの最適化手法を用いて、ステップＳ２０６で算出された誤差が小さくなるようにパラメータを更新する。そして、ステップＳ２０２～Ｓ２０９の処理を繰り返し実行することで、ニューラルネットワークのパラメータが最適化される。なお、ニューラルネットワークのパラメータの最適化手法は確率的勾配降下法に限定されず、Ａｄａｍなど他の最適化手法でもよい。また、更新したニューラルネットワークのパラメータをストレージ装置１０１に一時的に保存してもよい。 In step S209, the CPU 107 updates the parameters of the neural network deployed in the GPU 108 based on the error calculated in step S206. For example, the parameters are updated using an optimization method such as stochastic gradient descent so as to reduce the error calculated in step S206. The processing of steps S202 to S209 is then repeated to optimize the parameters of the neural network. Note that the optimization method for the parameters of the neural network is not limited to stochastic gradient descent, and other optimization methods such as Adam may be used. The updated parameters of the neural network may also be temporarily stored in the storage device 101.
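As a rough illustration of the parameter update, a single plain gradient descent step could be written as follows. The function name, the flat parameter list, and the learning rate are assumptions; the text does not specify how gradients are computed or which hyperparameters are used.

```python
def sgd_step(params, grads, learning_rate=0.01):
    """Move each parameter against its gradient to reduce the error.

    params and grads are flat lists of equal length; the gradients are
    assumed to have been computed from the error of step S206.
    """
    return [p - learning_rate * g for p, g in zip(params, grads)]
```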

ステップＳ２１０では、ＣＰＵ１０７は、ニューラルネットワークの学習終了条件を満たすか否かを判定する。例えば、学習回数が既定の回数に達しているかを確認し、達している場合は、学習終了条件を満たしていると判断して、ステップＳ２１１へ進み学習を終了する。学習終了条件を満たしていない場合は、ステップＳ２０２へ戻り、学習を継続する。なお、学習終了条件はこれに限るものではない。例えば、ステップＳ２０６で算出された誤差が閾値以下となった場合に学習を終了してもよい。 In step S210, the CPU 107 determines whether the neural network learning end condition is met. For example, it checks whether the number of learning attempts has reached a preset number, and if so, it determines that the learning end condition is met, and proceeds to step S211 to end the learning. If the learning end condition is not met, the CPU 107 returns to step S202 and continues the learning. Note that the learning end condition is not limited to this. For example, the learning may be ended when the error calculated in step S206 falls below a threshold value.

ステップＳ２１１では、ＣＰＵ１０７は、学習が終了したニューラルネットワークのパラメータをストレージ装置１０１に保存する。また、パラメータの保存先はこれに限らず、ＲＯＭ１０４やＲＡＭ１０６を用いてもよい。 In step S211, the CPU 107 stores the parameters of the neural network for which learning has been completed in the storage device 101. The destination for storing the parameters is not limited to this, and the ROM 104 or the RAM 106 may also be used.

ステップＳ２１２では、ＣＰＵ１０７は、画像処理部１０５を用いて、ステップＳ２０８で算出された画素に関する評価値に基づいて量子化用画像を作成し、ＲＡＭ１０６に展開する。ここで、量子化用画像とは量子化時の入力画像の元となる画像のことである。例えば、ステップＳ２０８で画素に関する評価値として、図９のような出力画像の平均ヒストグラムを算出し、その結果に基づいて量子化用画像を１０枚作成する場合を例に図９を用いて説明する。 In step S212, the CPU 107 uses the image processing unit 105 to create an image for quantization based on the pixel evaluation value calculated in step S208, and loads it in the RAM 106. Here, the image for quantization refers to an image that is the source of the input image at the time of quantization. For example, an average histogram of the output image as shown in FIG. 9 is calculated as the pixel evaluation value in step S208, and an example of creating 10 images for quantization based on the result will be described with reference to FIG. 9.

ここで、図９は横軸を画素値、縦軸を画素数としたヒストグラムのグラフである。また、本実施形態において、出力画像は縦１２８×横１２８のサイズで、合計画素数が１６３８４である画像を例に説明する。 Here, FIG. 9 is a histogram graph with the horizontal axis representing pixel values and the vertical axis representing pixel counts. In this embodiment, an output image having a size of 128 vertical x 128 horizontal, and a total number of pixels of 16,384, will be described as an example.

まず、平均ヒストグラムとして、図９のような統計分布データから、画素値で昇順に並び替えた１６３８４個のデータを持つ一次元配列データを作成する。次に、平均ヒストグラムの一次元配列データから端のデータを除外する。例えば、データの開始位置から１９２個、データの終了位置から１９２個のデータを除外し、１６０００個のデータを持つ一次元配列データを作成する。次に、代表値の１個を求めるために使用するデータ数を算出する。ここで、代表値とは平均ヒストグラムを代表するような値でかつ、量子化用画像の作成する枚数と同じ個数のデータを持つ配列データである。そのため、本実施形態では代表値を１０個算出するので、代表値の１個を求めるために１６０００を１０で割った１６００個のデータを使用する。次に、式（７）を用いて、代表値を算出する。 First, one-dimensional array data having 16,384 pieces of data sorted in ascending order by pixel value is created as an average histogram from the statistical distribution data as shown in FIG. 9. Next, the data at the ends is excluded from the one-dimensional array data of the average histogram. For example, 192 pieces of data from the start position of the data and 192 pieces of data from the end position of the data are excluded to create one-dimensional array data having 16,000 pieces of data. Next, the number of pieces of data used to find one representative value is calculated. Here, a representative value is a value that represents the average histogram, and is array data having the same number of pieces of data as the number of images to be created for quantization. Therefore, in this embodiment, 10 representative values are calculated, so 1600 pieces of data, obtained by dividing 16,000 by 10, are used to find one representative value. Next, the representative value is calculated using formula (7).

ここで、式（７）のｄ［i］はインデックスｉにおける代表値の値、ｎは代表値の１個を求めるために使用するデータ数、Ｘは端のデータを除外した後の平均ヒストグラムの一次元配列データを表す。本実施形態では、ｎが１６００で、１６００個のデータ毎に平均値を算出し、１０個データを持つ代表値ｄを算出する。 Here, d[i] in formula (7) represents the value of the representative value at index i, n represents the number of data points used to find one representative value, and X represents the one-dimensional array data of the average histogram after excluding the data points at the ends. In this embodiment, n is 1600, and the average value is calculated for every 1600 data points, and a representative value d having 10 data points is calculated.
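Equation (7) is not reproduced in this text. The sketch below assumes it is the mean of each consecutive block of n data points, which is what the description above suggests. The function names are hypothetical, and the histogram is assumed to hold integer counts (how fractional average counts are turned into 16384 data points is not stated).

```python
def expand_histogram(hist):
    """Turn integer bin counts into an ascending list of pixel values."""
    values = []
    for pixel_value, count in enumerate(hist):
        values.extend([pixel_value] * count)
    return values


def representative_values(sorted_values, trim, num_images):
    """Drop `trim` values from each end, then average equal-sized blocks.

    Returns one representative value per image for quantization.
    """
    trimmed = sorted_values[trim:len(sorted_values) - trim]
    n = len(trimmed) // num_images   # data points per representative value
    return [sum(trimmed[i * n:(i + 1) * n]) / n for i in range(num_images)]
```

With the figures of this embodiment, `representative_values(values, trim=192, num_images=10)` on 16384 sorted values would average ten blocks of 1600 values each.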

次に、代表値に基づいて量子化用画像を作成する。例えば、量子化用画像は全ての画素の画素値を代表値にした画像を１０枚作成する。その結果、出力画像の平均ヒストグラムの特徴を少ない枚数で再現するような、出力画像の特徴を簡易的に表わす量子化用画像を作成することができる。ここで、出力画像の画素数と、端のデータとして除外するデータ数と、作成する量子化用画像の枚数は一意の値に限定されるものではない。また、平均ヒストグラムから除外するデータは、画素値が５００以下または１４０００以上のデータは除外するといったように、画素値の条件に基づいて決定してもよいし、除外するデータをなくしてもよい。また、代表値の算出方法は、本実施形態の方法に限定されず、０～５００の画素値から１個の代表値を算出するといったように、画素値の範囲を指定してその範囲の中から代表値を算出してもよい。また、代表値は、平均値だけではなく、中央値など、それ以外の方法で算出してもよい。また、ステップＳ２０８で画素に関する評価値をチャネルごとなど複数算出している場合、代表値はチャネルごとに算出してもよい。また、ＲＡＭ１０６から出力画像またはそれ以外の画像を取得し、代表値を、取得した画像の画素値の平均値で割った値を補正係数とし、補正係数を取得した画像に掛けることで、代表値と同等の値になるような画像を作成してもよい。また、代表値はチャネルごとに算出した場合、チャネルごとに上記の処理を行い、量子化用画像を作成してもよい。 Next, the images for quantization are created based on the representative values. For example, ten images for quantization are created, each with all of its pixels set to one of the representative values. As a result, images for quantization that simply represent the characteristics of the output images can be created, reproducing the characteristics of the average histogram of the output images with a small number of images. Here, the number of pixels of the output image, the number of data excluded as edge data, and the number of images for quantization to be created are not limited to the specific values given above. The data to be excluded from the average histogram may also be determined by a pixel-value condition, such as excluding data with pixel values of 500 or less or 14,000 or more, or no data may be excluded at all. The method of calculating the representative values is not limited to that of this embodiment; a range of pixel values may be specified and a representative value calculated from within that range, such as calculating one representative value from pixel values of 0 to 500. The representative value need not be the average and may be calculated by another method, such as the median. If multiple pixel evaluation values are calculated in step S208, such as one per channel, a representative value may be calculated for each channel. Alternatively, an output image or another image may be obtained from the RAM 106, a correction coefficient may be obtained by dividing the representative value by the average pixel value of the obtained image, and the obtained image may be multiplied by the correction coefficient to create an image whose values are equivalent to the representative value. If the representative values are calculated for each channel, the above processing may be performed for each channel to create the images for quantization.
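The two ways of building an image for quantization from a representative value, a uniform image and a scaled copy of an existing image, can be sketched as follows. The function names are hypothetical. The sketch assumes the source image has a non-zero mean and leaves out any rounding or clipping to the 14-bit range, which the text does not specify.

```python
def constant_image(value, height=128, width=128):
    """Image whose pixels all equal one representative value."""
    return [[value] * width for _ in range(height)]


def scale_to_representative(image, representative):
    """Scale an image so that its mean pixel value equals `representative`.

    The correction coefficient is the representative value divided by the
    mean pixel value of the image (assumed to be non-zero).
    """
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    coeff = representative / mean   # correction coefficient
    return [[p * coeff for p in row] for row in image]
```

Calling `constant_image` once per representative value gives one image for quantization per value. The scaled variant keeps the spatial structure of the source image while moving its mean to the representative value.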

次に、ステップＳ２１３では、ＣＰＵ１０７は、画像処理部１０５を用いて、ステップＳ２１２でＲＡＭ１０６に展開された量子化用画像に基づいて量子化のための入力画像を作成し、ＲＡＭ１０６に展開する。本実施形態では、量子化のための入力画像が、量子化用画像に対してニューラルネットワーク学習時の入力画像である訓練画像を作成する際と同様の劣化処理と前処理を行うことで作成される場合を例に説明する。まず、量子化用画像に対して、ステップＳ２０３の訓練画像を作成するために行う劣化処理と同様の処理を行う。次に、劣化処理を行った量子化用画像に対して、ステップＳ２０３の訓練画像を作成するために行う前処理と同様の処理を行う。ここで、ニューラルネットワークの学習時に訓練画像を作成するための前処理を行わない場合、量子化用画像に前処理を行わなくてもよい。 Next, in step S213, the CPU 107 uses the image processing unit 105 to create an input image for quantization based on the image for quantization expanded in the RAM 106 in step S212, and expands it in the RAM 106. In this embodiment, an example will be described in which the input image for quantization is created by performing degradation processing and preprocessing on the image for quantization similar to that performed when creating a training image, which is an input image during neural network training. First, the image for quantization is subjected to a process similar to the degradation processing performed to create the training image in step S203. Next, the image for quantization that has been subjected to degradation processing is subjected to a process similar to the preprocessing performed to create the training image in step S203. Here, if preprocessing is not performed to create a training image during neural network training, it is not necessary to perform preprocessing on the image for quantization.

ステップＳ２１４では、ＣＰＵ１０７は、ステップＳ２１１でストレージ装置１０１に保存されたニューラルネットワークのパラメータを取得し、ステップＳ２１３でＲＡＭ１０６に展開された量子化のための入力画像を使用して、パラメータの量子化を行う。 In step S214, the CPU 107 acquires the neural network parameters stored in the storage device 101 in step S211, and quantizes the parameters using the input image for quantization loaded into the RAM 106 in step S213.

ステップＳ２１４の処理に関して、学習後量子化（ＰｏｓｔＴｒａｉｎｉｎｇＱｕａｎｔｉｚａｔｉｏｎ）を静的量子化（ｓｔａｔｉｃｑｕａｎｔｉｚａｔｉｏｎ）で行う場合を例に図１０のフローチャートを用いて説明する。本実施形態では、量子化で、ニューラルネットワークの層を構成するパラメータである重み（ｗｅｉｇｈｔ）、バイアス（ｂｉａｓ）、活性化関数（ａｃｔｉｖａｔｉｏｎ）のビット精度をＦＬＯＡＴ３２からＩＮＴ８へ変換することを前提に説明する。 The processing of step S214 will be described with reference to the flowchart in FIG. 10, taking as an example a case where post-training quantization is performed as static quantization. In this embodiment, the description assumes that quantization converts the bit precision of the weights, biases, and activation functions, which are the parameters constituting the layers of the neural network, from FLOAT32 to INT8.

まず、ステップＳ５０１では、ＣＰＵ１０７は、量子化のための入力画像を学習済みのニューラルネットワークに入力する。ステップＳ５０２では、ＣＰＵ１０７は、ニューラルネットワークの層ごとに出力データの最大値と最小値を算出する。ステップＳ５０３では、ＣＰＵ１０７は、最大値と最小値に基づいて、ニューラルネットワークの層ごとの出力データのスケールとオフセットを算出する。ステップＳ５０４では、ＣＰＵ１０７は、スケールとオフセットに基づいて、重みとバイアスを量子化し、さらに活性化関数は、量子化後のビット精度で再現できる個数以下の線形関数に近似することで、量子化する。 First, in step S501, the CPU 107 inputs an input image for quantization to the trained neural network. In step S502, the CPU 107 calculates the maximum and minimum values of the output data for each layer of the neural network. In step S503, the CPU 107 calculates the scale and offset of the output data for each layer of the neural network based on the maximum and minimum values. In step S504, the CPU 107 quantizes the weights and biases based on the scale and offset, and quantizes the activation function by approximating it with a set of linear functions whose number does not exceed what can be represented at the bit precision after quantization.
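The text does not give formulas for the scale and offset. The sketch below uses the common affine (asymmetric) scheme for signed INT8 as an assumption, not as the method of this embodiment. All function names are hypothetical.

```python
def scale_and_offset(min_val, max_val, qmin=-128, qmax=127):
    """Derive an affine scale and offset from an observed value range.

    Assumption: a conventional affine mapping; the range is widened to
    include 0.0 so that zero maps exactly to an integer.
    """
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0   # degenerate range: avoid division by zero
    offset = round(qmin - min_val / scale)
    return scale, offset


def quantize(x, scale, offset, qmin=-128, qmax=127):
    """Map a float to the integer grid and clamp it to the INT8 range."""
    q = round(x / scale) + offset
    return max(qmin, min(qmax, q))


def dequantize(q, scale, offset):
    """Approximate inverse of quantize."""
    return (q - offset) * scale
```

Because the range comes from the minimum and maximum observed in step S502, the input images used for quantization directly determine the scale and offset. This is why images that reflect the output statistics matter here.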

ステップＳ５０５では、ＣＰＵ１０７は、ニューラルネットワークのパラメータを量子化した場合のニューラルネットワークの層ごとの出力データのスケールとオフセットを算出する。ステップＳ５０６では、ＣＰＵ１０７は、量子化前の場合のスケールとオフセットとの一致精度の評価を行い、量子化したニューラルネットワークのパラメータをＲＡＭ１０６に展開する。 In step S505, the CPU 107 calculates the scale and offset of the output data for each layer of the neural network when the neural network parameters are quantized. In step S506, the CPU 107 evaluates how closely this scale and offset match the scale and offset obtained before quantization, and loads the quantized neural network parameters into the RAM 106.

ステップＳ５０７では、ＣＰＵ１０７は、量子化の終了条件を満たすか否かを判定し、条件を満たさない場合、ステップＳ５０１に戻り、新たな量子化のための入力画像を用いて量子化を進める。一方、量子化の終了条件を満たす場合、ステップＳ５０８に進む。量子化の終了条件とは、量子化回数が規定値に達した場合でもよいし、量子化後の結果を確認し、ユーザの判断で終了してもよい。 In step S507, the CPU 107 determines whether the quantization end condition is satisfied. If the condition is not satisfied, the process returns to step S501, and the quantization continues using a new input image for quantization. On the other hand, if the quantization end condition is satisfied, the process proceeds to step S508. The quantization end condition may be when the number of quantizations reaches a specified value, or the quantization may be terminated at the user's discretion after checking the results after quantization.

最後に、ステップＳ５０８では、ＣＰＵ１０７は、ステップＳ５０６の結果に基づいて、層ごとにスケールとオフセットの一致精度が最も高い評価となったときの量子化されたパラメータに更新する。 Finally, in step S508, the CPU 107 updates the quantized parameters for each layer based on the results of step S506 to those for which the scale and offset matching accuracy is evaluated to be the highest.
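The selection in step S508 could be sketched as below for one layer. The text does not define how the matching accuracy is measured, so the sum of absolute differences used here is purely a hypothetical stand-in, as are the function name and the tuple layout.

```python
def pick_best_candidate(reference, candidates):
    """Return the quantized parameters whose scale and offset are closest
    to the values obtained before quantization.

    reference:  (scale, offset) measured before quantization
    candidates: list of (params, scale, offset), one per quantization pass
    The mismatch measure is an assumption made for this sketch.
    """
    def mismatch(candidate):
        _, scale, offset = candidate
        return abs(scale - reference[0]) + abs(offset - reference[1])

    return min(candidates, key=mismatch)[0]
```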

なお、本実施形態では、ＣＰＵ１０７で量子化を行うと説明したが、ＧＰＵ１０８で量子化を行ってもよい。また、学習後量子化を静的量子化で行う場合を例に説明したが、量子化の方法はそれに限定されるものではなく、量子化を考慮した学習（ＱｕａｎｔｉｚａｔｉｏｎＡｗａｒｅＴｒａｉｎｉｎｇ）などでもよい。また、量子化は、重み、バイアス、活性化関数のいずれか１つ以上に対して行ってもよい。また、量子化するビット精度は、ＩＮＴ８に限らず、ＦＬＯＡＴ１６、ＩＮＴ１６、ＩＮＴ４など、ニューラルネットワークの学習時のビット精度より低いビット精度のデータ形式であればよい。 In this embodiment, the quantization is described as being performed by the CPU 107, but it may instead be performed by the GPU 108. Although post-training quantization performed as static quantization has been described as an example, the quantization method is not limited to this, and quantization-aware training (Quantization Aware Training) or the like may be used. Quantization may be applied to any one or more of the weights, biases, and activation functions. The bit precision after quantization is not limited to INT8, and may be any data format with a lower bit precision than that used when training the neural network, such as FLOAT16, INT16, or INT4.

図２の説明に戻り、ステップＳ２１５では、ＣＰＵ１０７は、ステップＳ２１４で量子化したニューラルネットワークのパラメータをストレージ装置１０１に記憶させる。本実施形態においては、ストレージ装置に記憶させる前提で説明しているが、その他の記憶媒体に記憶させてもよい。 Returning to the explanation of FIG. 2, in step S215, the CPU 107 stores the neural network parameters quantized in step S214 in the storage device 101. In this embodiment, the explanation assumes that the parameters are stored in the storage device, but they may be stored in another storage medium.

以上説明したように、本実施形態では、ニューラルネットワークの出力画像の画素に関する評価値に基づいて、量子化のための入力画像を作成する。これにより、ニューラルネットワークの出力画像の特徴を少ない枚数で精度良く再現した（出力画像に近似した）量子化のための入力画像を作成することができる。そして、作成した量子化のための入力画像を用いて量子化を行うことで、画像回復を目的とするディープラーニングにおいて、量子化による演算精度の低下のばらつきを抑え、さらに、演算精度の低下を抑制することが可能となる。 As described above, in this embodiment, an input image for quantization is created based on evaluation values for pixels of an output image of a neural network. This makes it possible to create an input image for quantization that accurately reproduces the characteristics of the output image of a neural network with a small number of images (approximates the output image). Then, by performing quantization using the created input image for quantization, it becomes possible to suppress the variation in the decrease in calculation accuracy due to quantization in deep learning aimed at image restoration, and further suppress the decrease in calculation accuracy.

＜第２の実施形態＞
Second Embodiment
次に、本発明の第２の実施形態の画像処理装置１００で行う量子化工程について、図１１を用いて説明する。図１１は第２の実施形態の量子化工程を示すフローチャートである。図１１のフローチャートにおいて、図２のフローチャートと同様の処理をするステップは図２と同じステップ番号を付加し、説明は省略する。 Next, the quantization process performed by the image processing device 100 according to the second embodiment of the present invention will be described with reference to FIG. 11. FIG. 11 is a flowchart showing the quantization process according to the second embodiment. In the flowchart of FIG. 11, steps that perform the same processing as in the flowchart of FIG. 2 are given the same step numbers as in FIG. 2, and their descriptions are omitted.

また前提として、ニューラルネットワークは学習が完了し、構造に関するデータとパラメータがストレージ装置１０１に保存されているものとする。ここで、学習が完了しているニューラルネットワークとは、前述の図２において、ステップＳ２０２からステップＳ２０６までと、ステップＳ２０９と、ステップＳ２１１がそれぞれ１回以上実施され、パラメータが更新された状態のニューラルネットワークのことである。 It is also assumed that the neural network has completed learning, and that data and parameters related to the structure are stored in the storage device 101. Here, a neural network that has completed learning refers to a neural network in which steps S202 to S206, S209, and S211 in FIG. 2 have each been performed at least once, and the parameters have been updated.

ステップＳ１１０１では、ＣＰＵ１０７は、学習済みのニューラルネットワークの構造に関するデータおよびパラメータをストレージ装置１０１から取得しニューラルネットワークをＧＰＵ１０８に展開する。なお、学習済みのニューラルネットワークの構造に関するデータやパラメータがＲＡＭ１０６やＲＯＭ１０４などの他の記憶装置に保存されている場合は、ＲＯＭ１０４やＲＡＭ１０６などの他の記憶装置から取得してもよい。また、ニューラルネットワークの構造は、ストレージ装置１０１から取得せずに、学習時と同一のものを再構築してもよい。 In step S1101, the CPU 107 acquires data and parameters relating to the structure of the trained neural network from the storage device 101 and deploys the neural network on the GPU 108. Note that if the data and parameters relating to the structure of the trained neural network are stored in another storage device such as the RAM 106 or ROM 104, they may be acquired from the other storage device such as the ROM 104 or RAM 106. Also, the structure of the neural network may be reconstructed to be the same as that at the time of training, without acquiring it from the storage device 101.

ステップＳ１１０２では、ＣＰＵ１０７は、ストレージ装置１０１から推論用画像を取得して、ＲＡＭ１０６に展開する。推論用画像とは、学習済みニューラルネットワークの推論時の入力画像の元となる画像である。推論用画像は１枚または複数枚でもよい。また、学習用画像の一部を推論用画像として使用してもよい。あるいは、第１の実施形態で示した方法で作成した出力画像や量子化用画像を使用してもよい。 In step S1102, the CPU 107 acquires an image for inference from the storage device 101 and expands it in the RAM 106. The image for inference is an image that is the source of the input image during inference of the trained neural network. There may be one or more images for inference. Also, a part of the image for learning may be used as the image for inference. Alternatively, an output image or an image for quantization created by the method shown in the first embodiment may be used.

次にステップＳ１１０３では、ＣＰＵ１０７は、画像処理部１０５を用いて、ステップＳ１１０２でＲＡＭ１０６に展開されている推論用画像に対して画像処理を行い、ニューラルネットワークの入力画像としてＲＡＭ１０６に展開する。画像処理はステップＳ２０３の訓練画像を作成するための処理と同様の処理を行う。 Next, in step S1103, the CPU 107 uses the image processing unit 105 to perform image processing on the inference image expanded in the RAM 106 in step S1102, and expands the image in the RAM 106 as an input image for the neural network. The image processing is the same as the processing for creating the training image in step S203.

ステップＳ１１０４では、ＣＰＵ１０７は、ステップＳ１１０３でＲＡＭ１０６に展開されている入力画像を取得し、ＧＰＵ１０８に展開されているニューラルネットワークに入力し、ステップＳ２０４と同様の処理を行う。ニューラルネットワークの出力画像はＲＡＭ１０６に展開される。 In step S1104, the CPU 107 acquires the input image expanded in the RAM 106 in step S1103, inputs it to the neural network expanded in the GPU 108, and performs the same processing as in step S204. The output image of the neural network is expanded in the RAM 106.

ステップＳ２０８では、ＣＰＵ１０７は、図２と同様に、出力画像に基づいて画素に関する評価値を算出する。 In step S208, the CPU 107 calculates an evaluation value for the pixel based on the output image, as in FIG. 2.

ステップＳ１１０５では、ＣＰＵ１０７は、すべての推論用画像をニューラルネットワークに入力したかを判定する。すべての推論用画像をニューラルネットワークに入力した場合はステップＳ２１２に進み、画素に関する評価値から量子化用画像を作成する。一方でニューラルネットワークに入力していない推論用画像が存在する場合は、ステップＳ１１０２に戻り、次の推論用画像を取得してニューラルネットワークに入力する。 In step S1105, the CPU 107 determines whether all the images for inference have been input to the neural network. If all the images for inference have been input to the neural network, the process proceeds to step S212, where an image for quantization is created from the evaluation values for the pixels. On the other hand, if there are images for inference that have not been input to the neural network, the process returns to step S1102, where the next image for inference is obtained and input to the neural network.

ステップＳ２１２からステップＳ２１５までは、図２と同様の処理である。 Steps S212 to S215 are the same as those in FIG. 2.

以上説明したように、本実施形態によれば、推論用画像を学習済みニューラルネットワークに入力した際に得られる出力画像に基づいて、第１の実施形態と同様の手法により、量子化のための入力画像を作成する。これにより、ニューラルネットワークの出力画像の特徴を少ない枚数で精度良く再現した（出力画像に近似した）量子化のための入力画像を作成することができる。そして、作成した量子化のための入力画像を用いて量子化を行うことで、画像回復を目的とするディープラーニングにおいて、量子化による演算精度の低下のばらつきを抑え、さらに、演算精度の低下を抑制することが可能となる。 As described above, according to this embodiment, an input image for quantization is created using a method similar to that of the first embodiment, based on the output image obtained when an image for inference is input to a trained neural network. This makes it possible to create an input image for quantization that accurately reproduces the characteristics of the output image of the neural network with a small number of images (approximates the output image). Then, by performing quantization using the created input image for quantization, it is possible to suppress the variation in the decrease in calculation accuracy due to quantization in deep learning aimed at image recovery, and further suppress the decrease in calculation accuracy.

なお、上記の２つの実施形態で説明した画像処理装置１００を撮像装置が備える構成であってもよく、撮像装置が撮像して取得した画像データに対して各種の画像処理を実行するようにしてもよい。 The image processing device 100 described in the above two embodiments may be included in the imaging device, and various types of image processing may be performed on image data captured and acquired by the imaging device.

本明細書の開示は、以下の画像処理装置、方法、プログラムおよび記憶媒体を含む。 The disclosure of this specification includes the following image processing device, method, program, and storage medium.

（項目１）
ニューラルネットワークの学習に使用される訓練画像を前記ニューラルネットワークに入力して得られた出力画像の特徴を表す評価値を取得する取得手段と、
前記評価値に基づいて、前記出力画像の特徴を簡易的に表わす入力画像を生成する生成手段と、
生成された前記入力画像を前記ニューラルネットワークに入力して、前記ニューラルネットワークの層を構成するパラメータのビット数を削減する削減手段と、
を備えることを特徴とする画像処理装置。
(Item 1)
An image processing device comprising:
an acquisition means for acquiring an evaluation value representing a feature of an output image obtained by inputting, into a neural network, a training image used for training the neural network;
a generating means for generating, based on the evaluation value, an input image that simply represents the feature of the output image; and
a reduction means for inputting the generated input image into the neural network and reducing the number of bits of parameters constituting a layer of the neural network.

（項目２）
前記削減手段は、前記入力画像を学習済みの前記ニューラルネットワークに入力して、前記ビット数を削減することを特徴とする項目１に記載の画像処理装置。
(Item 2)
The image processing device according to item 1, wherein the reduction means inputs the input image into the trained neural network to reduce the number of bits.

（項目３）
前記評価値は、前記出力画像の平均ヒストグラムであることを特徴とする項目１または２に記載の画像処理装置。
(Item 3)
The image processing device according to item 1 or 2, wherein the evaluation value is an average histogram of the output image.

（項目４）
前記平均ヒストグラムは、逐次計算することによって算出されることを特徴とする項目３に記載の画像処理装置。
(Item 4)
The image processing device according to item 3, wherein the average histogram is calculated by sequential calculation.

（項目５）
前記生成手段は、前記評価値の代表値に基づいて、前記入力画像を生成することを特徴とする項目１乃至４のいずれか１項目に記載の画像処理装置。
(Item 5)
The image processing device according to any one of items 1 to 4, wherein the generating means generates the input image based on a representative value of the evaluation value.

（項目６）
前記代表値は、前記入力画像と同じ個数のデータであることを特徴とする項目５に記載の画像処理装置。
(Item 6)
The image processing device according to item 5, wherein the representative value consists of the same number of data elements as the number of input images.

（項目７）
前記訓練画像と、前記ニューラルネットワークの出力画像の期待値である正解画像とは、１つの元になる画像から生成されることを特徴とする項目１乃至６のいずれか１項目に記載の画像処理装置。
(Item 7)
The image processing device according to any one of items 1 to 6, wherein the training image and a correct image, which is the expected value of the output image of the neural network, are generated from a single original image.

（項目８）
前記訓練画像は、前記元になる画像に、ノイズ付加処理または劣化処理を施すことにより生成されることを特徴とする項目７に記載の画像処理装置。
(Item 8)
The image processing device according to item 7, wherein the training image is generated by performing noise addition processing or degradation processing on the original image.

（項目９）
前記正解画像は、前記元になる画像に、前処理を施すことにより生成されることを特徴とする項目７に記載の画像処理装置。
(Item 9)
The image processing device according to item 7, wherein the correct image is generated by performing preprocessing on the original image.

（項目１０）
前記出力画像と前記ニューラルネットワークの出力画像の期待値である正解画像の誤差を算出する算出手段と、前記誤差に基づいて前記出力画像から前記評価値を取得するか否かを選択する選択手段とをさらに備えることを特徴とする項目１乃至９のいずれか１項目に記載の画像処理装置。
(Item 10)
The image processing device according to any one of items 1 to 9, further comprising a calculation means for calculating an error between the output image and a correct image, which is the expected value of the output image of the neural network, and a selection means for selecting, based on the error, whether or not to acquire the evaluation value from the output image.

（項目１１）
項目１乃至１０のいずれか１項目に記載の画像処理装置を備えることを特徴とする撮像装置。
(Item 11)
An imaging device comprising the image processing device according to any one of items 1 to 10.

（項目１２）
ニューラルネットワークの学習に使用される訓練画像を前記ニューラルネットワークに入力して得られた出力画像の特徴を表す評価値を取得する取得工程と、
前記評価値に基づいて、前記出力画像の特徴を簡易的に表わす入力画像を生成する生成工程と、
生成された前記入力画像を前記ニューラルネットワークに入力して、前記ニューラルネットワークの層を構成するパラメータのビット数を削減する削減工程と、
を有することを特徴とする画像処理方法。
(Item 12)
An image processing method comprising:
an acquisition step of acquiring an evaluation value representing a feature of an output image obtained by inputting, into a neural network, a training image used for training the neural network;
a generating step of generating, based on the evaluation value, an input image that simply represents the feature of the output image; and
a reduction step of inputting the generated input image into the neural network and reducing the number of bits of parameters constituting a layer of the neural network.

（項目１３）
項目１２に記載の画像処理方法の各工程をコンピュータに実行させるためのプログラム。
(Item 13)
A program for causing a computer to execute each step of the image processing method according to item 12.

（項目１４）
項目１２に記載の画像処理方法の各工程をコンピュータに実行させるためのプログラムを記憶したコンピュータが読み取り可能な記憶媒体。
(Item 14)
A computer-readable storage medium storing a program for causing a computer to execute each step of the image processing method according to item 12.

（他の実施形態）
また本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサーがプログラムを読出し実行する処理でも実現できる。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現できる。 Other Embodiments
The present invention can also be realized by a process in which a program for realizing one or more of the functions of the above-described embodiments is supplied to a system or device via a network or a storage medium, and one or more processors in a computer of the system or device read and execute the program. The present invention can also be realized by a circuit (e.g., ASIC) for realizing one or more of the functions.

発明は上記実施形態に制限されるものではなく、発明の精神及び範囲から離脱することなく、様々な変更及び変形が可能である。従って、発明の範囲を公にするために請求項を添付する。 The invention is not limited to the above-described embodiment, and various modifications and variations are possible without departing from the spirit and scope of the invention. Therefore, the following claims are appended to disclose the scope of the invention.

100: Image processing device, 101: Storage device, 102: Storage connection unit, 103: Storage drive unit, 104: ROM, 105: Image processing unit, 106: RAM, 107: CPU, 108: GPU, 109: Internal bus

Claims

An image processing device comprising:
an acquisition means for acquiring an evaluation value representing a feature of an output image obtained by inputting, into a neural network, a training image used for training the neural network;
a generation means for generating, based on the evaluation value, an input image that simply represents the feature of the output image; and
a reduction means for inputting the generated input image into the neural network and reducing the number of bits of parameters constituting a layer of the neural network.

The image processing device according to claim 1, characterized in that the reduction means reduces the number of bits by inputting the input image into the trained neural network.

The image processing device according to claim 1, characterized in that the evaluation value is an average histogram of the output image.

The image processing device according to claim 3, characterized in that the average histogram is calculated by sequential calculation.

The image processing device according to claim 1, characterized in that the generating means generates the input image based on a representative value of the evaluation values.

The image processing device according to claim 5, characterized in that the number of representative values is the same as the number of data items in the input image.

The image processing device according to claim 1, characterized in that the training image and a correct image, which is the expected value of the output image of the neural network, are generated from a single original image.

The image processing device according to claim 7, characterized in that the training image is generated by performing noise addition processing or degradation processing on the original image.

The image processing device according to claim 7, characterized in that the correct image is generated by performing preprocessing on the original image.

The image processing device according to claim 1, further comprising a calculation means for calculating an error between the output image and a correct image, which is an expected value of the output image of the neural network, and a selection means for selecting whether or not to obtain the evaluation value from the output image based on the error.

An imaging device comprising an image processing device according to any one of claims 1 to 10.

An image processing method comprising:
an acquisition step of acquiring an evaluation value representing a feature of an output image obtained by inputting, into a neural network, a training image used for training the neural network;
a generation step of generating, based on the evaluation value, an input image that simply represents the feature of the output image; and
a reduction step of inputting the generated input image into the neural network and reducing the number of bits of parameters constituting a layer of the neural network.

A program for causing a computer to execute each step of the image processing method according to claim 12.

A computer-readable storage medium storing a program for causing a computer to execute each step of the image processing method according to claim 12.
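
The preparation of the training image and the correct image from a single original image (claims 7 to 9) and the error-based selection (claim 10) can be sketched as follows. This is only an illustration under stated assumptions: Gaussian noise is used as the noise addition processing, clipping to the 8-bit range stands in for the preprocessing, mean squared error is used as the error, and outputs with an error at or below a threshold are the ones selected for obtaining the evaluation value. The function names, `noise_sigma`, and `max_error` are hypothetical; the claims do not fix any of these choices.

```python
import numpy as np


def make_training_pair(original, noise_sigma=5.0, rng=None):
    """Derive a (training image, correct image) pair from one original image."""
    rng = np.random.default_rng(0) if rng is None else rng
    original = original.astype(np.float32)
    # Assumed preprocessing: clip to the valid 8-bit range.
    correct = np.clip(original, 0.0, 255.0)
    # Assumed noise addition processing: additive Gaussian noise.
    noisy = original + rng.normal(0.0, noise_sigma, size=original.shape)
    training = np.clip(noisy, 0.0, 255.0).astype(np.float32)
    return training, correct


def mean_squared_error(output, correct):
    """Error between an output image and its correct image."""
    diff = output.astype(np.float64) - correct.astype(np.float64)
    return float(np.mean(diff ** 2))


def select_outputs(outputs, corrects, max_error):
    """Keep only the outputs whose error is at or below max_error."""
    return [
        out
        for out, ref in zip(outputs, corrects)
        if mean_squared_error(out, ref) <= max_error
    ]
```

Only the outputs returned by `select_outputs` would then contribute to the evaluation value, so that outputs far from the expected result do not distort it.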