JP6886887B2

JP6886887B2 - Error calculator and its program

Info

Publication number: JP6886887B2
Application number: JP2017150154A
Authority: JP
Inventors: 伶遠藤; 吉彦河合
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2017-08-02
Filing date: 2017-08-02
Publication date: 2021-06-16
Anticipated expiration: 2037-08-02
Also published as: JP2019029938A

Description

The present invention relates to an error calculator and a program therefor, and more particularly to an error calculator and a program therefor that calculate errors in the training of a neural network.

Conventionally, several techniques have been developed for automatically colorizing digitized monochrome images. A digitized monochrome image contains almost no clues to color information, which is an image feature amount. For this reason, colorizing it is said to be more difficult than colorizing an analog image recorded on a physical medium such as film. For example, a method of converting monochrome data into color data is known (see Patent Document 1). This method assumes a specific object recorded in the monochrome data, calculates a color distribution model from that object, and estimates color information from the calculated model. Because the method assumes that the target of colorization is a specific object, it is difficult to convert a monochrome image into a natural color image when the assumed object differs from the object in the monochrome image.

To solve this problem, several methods of estimating color information have been proposed in recent years that use so-called machine learning to handle a more general range of colorization targets (see Non-Patent Documents 1 and 2). These methods presuppose that a huge number of color images showing various objects are prepared, and a color information estimator is generated by machine learning on those images. In doing so, for example, the prepared color images are input to a machine learner composed of a so-called neural network or the like, which learns the correspondence between monochrome images and color information. The color information estimator generated in this way can accurately estimate color information from a wide variety of monochrome images, which makes it possible to convert a digitized monochrome image into a natural color image.

Patent Document 1: Japanese Unexamined Patent Publication No. 2016-146529

Non-Patent Document 1: Satoshi Iizuka, Edgar Simo-Serra, and Hiroshi Ishikawa, "Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification," ACM Transactions on Graphics (Proc. of SIGGRAPH), 35(4):110, 2016.
Non-Patent Document 2: Richard Zhang, Phillip Isola, and Alexei A. Efros, "Colorful Image Colorization," in ECCV 2016.

However, the prior art calculates an error based on values obtained by comparing the estimated color information (for example, an image feature amount) with the true color information (for example, an image feature amount) independently for each pixel. Consequently, learning proceeds only in the direction that reduces the per-pixel error, and properties such as the color uniformity between a pixel and its adjacent pixels are not considered at all. As a result, regions that should be estimated to have the same color tend to become mottled in the estimated color information. For example, a cloudless sky should generally be uniformly blue, but the prior art produces unnatural color unevenness that looks incorrect to the human eye.

The present invention has been made in view of the above problems. Its object is to provide an error calculator and a program therefor that calculate the error between the estimated value and the true value, which is computed when learning the parameters of a neural network, in such a way that an image that looks more correct to the human eye can be output.

To solve the above problems, the error calculator according to the present invention is an error calculator that calculates the error between a first image feature amount, which is estimated color information, and a second image feature amount, which is true color information, and comprises feature map creating means, first error calculating means, and error synthesizing means.

With this configuration, the error calculator uses the feature map creating means to extract, by a predetermined operation, feature amounts that characterize the relationship among a plurality of pixels in the image from the first image feature amount and from the second image feature amount, thereby creating a first feature map and a second feature map, respectively.
The error calculator then uses the first error calculating means to calculate the error between the feature maps, based on the pixel-value errors between corresponding pixels of the first feature map and the second feature map.
The error calculator then uses the error synthesizing means to receive the error between the image feature amounts, which is calculated based on the pixel-value errors between corresponding pixels of the first image feature amount and the second image feature amount, and adds the error between the image feature amounts and the error between the feature maps to generate a composite error.

The present invention can also be realized as an error calculation program that causes a computer to function as the error calculator.

The present invention provides the following advantageous effects.
The error calculator according to the present invention calculates the error between two feature maps, each of which reflects the relationship among a plurality of pixels, and obtains a composite error by adding this error between the feature maps to the error based on values obtained by comparing each pixel independently.
Therefore, if a learner learns the parameters of a neural network so as to minimize this composite error, it can also learn the relationship between a pixel and its adjacent pixels. An image based on the image feature amount estimated by a learner trained with this composite error therefore looks more correct to the human eye.
Accordingly, by using the error calculator according to the present invention to train a neural network that outputs color information, a color image with reduced unnatural color unevenness can be generated.

FIG. 1 is a block diagram schematically showing a learning device including the error calculator according to an embodiment of the present invention.
FIG. 2 is an explanatory diagram schematically showing the creation of a feature map by the error calculator according to the embodiment of the present invention, in which (a) shows an example of an input image and (b) and (c) show examples of filters.
FIG. 3 is an explanatory diagram schematically showing a method of creating a feature map of the same size as the input image, in which (a) shows an example of an input image and (b) shows an example of an extended input image.
FIG. 4 is a block diagram schematically showing an automatic coloring device including a color information magnifier that uses the error calculator according to the embodiment of the present invention for learning.
FIG. 5 is a block diagram schematically showing the learning flow of the low-resolution color information estimator shown in FIG. 4.
FIG. 6 is a block diagram schematically showing the learning flow of the color information magnifier shown in FIG. 4.
FIG. 7 is a block diagram schematically showing an example of the color information magnifier shown in FIG. 4.
FIG. 8 is an explanatory diagram schematically showing another example of the color information magnifier shown in FIG. 4.

Hereinafter, the error calculator according to an embodiment of the present invention will be described with reference to the drawings.
The learning device S shown in FIG. 1 includes a learner 60 and an error calculator 40.
In the learning device S, the error calculator 40 calculates the error between the estimated color information, which is the first image feature amount 401, and the true color information, which is the second image feature amount 402, and the learner 60 learns the parameters that constitute the neural network so as to minimize the error calculated by the error calculator 40.
Here, the learner 60 is composed of a neural network that outputs the image feature amount described in detail below. In the following description, an image feature amount means a quantity representing a color space, such as luminance, chromaticity, or saturation, and includes, for example, a mean, a variance, or a convolution value extracted from such quantities. The per-pixel set of image feature amounts means, for example, a monochrome image (monochrome information) or color information. An image feature amount may be handled as a matrix whose elements are arranged in the height and width directions, or as a one-dimensional multivariable vector.

Here, a monochrome image specifically means an image consisting only of the luminance channel of a color space (such as the V channel of the HSV color space or the L channel of the Lab color space). When the pixel information is luminance, the pixel value (luminance value) takes a value from 0 to 255 when expressed with 8 bits. The monochrome information, which is the image feature amount of the monochrome image, is represented by, for example, a luminance distribution. In this specification, monochrome information is used with the same meaning as monochrome image. Color information can be, for example, the image feature amounts of the two channels other than the luminance channel.

In the following examples, the learner 60 is described, as one example, as being composed of a neural network that outputs color information as the image feature amount. The learner 60 is used to create, for example, a color information estimator that estimates color information from a monochrome image, or a color information magnifier that estimates high-resolution color information from low-resolution color information. The learner 60 holds internally the parameters (parameter group) that constitute the neural network and, for an input given as training data, outputs an estimated value that depends on these internal parameters. The learner 60 adjusts its output value by changing the internal parameters.

The error calculator 40 calculates, for example, the error between estimated color information and true color information, which is used for training in which the learner 60 estimates color information from color information (an image feature amount) by means of a neural network. As shown in FIG. 1, the error calculator 40 includes feature map creating means 41, first error calculating means 42, and error synthesizing means 43.
The error calculator 40 extracts, by a predetermined operation, feature amounts that characterize the relationship among a plurality of pixels in the image from the first image feature amount 401 and the second image feature amount 402 to create a first feature map 403 and a second feature map 404. It calculates the error between the feature maps based on the pixel-value errors between corresponding pixels of the first feature map 403 and the second feature map 404. It then receives the error between the image feature amounts, calculated based on the pixel-value errors between corresponding pixels of the first image feature amount 401 and the second image feature amount 402, and adds the error between the image feature amounts and the error between the feature maps to generate a composite error.
As in a conventional error calculator, the error calculator 40 also includes second error calculating means 51 and minimizing means 52. The error calculator 40 is implemented on, for example, a general-purpose computer and includes an arithmetic unit such as a CPU (Central Processing Unit), storage such as a ROM (Read Only Memory), a RAM (Random Access Memory), and an HDD (Hard Disk Drive), and an input/output interface.

(Second error calculating means 51)
The second error calculating means 51 uses the first image feature amount 401 and the second image feature amount 402 to calculate the error between these two image feature amounts. Here, the first image feature amount 401 is the estimated color information that the learner 60 outputs, according to its internal parameters, for an input value given as training data. For example, the first image feature amount 401 is high-resolution color information (estimated color information) that the learner 60 has estimated from low-resolution color information (an image feature amount) input to the learner 60 as training data.

The second image feature amount 402 is, for example, the correct high-resolution color information (true color information) prepared as training data for the learner 60.
The second error calculating means 51 calculates the error between the color information (between the image feature amounts), based on the pixel-value errors between corresponding pixels of the first image feature amount 401 and the second image feature amount 402, and outputs it to the error synthesizing means 43.

As the error between the color information, for example, the mean squared error or the cross entropy is used, as in the conventional method. As a specific example, the loss function (Loss) defined by the following Equation (1) represents the same mean squared error as the conventional method.
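Equation (1) itself is not reproduced in this text. A reconstruction consistent with the symbol definitions that follow, assuming normalization by H × W × C and indices that start at 1, would be:

```latex
\mathrm{Loss} = \frac{1}{H \times W \times C}
  \sum_{h=1}^{H} \sum_{w=1}^{W} \sum_{c=1}^{C}
  \left( y_{h,w,c} - y^{gt}_{h,w,c} \right)^{2}
  \tag{1}
```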

Here, H is the number of pixels in the vertical direction of the image input as color information, and W is the number of pixels in the horizontal direction of that image. C is the number of channels of the color information to be estimated, and usually C = 2.
$y_{h,w,c}$ is the estimated color information in channel c of the pixel located at coordinates (w, h) on the image.
$y^{gt}_{h,w,c}$ is the value of the true color information in channel c of the pixel located at coordinates (w, h) on the image. In general, Equation (1) is often written as the following Equation (2), by replacing H × W × C with N and replacing the indices h, w, c collectively with a single index i.
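Equation (2) is not reproduced in this text. Under the substitution just described (N = H × W × C and a single index i running over all pixels and channels), it would presumably read:

```latex
\mathrm{Loss} = \frac{1}{N} \sum_{i=1}^{N} \left( y_{i} - y^{gt}_{i} \right)^{2}
  \tag{2}
```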

(Feature map creating means 41)
The feature map creating means 41 extracts, by a predetermined operation, feature amounts that characterize the relationship among a plurality of pixels in the image from the first image feature amount 401 and the second image feature amount 402, which are color information, and creates the first feature map 403 and the second feature map 404, respectively. Here, a feature amount that characterizes the relationship among a plurality of pixels is one that describes how several pixels in the color information (image feature amount) relate to one another; it is not a feature obtained independently from a single pixel. The relationship between pixels is represented by, for example, the difference between the two pixel values concerned. The pixel of interest and its peripheral pixels may be separated from or adjacent to each other.

Spatial filtering may be used to extract feature amounts that characterize the relationship among a plurality of adjacent pixels. In the present embodiment, the feature map creating means 41 applies a predetermined filter to the first image feature amount 401 or the second image feature amount 402, which is color information, thereby extracting feature amounts that characterize the relationship among adjacent pixels and creating the first feature map 403 or the second feature map 404.

The purpose and function of the filter are not particularly limited as long as it can be used for spatial filtering. For example, a contour extraction filter, a noise removal filter, a smoothing filter, a moving average filter, or a nonlinear filter such as a median filter may be used.

In the present embodiment, as one example, the feature map creating means 41 is described as creating an edge map as the first feature map 403 or the second feature map 404 by using an edge filter that detects edges contained in an image based on the first image feature amount 401 or the second image feature amount 402, which is color information. An edge map is a map (gradient map) that represents the relationship with peripheral pixels as the gradient of pixel values.

As the edge filter, a first-order differential filter such as a Sobel filter or a Prewitt filter can be used, for example. The edge filter is not limited to first-order differential filters; a second-order differential filter such as a Laplacian filter may also be used.
The filter may be, for example, a 3 × 3 filter that takes the 8 surrounding pixels into account, or a 5 × 5 filter that takes the 24 surrounding pixels into account.
Furthermore, the shape of the filter is not limited to a square; for example, it may be a shape that takes into account only the four pixels adjacent above, below, left, and right.
In the following, as one example, a gradient map (edge map) is created using a Sobel filter.
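The 3 × 3 Sobel filtering described above can be sketched as follows. This is a minimal illustration in plain Python, not the embodiment's implementation: the function names `filter3x3` and `gradient_maps` are hypothetical, a channel is assumed to be a list of rows, the kernel coefficients are the standard Sobel ones, and border pixels are simply skipped, so the output is smaller than the input by two pixels in each direction. How the horizontal and vertical responses are combined into one map is not specified here, so the sketch returns them separately.

```python
# Standard 3x3 Sobel kernels for the horizontal (x) and vertical (y) derivatives.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def filter3x3(channel, kernel):
    """Apply a 3x3 kernel to one H x W channel (a list of rows), without padding.

    Each output value is a weighted sum over a pixel and its 8 neighbours,
    so the result has size (H - 2) x (W - 2).
    """
    height, width = len(channel), len(channel[0])
    out = []
    for h in range(1, height - 1):
        row = []
        for w in range(1, width - 1):
            acc = 0.0
            for dh in (-1, 0, 1):
                for dw in (-1, 0, 1):
                    acc += kernel[dh + 1][dw + 1] * channel[h + dh][w + dw]
            row.append(acc)
        out.append(row)
    return out


def gradient_maps(channel):
    """Return the horizontal and vertical Sobel responses of one channel."""
    return filter3x3(channel, SOBEL_X), filter3x3(channel, SOBEL_Y)
```

For a perfectly uniform channel both responses are zero everywhere, which is why a uniformly colored region contributes nothing to an error computed on such maps, whereas a step between neighbouring pixels produces a non-zero response.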

The feature map creating means 41 creates the first feature map 403 (gradient map) from the first image feature amount 401 (estimated color information) and outputs it to the first error calculating means 42.
The feature map creating means 41 creates the second feature map 404 (gradient map) from the second image feature amount 402 (true color information) and outputs it to the first error calculating means 42.
The first feature map 403 is, for example, a gradient map of the estimated high-resolution color information (hereinafter referred to as the gradient map of the estimated color information).
The second feature map 404 is, for example, a gradient map of the prepared correct high-resolution color information (hereinafter referred to as the gradient map of the true color information).
The details of how the feature map creating means 41 creates the gradient map are described later.

(First error calculating means 42)
The first error calculating means 42 calculates the error between the gradient maps (between the feature maps), based on the pixel-value errors between corresponding pixels of the first feature map 403 and the second feature map 404. Using the gradient map of the estimated color information and the gradient map of the true color information, the first error calculating means 42 calculates the error between the gradient maps and outputs it to the error synthesizing means 43. Although the first error calculating means 42 differs from the second error calculating means 51 in its input, it can use the same error calculation method as the second error calculating means 51.

(Error synthesizing means 43)
The error synthesizing means 43 adds the error between the color information (between the image feature amounts) of the first image feature amount 401 and the second image feature amount 402 to the error between the gradient maps (between the feature maps) of the first feature map 403 and the second feature map 404, and generates a composite error 405.
The error synthesizing means 43 receives, from the second error calculating means 51, the error between the estimated color information and the true color information.
The error synthesizing means 43 receives, from the first error calculating means 42, the error between the gradient map of the estimated color information and the gradient map of the true color information.
The error synthesizing means 43 calculates, as the composite error 405, the sum of the error between the color information obtained from the second error calculating means 51 and the error between the gradient maps obtained from the first error calculating means 42. The error synthesizing means 43 outputs the calculated composite error 405 to the minimizing means 52.
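The data flow among the three means can be sketched as below. This is an illustrative outline only: the function names are hypothetical, the mean squared error is assumed for both terms, and the feature map is simplified to differences between horizontally adjacent pixels rather than the Sobel-based gradient map of the embodiment.

```python
def flatten(image):
    """Flatten an H x W image (a list of rows) into a single list of values."""
    return [value for row in image for value in row]


def mean_squared_error(a, b):
    """Mean of the squared element-wise differences of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def horizontal_differences(image):
    """Simplified stand-in for a feature map: right neighbour minus pixel, flattened."""
    return [row[w + 1] - row[w] for row in image for w in range(len(row) - 1)]


def composite_error(estimated, true, feature_map=horizontal_differences):
    """Add the per-pixel error and the error between the two feature maps."""
    # Role of the second error calculating means 51: error between image feature amounts.
    pixel_error = mean_squared_error(flatten(estimated), flatten(true))
    # Roles of the feature map creating means 41 and the first error calculating means 42.
    map_error = mean_squared_error(feature_map(estimated), feature_map(true))
    # Role of the error synthesizing means 43: plain sum of the two errors.
    return pixel_error + map_error
```

Passing the feature-map function as an argument mirrors the statement that the filter is not limited to one type; any function that maps an image to a list of relational features could be substituted.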

(Minimizing means 52)
The minimizing means 52 sequentially receives pairs of the first image feature amount 401 and the second image feature amount 402, adjusts the parameters of the learner 60 by a predetermined operation so that the composite error 405 for each input pair becomes smaller, and supplies the adjusted parameters to the learner 60 as update parameters 406.

The minimizing means 52 adjusts the parameters of the learner 60 so that the composite error 405 becomes smaller, using an optimization method based on the error gradient, such as SGD (stochastic gradient descent). SGD is described in the following reference, so its description is omitted here.
(Reference) L. Bottou, "Stochastic Gradient Descent Tricks," in Neural Networks: Tricks of the Trade, Springer, 2012.

ここで、従来技術では、不自然な色むらが発生してしまうという問題について説明する。
例えば、雲のない空の風景画像（カラー推定された画像）において、１０個の画素からなる一列の画素領域を想定する。そして、正しい画素値を例えば「Ｂ」とし、この一列の１番目から５番目までの画素についてそれぞれ推定された画素値が「Ｂ＋１０」、６番目から１０番目までの画素についてそれぞれ推定された画素値が「Ｂ−１０」であるとする。
このとき、従来の誤差計算器は、画素ごとに独立に比較する誤差計算を行う。この例では、この一列の画素値は、正しい画素値との差分（絶対値）がすべて「１０」であることから、従来の誤差計算器は、誤差が最小化されたものと判定する場合がある。よって、このような学習をして作成された推定器を用いると、この一列の１番目から５番目までの画素に対して推定される色と、この一列の６番目から１０番目までの画素に対して推定される色と、が異なってしまうことなる。
これに対して、本実施形態の誤差計算器４０は、複数の画素の関係性として、隣接する画素との差分も計算しているので、この一列の５番目の画素値と６番目の画素値との間に、２０もの大きなギャップがあることを検知し、この一列のすべての画素値が「Ｂ＋１０」となるときよりも誤差が大きい、と判定することが期待できる。
つまり、従来技術では、一様に青くなるべき雲のない空の風景が斑になるなど、不自然な色むらが発生するところを、誤差計算器４０は、注目画素とその周辺画素との色の均一性が保存された画像、すなわち、人間の目から見てより正しい画像が出力できるような誤差を計算することができる。 Here, the problem that unnatural color unevenness occurs in the prior art will be described.
For example, consider a row of 10 pixels in a landscape image of a cloudless sky (a color-estimated image). Let the correct pixel value be, for example, "B", and suppose the estimated value of each of the 1st to 5th pixels in the row is "B + 10" and the estimated value of each of the 6th to 10th pixels is "B − 10".
In this case, a conventional error calculator computes the error by comparing each pixel independently. In this example, the difference (absolute value) from the correct pixel value is "10" for every pixel in the row, so the conventional error calculator may judge that the error has been minimized. With an estimator created by such learning, the color estimated for the 1st to 5th pixels of the row therefore ends up differing from the color estimated for the 6th to 10th pixels.
In contrast, the error calculator 40 of the present embodiment also calculates the difference between adjacent pixels as a relationship among multiple pixels. It can therefore be expected to detect the large gap of 20 between the 5th and 6th pixel values in the row, and to judge that the error is larger than when all pixel values in the row are "B + 10".
In other words, where the prior art produces unnatural color unevenness, such as a mottled cloudless sky that should be uniformly blue, the error calculator 40 can calculate an error that leads to an image in which color uniformity between the pixel of interest and its surrounding pixels is preserved, that is, an image that looks more correct to the human eye.
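The 10-pixel example above can be checked numerically. The following is a minimal sketch, not the patented implementation: the value B = 100, the function names, and the use of simple adjacent differences as a one-dimensional stand-in for the gradient map are all assumptions made for illustration.

```python
def pixel_mse(estimated, truth):
    """Mean squared error with every pixel compared independently."""
    return sum((e - t) ** 2 for e, t in zip(estimated, truth)) / len(truth)


def adjacent_diffs(row):
    """Differences between horizontally adjacent pixels (a 1-D gradient map)."""
    return [right - left for left, right in zip(row, row[1:])]


def composite_error(estimated, truth, alpha=0.1):
    """Per-pixel error plus a weighted error between the 1-D gradient maps."""
    gradient_error = pixel_mse(adjacent_diffs(estimated), adjacent_diffs(truth))
    return pixel_mse(estimated, truth) + alpha * gradient_error


B = 100  # hypothetical "correct" value for every pixel in the row
truth = [B] * 10
split_row = [B + 10] * 5 + [B - 10] * 5  # the mottled estimate from the text
uniform_row = [B + 10] * 10              # a uniformly offset estimate

# The per-pixel error cannot tell the two estimates apart ...
print(pixel_mse(split_row, truth), pixel_mse(uniform_row, truth))
# ... whereas the composite error penalizes the jump of 20 between pixels 5 and 6.
print(composite_error(split_row, truth), composite_error(uniform_row, truth))
```

With these assumptions the per-pixel error is identical for both rows, while the composite error is strictly larger for the split row, which is the behavior the paragraph above expects from the error calculator 40.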

以下、数式を用いて、誤差計算器４０の説明を続ける。
本実施形態では、最小化手段５２は、例えば、次の式（３）で定義された損失関数（Loss）を用いて、学習器６０のパラメータの更新を行う。 Hereinafter, the description of the error computer 40 will be continued using a mathematical formula.
In the present embodiment, the minimization means 52 updates the parameters of the learner 60 by using, for example, the loss function (Loss) defined by the following equation (3).

誤差計算器４０は、学習に用いる誤差（合成誤差４０５）としては、従来手法と同様の平均二乗誤差や交差エントロピーに加えて、複数画素から計算する誤差を用いる。式（３）は、従来手法と同様の平均二乗誤差と、複数画素から計算する誤差としての、勾配マップにおける平均二乗誤差と、を用いる場合の例を示している。 As the error used for learning (composite error 405), the error calculator 40 uses an error calculated from a plurality of pixels in addition to the mean square error and cross entropy similar to those in the conventional method. Equation (3) shows an example in which the mean square error similar to the conventional method and the mean square error in the gradient map as the error calculated from a plurality of pixels are used.

具体的には、式（３）の第１項は、従来手法と同様の平均二乗誤差を表している。この式（３）の第１項は、前記した第２誤差算出手段５１の処理に相当する。なお、従来手法は、式（３）の第１項を最小化するように学習器のパラメータの更新を行う。 Specifically, the first term of the equation (3) represents the mean square error similar to that of the conventional method. The first term of this equation (3) corresponds to the processing of the second error calculating means 51 described above. In the conventional method, the parameters of the learner are updated so as to minimize the first term of the equation (3).

一方、式（３）の第２項は、複数画素から計算する誤差として、特徴量マップ作成手段４１で作成した勾配マップにおける平均二乗誤差を表している。この式（３）の第２項は、前記した第１誤差算出手段４２の処理に相当する。また、式（３）の第１項と第２項との和は、前記した誤差合成手段４３の処理に相当する。 On the other hand, the second term of the equation (3) represents the mean square error in the gradient map created by the feature map creating means 41 as an error calculated from a plurality of pixels. The second term of this equation (3) corresponds to the processing of the first error calculating means 42 described above. Further, the sum of the first term and the second term of the equation (3) corresponds to the processing of the error synthesis means 43 described above.
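Equation (3) itself is not reproduced in this text. A form consistent with the description of its two terms might look like the following. The symbols y and y^gt for the estimated and true color information, the image size H × W, the channel count C, and the exact normalization constants are assumptions; only α, H^grad, W^grad, K, g and g^gt are named in the surrounding paragraphs.

```latex
\mathrm{Loss} =
  \frac{1}{HWC} \sum_{h=1}^{H} \sum_{w=1}^{W} \sum_{c=1}^{C}
    \left( y_{h,w,c} - y^{gt}_{h,w,c} \right)^{2}
  + \frac{\alpha}{H^{grad} W^{grad} C K}
    \sum_{h=1}^{H^{grad}} \sum_{w=1}^{W^{grad}} \sum_{c=1}^{C} \sum_{k=1}^{K}
    \left( g_{h,w,c,k} - g^{gt}_{h,w,c,k} \right)^{2}
```

In this reading, the first term is the conventional per-pixel mean squared error and the second term is the mean squared error between the gradient maps, weighted by α.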

式（３）の第２項において、αは複数画素から計算する誤差の重み係数である。どのような学習器を作るかにもよるが、αは０．１程度の小さめの値が望ましい。
Ｈ^gradは、勾配マップの縦方向のサイズ（画素数）であり、Ｗ^gradは、勾配マップの横方向のサイズ（画素数）である。
Ｋは勾配マップを計算するのに用いるフィルタの個数を表している。 In the second term of the equation (3), α is a weighting coefficient of the error calculated from a plurality of pixels. Although it depends on what kind of learning device is made, it is desirable that α is a small value of about 0.1.
H^grad is the vertical size (number of pixels) of the gradient map, and W^grad is the horizontal size (number of pixels) of the gradient map.
K represents the number of filters used to calculate the gradient map.

また、式（３）の第２項において、ｇ_h,w,c,kは、推定カラー情報から作成された勾配マップ上で座標（ｗ，ｈ）に位置する画素のチャンネルｃにおける値（以下、推定カラー情報についての勾配マップの値という）である。
ｇ^gt _h,w,c,kは、真のカラー情報から作成された勾配マップ上で座標（ｗ，ｈ）に位置する画素のチャンネルｃにおける値（以下、真のカラー情報についての勾配マップの値という）である。
このうち、推定カラー情報についての勾配マップの値は、特徴量マップ作成手段４１によって、例えば、次の式（４）に基づいて算出される。 Further, in the second term of equation (3), g_{h,w,c,k} is the value in channel c of the pixel located at coordinates (w, h) on the gradient map created from the estimated color information (hereinafter referred to as the gradient map value for the estimated color information).
g^gt_{h,w,c,k} is the value in channel c of the pixel located at coordinates (w, h) on the gradient map created from the true color information (hereinafter referred to as the gradient map value for the true color information).
Of these, the value of the gradient map for the estimated color information is calculated by the feature map creating means 41, for example, based on the following equation (4).

式（４）において、Ｍ_kは、k番目のフィルタの縦方向のサイズであり、Ｎ_kは、k番目のフィルタの横方向のサイズである。
ｓ（１≦ｓ≦Ｍ_k）は、k番目のフィルタの縦方向に配列された各係数の識別子であり、列の上端が１で表される。
ｔ（１≦ｔ≦Ｎ_k）は、k番目のフィルタの横方向に配列された各係数の識別子であり、行の左端が１で表される。
ω_s,t,kはフィルタ係数であり、k番目のフィルタのｓ行ｔ列に配置された係数である。 In equation (4), M_k is the vertical size of the k-th filter, and N_k is the horizontal size of the k-th filter.
s (1 ≦ s ≦ M_k) is the index of each coefficient arranged in the vertical direction of the k-th filter, with 1 denoting the top of the column.
t (1 ≦ t ≦ N_k) is the index of each coefficient arranged in the horizontal direction of the k-th filter, with 1 denoting the left end of the row.
ω_{s,t,k} is a filter coefficient, namely the coefficient located in row s, column t of the k-th filter.

同様に、真のカラー情報についての勾配マップの値は、特徴量マップ作成手段４１によって、例えば、次の式（５）に基づいて算出される。 Similarly, the value of the gradient map for the true color information is calculated by the feature map creating means 41, for example, based on the following equation (5).
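Equations (4) and (5) are likewise not reproduced in this text. Given the definitions of M_k, N_k, s, t and ω_{s,t,k}, one consistent reading is a sliding-window weighted sum with no padding. The symbols y and y^gt for the estimated and true color information and the index offsets are assumptions.

```latex
g_{h,w,c,k} = \sum_{s=1}^{M_k} \sum_{t=1}^{N_k}
  \omega_{s,t,k} \; y_{h+s-1,\; w+t-1,\; c}
\qquad
g^{gt}_{h,w,c,k} = \sum_{s=1}^{M_k} \sum_{t=1}^{N_k}
  \omega_{s,t,k} \; y^{gt}_{h+s-1,\; w+t-1,\; c}
```

Under this reading h runs from 1 to H − (M_k − 1) and w from 1 to W − (N_k − 1), so the gradient map is smaller than the input by M_k − 1 rows and N_k − 1 columns.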

これら式（４）および式（５）は、特徴量マップ作成手段４１の処理に相当する。
特徴量マップ作成手段４１が勾配マップを作成するために、垂直方向のエッジおよび水平方向のエッジをそれぞれ検出するＳｏｂｅｌフィルタを用いる場合、次の条件が設定される。
（条件）
フィルタ数Ｋ＝２
k＝１番目のフィルタに関して縦方向のサイズＭ_k=1＝３
k＝１番目のフィルタに関して横方向のサイズＮ_k=1＝３
k＝２番目のフィルタに関して縦方向のサイズＭ_k=2＝３
k＝２番目のフィルタに関して横方向のサイズＮ_k=2＝３
k＝１番目のフィルタが、垂直方向のエッジを検出するフィルタの場合、つまり、水平方向の差分（勾配）を検出する場合、そのフィルタ係数ω_k=1は、図２（ｂ）で表される。
k＝２番目のフィルタが、水平方向のエッジを検出するフィルタの場合、つまり、垂直方向の差分（勾配）を検出する場合、そのフィルタ係数ω_k=2は、図２（ｃ）で表される。 These equations (4) and (5) correspond to the processing of the feature amount map creating means 41.
When the feature map creating means 41 uses a Sobel filter that detects vertical edges and horizontal edges to create a gradient map, the following conditions are set.
(conditions)
Number of filters K = 2
Vertical size of the first filter (k = 1): M_{k=1} = 3
Horizontal size of the first filter (k = 1): N_{k=1} = 3
Vertical size of the second filter (k = 2): M_{k=2} = 3
Horizontal size of the second filter (k = 2): N_{k=2} = 3
When the first filter (k = 1) is a filter that detects vertical edges, that is, one that detects horizontal differences (gradients), its filter coefficients ω_{k=1} are as shown in FIG. 2(b).
When the second filter (k = 2) is a filter that detects horizontal edges, that is, one that detects vertical differences (gradients), its filter coefficients ω_{k=2} are as shown in FIG. 2(c).

特徴量マップ作成手段４１は、例えば入力する推定カラー情報である第１の画像特徴量４０１として、図２（ａ）に例示するような画素値があったとき、図２（ｂ）に例示する垂直方向のエッジを検出するフィルタを用いて、３×３の領域をスキャンする。例えば、図２（ａ）において２行２列目の画素を注目画素とした場合、その注目画素を中心とする３×３の領域の画素値と、フィルタ係数ω_k=1をコンボリューション（畳み込み）した結果は、２０となる。
同様に、２行３列目の画素を注目画素としたときの計算結果は、２０となる。このフィルタ（エッジフィルタ）は、エッジがあるところほど、値が高くなる。
一方、３×３の領域の画素値がすべて等しい領域に、このフィルタを適用すると、値が０となる。つまり、３行２列目の画素や３行３列目の画素を注目画素としたときの計算結果は、共に０となる。 For example, when the first image feature amount 401, which is the input estimated color information, has pixel values such as those illustrated in FIG. 2(a), the feature map creating means 41 scans each 3 × 3 region using the vertical-edge detection filter illustrated in FIG. 2(b). For example, when the pixel in row 2, column 2 of FIG. 2(a) is taken as the pixel of interest, convolving the pixel values of the 3 × 3 region centered on that pixel with the filter coefficients ω_{k=1} yields 20.
Similarly, the result when the pixel in row 2, column 3 is taken as the pixel of interest is 20. The output of this filter (an edge filter) is larger where an edge is present.
In contrast, applying this filter to a 3 × 3 region whose pixel values are all equal yields 0. That is, the results when the pixel in row 3, column 2 or the pixel in row 3, column 3 is taken as the pixel of interest are both 0.

同様に、特徴量マップ作成手段４１は、例えば入力する推定カラー情報である第１の画像特徴量４０１として、図２（ａ）に例示するような画素値があったとき、図２（ｃ）に例示する水平方向のエッジを検出するフィルタ（フィルタ係数ω_k=2）を用いて、３×３の領域をスキャンする。特徴量マップ作成手段４１は、このときに得られた計算結果と、垂直方向のエッジを検出するフィルタ（フィルタ係数ω_k=1）を用いたときの計算結果とを足し合わせることで、エッジマップ（勾配マップ）を作成する。 Similarly, when the first image feature amount 401, which is the input estimated color information, has pixel values such as those illustrated in FIG. 2(a), the feature map creating means 41 scans each 3 × 3 region using the horizontal-edge detection filter (filter coefficients ω_{k=2}) illustrated in FIG. 2(c). The feature map creating means 41 then creates an edge map (gradient map) by adding the result obtained here to the result obtained with the vertical-edge detection filter (filter coefficients ω_{k=1}).
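The scanning described above can be sketched as follows. This is an illustration under stated assumptions: the kernels are the standard 3 × 3 Sobel kernels, assumed to correspond to FIG. 2(b) and FIG. 2(c); the 4 × 4 test images are hypothetical and are not the values of FIG. 2(a); and the sliding window is applied without flipping the kernel, as is usual for convolution layers in neural networks.

```python
# Standard 3x3 Sobel kernels; assumed to correspond to FIG. 2(b) and FIG. 2(c).
SOBEL_VERTICAL_EDGE = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # horizontal gradient
SOBEL_HORIZONTAL_EDGE = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical gradient


def filter_response(image, kernel):
    """Slide the kernel over the image without padding (valid region only)."""
    m, n = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - (m - 1), len(image[0]) - (n - 1)
    return [
        [
            sum(kernel[s][t] * image[h + s][w + t] for s in range(m) for t in range(n))
            for w in range(out_w)
        ]
        for h in range(out_h)
    ]


def gradient_map(image):
    """Add the responses of the two filters to obtain the edge (gradient) map."""
    gx = filter_response(image, SOBEL_VERTICAL_EDGE)
    gy = filter_response(image, SOBEL_HORIZONTAL_EDGE)
    return [[a + b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(gx, gy)]


# Hypothetical single-channel images: one with a vertical edge, one flat.
edge_image = [[10, 10, 20, 20] for _ in range(4)]
flat_image = [[7] * 4 for _ in range(4)]

print(gradient_map(edge_image))  # non-zero wherever the window straddles the edge
print(gradient_map(flat_image))  # all zeros: no edge, no response
```

The output is two pixels smaller than the input in each direction, because a 3 × 3 window fits only H − 2 times vertically and W − 2 times horizontally.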

こうして作成された勾配マップに対して、第１誤差算出手段４２が、第２誤差算出手段５１と同様の誤差計算を適用することで、勾配マップ間の誤差を算出することができる。なお、第１誤差算出手段４２および第２誤差算出手段５１は、誤差計算式に関して、平均二乗誤差ではなく交差エントロピーなど他の誤差関数を用いてもよい。また、第２誤差算出手段５１が行う式（３）の第２項の演算において、厳密に式（３）の第２項と同じ数式である必要はなく、複数の画素に関連する値の間で所定の演算を行うものであれば構わない。 By applying the same error calculation as the second error calculating means 51 to the gradient maps created in this way, the first error calculating means 42 can calculate the error between the gradient maps. For the error formula, the first error calculating means 42 and the second error calculating means 51 may use another error function, such as cross entropy, instead of the mean squared error. Further, the calculation of the second term of equation (3) performed by the second error calculating means 51 need not be exactly the same formula as the second term of equation (3); any predetermined calculation among values related to multiple pixels will do.

また、前記した具体例で作成されるエッジマップのサイズ（Ｈ^grad×Ｗ^grad）は、入力画像のサイズ（Ｈ×Ｗ）よりも縦横２画素ずつ小さくなる。フィルタサイズを一般化して説明すると、エッジマップのサイズは、Ｈ^grad＝Ｈ−（Ｍ_k−１）、Ｗ^grad＝Ｗ−（Ｎ_k−１）となる。ただし、エッジマップのサイズを、入力画像のサイズよりも必ずしも小さくする必要はなく、入力画像と同じサイズのエッジマップを次のように作成してもよい。 Further, the size of the edge map (H ^grad × W ^grad ) created in the specific example described above is smaller than the size of the input image (H × W) by two pixels in the vertical and horizontal directions. To explain the filter size in general, the size of the edge map is H ^grad = H − (M _k -1) and W ^grad = W − (N _k -1). However, the size of the edge map does not necessarily have to be smaller than the size of the input image, and an edge map having the same size as the input image may be created as follows.

例えば、入力画像のサイズを、一時的に、（Ｈ＋（Ｍ_k−１））×（Ｗ＋（Ｎ_k−１））のサイズへ拡張することで、入力画像と同サイズのエッジマップが得られる。上記サイズへ拡張するには、補間する画素を、元画像の外周に沿うように並べながら生成すればよい。ここで、サイズ拡張のために一時的に生成された画素の値を決定する方法としては、例えば、生成されるすべての画素に対して元画像の平均画素値を付与する方法や、生成される各画素に対して最も近い元画像の画素の値を付与する方法などを採用することができる。 For example, by temporarily expanding the input image to a size of (H + (M_k − 1)) × (W + (N_k − 1)), an edge map of the same size as the input image is obtained. To expand to this size, the pixels to be interpolated are generated so that they line up along the outer periphery of the original image. As a method of determining the values of the pixels temporarily generated for the size expansion, it is possible to adopt, for example, a method that assigns the average pixel value of the original image to all generated pixels, or a method that assigns to each generated pixel the value of the nearest pixel of the original image.

このうち、平均画素値を付与する方法を用いる場合、図３（ａ）に示すような３×３の画像に対しては、元画像の周囲を埋めるように生成されるすべての画素に対して、画素値「４５」を付与すればよい。
また、最も近い画素の値を付与する方法を用いる場合、例えば、図３（ａ）に示すような３×３の画像に対しては、元画像の周囲を埋めるように生成される各画素には、図３（ｂ）にハッチングで示す画素の値をそれぞれ付与すればよい。 Of these, when the method of assigning the average pixel value is used, for a 3 × 3 image such as that shown in FIG. 3(a), the pixel value "45" is assigned to all pixels generated to fill the periphery of the original image.
When the method of assigning the value of the nearest pixel is used, for a 3 × 3 image such as that shown in FIG. 3(a), each pixel generated to fill the periphery of the original image is assigned the value of the corresponding pixel shown hatched in FIG. 3(b).
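The two padding methods can be sketched as below. The function name `pad_image` and its `mode` argument are hypothetical, and the 3 × 3 image is an invented example chosen so that its mean is 45, matching the value quoted in the text; it does not reproduce the actual pixel values of FIG. 3(a).

```python
def pad_image(image, pad, mode="mean"):
    """Surround a 2-D image with `pad` generated pixels on every side.

    mode="mean":    every generated pixel gets the mean of the original image.
    mode="nearest": every generated pixel copies the closest original pixel.
    """
    height, width = len(image), len(image[0])
    mean = sum(sum(row) for row in image) / (height * width)
    padded = []
    for h in range(-pad, height + pad):
        row = []
        for w in range(-pad, width + pad):
            if 0 <= h < height and 0 <= w < width:
                row.append(image[h][w])  # original pixel, copied unchanged
            elif mode == "mean":
                row.append(mean)
            elif mode == "nearest":
                # Clamp the coordinates onto the original image.
                row.append(image[min(max(h, 0), height - 1)][min(max(w, 0), width - 1)])
            else:
                raise ValueError(f"unknown mode: {mode}")
        padded.append(row)
    return padded


# Hypothetical 3x3 image whose mean is 45 (not the values of FIG. 3(a)).
image = [[5, 15, 25], [35, 45, 55], [65, 75, 85]]
mean_padded = pad_image(image, pad=1, mode="mean")
nearest_padded = pad_image(image, pad=1, mode="nearest")
print(mean_padded[0])     # top border row filled with the mean value
print(nearest_padded[0])  # top border row copied from the nearest original pixels
```

For a 3 × 3 filter, one generated pixel on each side gives the expanded size (H + 2) × (W + 2), so the unpadded filter output returns to H × W.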

本実施形態に係る誤差計算器４０は、最適化に用いる誤差として、従来使われてきた画素ごとに独立して比較して求める誤差に加えて、複数の画素から求める誤差を用いて行う学習に用いられる。これにより、学習装置Ｓは、画素間の関係性についても考慮した学習が行える。
したがって、誤差計算器４０を、例えばカラー情報を出力するニューラルネットワークにおける学習に用いることで、学習装置Ｓは、訓練データのカラー情報と、推定されるカラー情報との間で画素値の勾配のような、周辺画素との関係性が同じになるよう学習することができる。これにより、学習装置Ｓの学習器６０から、例えば、高精度なカラー推定器やカラー情報拡大器を作成できる。その結果として、カラー推定器やカラー情報拡大器が、不自然な色むらの発生が低減されたカラー画像を生成することができる。
上記の技術により、４Ｋ／８Ｋ画像のような高解像度のモノクロ画像に対しても色むらが少ない自然な色付けが可能になる。 The error calculator 40 according to the present embodiment is used for learning in which the error used for optimization includes, in addition to the conventionally used error obtained by comparing each pixel independently, an error obtained from multiple pixels. As a result, the learning device S can perform learning that also takes the relationships between pixels into account.
Therefore, by using the error calculator 40 for learning in, for example, a neural network that outputs color information, the learning device S can learn so that relationships with surrounding pixels, such as the gradient of pixel values, become the same between the color information of the training data and the estimated color information. This makes it possible to create, for example, a highly accurate color estimator or color information magnifier from the learner 60 of the learning device S. As a result, the color estimator or color information magnifier can generate color images in which unnatural color unevenness is reduced.
With the above technique, even a high-resolution monochrome image such as a 4K / 8K image can be naturally colored with less color unevenness.

以上、本発明の実施形態について説明したが、本発明はこれに限定されず、その趣旨を変えない範囲で実施することができる。例えば、誤差計算器４０として説明したが、この装置の構成の処理を可能にするように、汎用または特殊なコンピュータ言語で記述した誤差計算プログラムとみなすことも可能である。 Although an embodiment of the present invention has been described above, the present invention is not limited to it and can be carried out within a range that does not change its gist. For example, although the description has been given in terms of the error calculator 40, it can also be regarded as an error calculation program written in a general-purpose or special-purpose computer language so as to enable the processing of the components of this device.

また、前記実施形態では、誤差計算器４０は、従来の誤差計算器と同様の構成として、第２誤差算出手段５１と、最小化手段５２と、を備えていることとしたが、第２誤差算出手段５１や最小化手段５２は、誤差計算器とは別体であってもよい。この場合には、例えば、誤差計算器の前段に第２誤差算出手段５１を設けたり、誤差計算器の後段に最小化手段５２を設けたりすることができる。 Further, in the above embodiment, the error calculator 40 includes the second error calculating means 51 and the minimizing means 52 as components shared with a conventional error calculator, but the second error calculating means 51 and the minimizing means 52 may be separate from the error calculator. In that case, for example, the second error calculating means 51 may be provided in the stage preceding the error calculator, or the minimizing means 52 may be provided in the stage following it.

また、学習装置Ｓは、誤差計算器４０と、例えば、カラー情報を推定する推定器を作成するために準備する学習器６０とによって、カラー情報推定学習装置を構成してもよい。
また、学習装置Ｓは、誤差計算器４０と、例えば、カラー情報を拡大する推定器を作成するために準備する学習器６０とによって、カラー情報拡大推定学習装置を構成してもよい。
さらに、学習装置Ｓは、誤差計算器４０と、例えば、低解像度画像から超解像画像等の高解像度画像を推定する推定器を作成するために準備する学習器６０とによって、超解像画像推定学習装置を構成してもよい。 Further, the learning device S may constitute a color information estimation learning device from the error calculator 40 and, for example, a learner 60 prepared for creating an estimator that estimates color information.
The learning device S may also constitute a color information enlargement estimation learning device from the error calculator 40 and, for example, a learner 60 prepared for creating an estimator that enlarges color information.
Furthermore, the learning device S may constitute a super-resolution image estimation learning device from the error calculator 40 and, for example, a learner 60 prepared for creating an estimator that estimates a high-resolution image, such as a super-resolution image, from a low-resolution image.

以下、本発明の実施形態に係る誤差計算器４０を学習に用いるカラー情報拡大器について詳細に説明する。
図４に示す自動色付け装置１は、モノクロ画像からカラー情報を推定することにより、モノクロ画像へ自動的に色付けするものであり、カラー情報拡大器１０を含んでいる。自動色付け装置１は、図４に示すように、主として、カラー情報推定器３と、情報合成器９と、を備えている。
この自動色付け装置１は、例えば一般的なコンピュータで構成され、ＧＰＵ（Graphics Processing Units）等の演算装置と、ＲＯＭ、ＲＡＭ、ＨＤＤや一般的な画像メモリと、入出力インタフェースと、を備えている。 Hereinafter, the color information magnifier that uses the error computer 40 according to the embodiment of the present invention for learning will be described in detail.
The automatic coloring device 1 shown in FIG. 4 automatically colors a monochrome image by estimating color information from the monochrome image, and includes a color information magnifier 10. As shown in FIG. 4, the automatic coloring device 1 mainly includes a color information estimator 3 and an information synthesizer 9.
The automatic coloring device 1 is composed of, for example, a general-purpose computer, and includes an arithmetic unit such as a GPU (Graphics Processing Unit), a ROM, a RAM, an HDD, general image memory, and an input/output interface.

カラー情報推定器３は、入力される高解像度モノクロ画像１０１から、低解像度モノクロ画像１０３および低解像度カラー情報１０５を生成して、これらの情報を用いて高解像度カラー情報１０７を推定するものである。
高解像度モノクロ画像１０１は、第１解像度のモノクロ画像である。この高解像度モノクロ画像１０１は、例えば、過去の白黒フィルムや写真からスキャンによりデジタル化したモノクロ画像である。
低解像度モノクロ画像１０３は、前記第１解像度よりも低い第２解像度のモノクロ画像である。
低解像度カラー情報１０５は、前記第２解像度のカラー情報である。
高解像度カラー情報１０７は、前記第１解像度のカラー情報である。 The color information estimator 3 generates a low-resolution monochrome image 103 and low-resolution color information 105 from the input high-resolution monochrome image 101, and uses them to estimate high-resolution color information 107.
The high-resolution monochrome image 101 is a first-resolution monochrome image. The high-resolution monochrome image 101 is, for example, a monochrome image digitized by scanning from a past black-and-white film or photograph.
The low-resolution monochrome image 103 is a monochrome image having a second resolution lower than the first resolution.
The low resolution color information 105 is the color information of the second resolution.
The high-resolution color information 107 is the color information of the first resolution.

第１解像度の値（高解像度の値）は、第２解像度の値（低解像度の値）に比較して大きければ特に限定されない。例えば、第２解像度の画像の大きさを２５６×２５６ピクセル、第１解像度の画像の大きさを５１２×５１２ピクセルとしてもよい。また、例えば、第２解像度の画像の大きさを４８０×２７０ピクセル、第１解像度の画像の大きさを４Ｋ（３８４０×２１６０）としてもよい。さらには、第１解像度の画像の大きさを８Ｋ（７６８０×４３２０）としても構わない。 The value of the first resolution (value of high resolution) is not particularly limited as long as it is larger than the value of the second resolution (value of low resolution). For example, the size of the image of the second resolution may be 256 × 256 pixels, and the size of the image of the first resolution may be 512 × 512 pixels. Further, for example, the size of the image of the second resolution may be 480 × 270 pixels, and the size of the image of the first resolution may be 4K (3840 × 2160). Further, the size of the first resolution image may be 8K (7680 × 4320).

カラー情報推定器３は、図４に示すように、縮小器５と、低解像度カラー情報推定器７と、カラー情報拡大器１０と、を備えている。 As shown in FIG. 4, the color information estimator 3 includes a reducer 5, a low-resolution color information estimator 7, and a color information magnifier 10.

縮小器５は、入力される高解像度モノクロ画像１０１を縮小する処理を行って低解像度モノクロ画像１０３を生成するものである。ここで、縮小とは解像度を低減、つまり画素数を減少させることをいう。縮小における縮小率が例えば０．５である場合、縮小画像の水平方向、垂直方向の画素数は、原画像の水平方向、垂直方向の画素数のそれぞれ１／２となる。縮小器５は、生成した低解像度モノクロ画像１０３を低解像度カラー情報推定器７に出力する。 The reducer 5 generates a low-resolution monochrome image 103 by performing a process of reducing the input high-resolution monochrome image 101. Here, reduction means reducing the resolution, that is, reducing the number of pixels. When the reduction ratio in the reduction is, for example, 0.5, the number of pixels in the horizontal direction and the vertical direction of the reduced image is 1/2 of the number of pixels in the horizontal direction and the vertical direction of the original image, respectively. The reducer 5 outputs the generated low-resolution monochrome image 103 to the low-resolution color information estimator 7.
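A reduction ratio of 0.5 can be illustrated with a small sketch. The text does not say which reduction algorithm the reducer 5 uses, so averaging each 2 × 2 block is an assumption made here for illustration, as are the function name and the sample values.

```python
def reduce_half(image):
    """Halve width and height by averaging each non-overlapping 2x2 block."""
    height, width = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * h][2 * w] + image[2 * h][2 * w + 1]
             + image[2 * h + 1][2 * w] + image[2 * h + 1][2 * w + 1]) / 4
            for w in range(width)
        ]
        for h in range(height)
    ]


# Hypothetical 4x4 monochrome image; the result has 2x2 pixels.
mono = [[0, 2, 4, 6], [2, 4, 6, 8], [10, 10, 20, 20], [10, 10, 20, 20]]
small = reduce_half(mono)
print(len(small), len(small[0]))
```

The pixel count in each direction is halved, so the total number of pixels falls to one quarter.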

低解像度カラー情報推定器７は、推定を行うための学習により予め決定されたパラメータ群を用いて、縮小器５により生成された低解像度モノクロ画像１０３から、低解像度のカラー情報（画像特徴量）を抽出する。これにより、低解像度カラー情報推定器７は、低解像度カラー情報１０５を推定する。なお、低解像度カラー情報推定器７を作成するための学習の流れは、従来技術と同様であるが、簡単な説明を後記する。 The low-resolution color information estimator 7 extracts low-resolution color information (an image feature amount) from the low-resolution monochrome image 103 generated by the reducer 5, using a parameter group determined in advance by learning for the estimation. In this way, the low-resolution color information estimator 7 estimates the low-resolution color information 105. The learning flow for creating the low-resolution color information estimator 7 is the same as in the prior art, and a brief description is given later.

カラー情報拡大器１０は、低解像度カラー情報推定器７により推定された低解像度カラー情報１０５と、縮小器５をバイパスして入力される高解像度モノクロ画像１０１と、を入力として、画像サイズが拡大されたカラー情報（高解像度カラー情報１０７）を推定する処理を行うものである。カラー情報拡大器１０は、低解像度カラー情報１０５を拡大する際に、高解像度モノクロ画像１０１（モノクロ情報）を用いて拡大する。そして、カラー情報拡大器１０は、推定した高解像度カラー情報１０７を情報合成器９に出力する。 The color information magnifier 10 expands the image size by inputting the low-resolution color information 105 estimated by the low-resolution color information estimator 7 and the high-resolution monochrome image 101 input by bypassing the reducer 5. The process of estimating the color information (high-resolution color information 107) is performed. The color information magnifier 10 enlarges the low-resolution color information 105 by using the high-resolution monochrome image 101 (monochrome information). Then, the color information magnifier 10 outputs the estimated high-resolution color information 107 to the information synthesizer 9.

情報合成器９は、カラー情報推定器３で推定された高解像度カラー情報１０７と、高解像度モノクロ画像１０１とを合成し、高解像度カラー画像１０９を作成する。情報合成器９は、１チャンネル（以下、１ｃｈと表記する場合もある）のモノクロ情報と、２チャンネル（２ｃｈ）のカラー情報とを単純に合成してカラー画像を生成する。 The information synthesizer 9 synthesizes the high-resolution color information 107 estimated by the color information estimator 3 and the high-resolution monochrome image 101 to create a high-resolution color image 109. The information synthesizer 9 simply combines monochrome information of one channel (hereinafter, may be referred to as 1ch) and color information of two channels (2ch) to generate a color image.
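The idea of one monochrome channel plus two color channels can be sketched with Python's standard `colorsys` module. This section does not name the color space, so the YIQ space used here is purely an assumption chosen because it is available in the standard library; the function names are hypothetical.

```python
import colorsys


def split_pixel(rgb):
    """Separate an RGB pixel (floats in [0, 1]) into 1ch luma and 2ch chroma."""
    y, i, q = colorsys.rgb_to_yiq(*rgb)
    return y, (i, q)


def merge_pixel(mono, color):
    """Recombine 1ch monochrome information with 2ch color information."""
    return colorsys.yiq_to_rgb(mono, *color)


def split_image(image):
    """Return (monochrome image, color information) for an RGB image."""
    mono = [[split_pixel(p)[0] for p in row] for row in image]
    color = [[split_pixel(p)[1] for p in row] for row in image]
    return mono, color


def merge_image(mono, color):
    """Simple per-pixel synthesis, in the spirit of the information synthesizer 9."""
    return [
        [merge_pixel(m, c) for m, c in zip(mono_row, color_row)]
        for mono_row, color_row in zip(mono, color)
    ]


# Hypothetical 1x2 color image: a bluish pixel and a mid-gray pixel.
image = [[(0.2, 0.4, 0.6), (0.5, 0.5, 0.5)]]
mono, color = split_image(image)
restored = merge_image(mono, color)
print(mono, color, restored)
```

The same split also describes how a learning color image can be separated into a learning monochrome image and true color information.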

次に、カラー情報拡大器１０の学習の流れについて、低解像度カラー情報推定器７の学習の流れと対比しながら説明する。 Next, the learning flow of the color information magnifier 10 will be described in comparison with the learning flow of the low-resolution color information estimator 7.

はじめに、低解像度カラー情報推定器７の学習の流れについて図５を参照して説明する。
低解像度カラー情報推定器７は、以下の手順により、予め用意した学習器から生成する。この学習器は、モノクロ画像を入力し、所定の計算処理を行うことによりカラー情報を推定して出力する。図５では、学習器を、学習が終わった状態の低解像度カラー情報推定器７として表記している。そして、大量の学習用のカラー画像を用意し、以下のステップＳ１〜ステップＳ４を十分な回数繰り返す。この学習器がこのパラメータを学習し、適切にパラメータを設定することにより精度の良いカラー情報推定器を作成できる。 First, the learning flow of the low-resolution color information estimator 7 will be described with reference to FIG.
The low-resolution color information estimator 7 is generated from a learner prepared in advance by the following procedure. This learner takes a monochrome image as input and estimates and outputs color information by performing a predetermined calculation. In FIG. 5, the learner is depicted as the low-resolution color information estimator 7 in its trained state. A large number of color images for learning are prepared, and steps S1 to S4 below are repeated a sufficient number of times. By having the learner learn its parameters so that they are set appropriately, an accurate color information estimator can be created.

（ステップＳ１）
学習用のカラー画像として低解像度カラー画像２０２を用意し、それを低解像度モノクロ画像２０３と真のカラー情報２０４とに分離する。
ここで、低解像度モノクロ画像２０３は、低解像度の学習用モノクロ画像である。
また、真のカラー情報２０４は、低解像度の学習用モノクロ画像と同じサイズの正解カラー情報であって、推定されるカラー情報との誤差計算に用いる。 (Step S1)
A low-resolution color image 202 is prepared as a color image for learning and is separated into a low-resolution monochrome image 203 and true color information 204.
Here, the low-resolution monochrome image 203 is a low-resolution learning monochrome image.
Further, the true color information 204 is correct color information having the same size as the low-resolution learning monochrome image, and is used for error calculation with the estimated color information.

（ステップＳ２）
次に、学習器（低解像度カラー情報推定器７）は、低解像度モノクロ画像２０３を入力し、現在のパラメータを用いた推定結果のカラー情報として、低解像度カラー情報２０５を出力する。 (Step S2)
Next, the learner (low-resolution color information estimator 7) inputs the low-resolution monochrome image 203, and outputs the low-resolution color information 205 as the color information of the estimation result using the current parameters.

（ステップＳ３）
次に、従来の誤差計算器８は、低解像度カラー情報２０５（推定カラー情報）と真のカラー情報２０４との誤差を計算する。この誤差としては、各画素値の平均二乗誤差などが用いられる。 (Step S3)
Next, the conventional error calculator 8 calculates the error between the low resolution color information 205 (estimated color information) and the true color information 204. As this error, the mean square error of each pixel value or the like is used.

（ステップＳ４）
また、従来の誤差計算器８は、計算して得られた誤差から、ＳＧＤなどの誤差勾配に基づく最適化手法を用いて、誤差が小さくなるように、学習器（低解像度カラー情報推定器７）のパラメータを調整し、調整されたパラメータを学習器に出力する。
なお、従来の誤差計算器８は、例えば、図１に示した第２誤差算出手段５１と最小化手段５２とで構成される。 (Step S4)
Further, from the calculated error, the conventional error calculator 8 adjusts the parameters of the learner (low-resolution color information estimator 7) so that the error becomes smaller, using an optimization method based on the error gradient such as SGD, and outputs the adjusted parameters to the learner.
The conventional error calculator 8 is composed of, for example, the second error calculating means 51 and the minimizing means 52 shown in FIG.
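The loop formed by steps S1 to S4 can be illustrated with a deliberately tiny stand-in for the learner. Everything below is an assumption for illustration: the "learner" has a single parameter `weight` and predicts color as `weight * mono`, the training pairs are invented, and the gradient is written out by hand rather than obtained from a neural network framework.

```python
import random


def mse(estimated, truth):
    """Mean squared error over independently compared pixels."""
    return sum((e - t) ** 2 for e, t in zip(estimated, truth)) / len(truth)


def train(samples, learning_rate=0.1, iterations=200, seed=0):
    """Repeat steps S2 to S4 on pre-separated (monochrome, true color) pairs."""
    rng = random.Random(seed)
    weight = 0.0  # the single parameter of the toy learner
    error = None
    for _ in range(iterations):
        mono, true_color = rng.choice(samples)     # a pair produced by step S1
        estimated = [weight * m for m in mono]     # step S2: estimate color
        error = mse(estimated, true_color)         # step S3: compute the error
        # Step S4: gradient of the MSE with respect to `weight`, then an SGD update.
        grad = sum(2 * (e - t) * m
                   for e, t, m in zip(estimated, true_color, mono)) / len(mono)
        weight -= learning_rate * grad
    return weight, error


# Hypothetical training data in which the true color is always 0.5 * monochrome.
samples = [([0.2, 0.4, 0.8], [0.1, 0.2, 0.4]), ([1.0, 0.5, 0.3], [0.5, 0.25, 0.15])]
weight, error = train(samples)
print(weight, error)
```

With this data the parameter approaches 0.5 and the error approaches zero, which mirrors how repeating the steps drives the learner toward an accurate estimator.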

次に、カラー情報拡大器１０の学習の流れについて図６を参照して説明する。
カラー情報拡大器１０は、以下の手順により、予め用意した学習器から生成する。この学習器は、高解像度モノクロ画像３０１および低解像度カラー情報３０５を入力し、所定の計算処理を行うことにより高解像度カラー情報３０７を推定して出力する。図６では、学習器を、学習が終わった状態のカラー情報拡大器１０として表記している。そして、大量の学習用のカラー画像を用意し、以下のステップＳ１０〜ステップＳ１４を十分な回数繰り返す。この学習器がこのパラメータを学習し、適切にパラメータを設定することにより精度の良いカラー情報拡大器を作成できる。 Next, the learning flow of the color information magnifier 10 will be described with reference to FIG.
The color information magnifier 10 is generated from a learner prepared in advance by the following procedure. This learner takes the high-resolution monochrome image 301 and the low-resolution color information 305 as input, and estimates and outputs the high-resolution color information 307 by performing a predetermined calculation. In FIG. 6, the learner is depicted as the color information magnifier 10 in its trained state. A large number of color images for learning are prepared, and steps S10 to S14 below are repeated a sufficient number of times. By having the learner learn its parameters so that they are set appropriately, an accurate color information magnifier can be created.

（ステップＳ１０）
学習用のカラー画像として高解像度カラー画像３０９を用意し、それを縮小器５によって単純に縮小して低解像度カラー情報３０５とする。
ここで、高解像度カラー画像３０９としては、古い白黒フィルムをカラー化したものも使用する。この場合、例えば、過去の白黒フィルムや写真からスキャンによりデジタル化したモノクロ画像に対して、人手で色付けしたデジタルデータとする。また、学習用の高解像度カラー画像３０９を大量に準備するために、古い白黒フィルム以外に、カラー撮影された新しい４Ｋ等のカラー画像を用いてもよい。 (Step S10)
A high-resolution color image 309 is prepared as a color image for learning, and the high-resolution color image 309 is simply reduced by the reducer 5 to obtain low-resolution color information 305.
Here, as the high-resolution color image 309, a colorized version of an old black-and-white film is also used. In this case, for example, digital data obtained by manually coloring a monochrome image digitized by scanning from a past black-and-white film or photograph is used. Further, in order to prepare a large amount of high-resolution color images 309 for learning, a new color image such as 4K taken in color may be used in addition to the old black-and-white film.

（ステップＳ１１）
次に、高解像度カラー画像３０９を、高解像度モノクロ画像３０１と高解像度カラー情報（真のカラー情報）３０４とに分離する。
ここで、高解像度モノクロ画像３０１は、高解像度の学習用モノクロ画像である。
また、高解像度カラー情報３０４は、高解像度の学習用モノクロ画像と同じサイズの正解カラー情報であって、推定される高解像度カラー情報との誤差計算に用いる。 (Step S11)
Next, the high-resolution color image 309 is separated into a high-resolution monochrome image 301 and a high-resolution color information (true color information) 304.
Here, the high-resolution monochrome image 301 is a high-resolution learning monochrome image.
Further, the high-resolution color information 304 is correct color information having the same size as the high-resolution learning monochrome image, and is used for error calculation with the estimated high-resolution color information.

（ステップＳ１２）
次に、学習器（カラー情報拡大器１０）は、高解像度モノクロ画像３０１を入力し、現在のパラメータを用いた推定結果のカラー情報として、高解像度カラー情報３０７を出力する。 (Step S12)
Next, the learning device (color information magnifier 10) inputs the high-resolution monochrome image 301, and outputs the high-resolution color information 307 as the color information of the estimation result using the current parameters.

（ステップＳ１３）
次に、本発明の実施形態に係る誤差計算器４０は、高解像度カラー情報３０７（推定カラー情報）と高解像度カラー情報（真のカラー情報）３０４との誤差（合成誤差）を計算する。この誤差としては、前記したカラー情報間の誤差と、勾配マップ間の誤差とを、を合成した合成誤差４０５（図１参照）を用いる。 (Step S13)
Next, the error calculator 40 according to the embodiment of the present invention calculates an error (composite error) between the high-resolution color information 307 (estimated color information) and the high-resolution color information (true color information) 304. As this error, a combined error 405 (see FIG. 1), which is a combination of the above-mentioned error between color information and the error between gradient maps, is used.

（ステップＳ１４）
また、誤差計算器４０は、計算して得られた誤差（合成誤差４０５）から、ＳＧＤなどの誤差勾配に基づく最適化手法を用いて、誤差が小さくなるように、学習器（カラー情報拡大器１０）のパラメータを調整し、調整されたパラメータを学習器に出力する。なお、誤差計算器４０は、学習のときに付加されるが、学習後には接続を解除する。 (Step S14)
Further, from the calculated error (composite error 405), the error calculator 40 adjusts the parameters of the learner (color information magnifier 10) so that the error becomes smaller, using an optimization method based on the error gradient such as SGD, and outputs the adjusted parameters to the learner. The error calculator 40 is attached during learning and is disconnected after learning.

次に、カラー情報拡大器１０の詳細な構成について図７を参照して説明する。
カラー情報拡大器１０は、図７に示すように、サイズ拡大手段２１と、合成手段２２ａと、高解像度カラー情報推定手段２３と、を備えている。なお、図７のカラー情報拡大器１０は、特徴抽出手段３１，３２，３３を備える形態で図示したが、例えば、すべての特徴抽出手段を省略した構成とすることもできる。なお、以下では、特徴抽出手段について、便宜的に第１の特徴抽出手段３１、第２の特徴抽出手段３２、および第３の特徴抽出手段３３のように呼称する場合もある。 Next, the detailed configuration of the color information magnifier 10 will be described with reference to FIG. 7.
As shown in FIG. 7, the color information magnifier 10 includes a size enlarging means 21, a synthesizing means 22a, and a high-resolution color information estimating means 23. Although the color information magnifier 10 of FIG. 7 is shown in a form including the feature extraction means 31, 32, 33, for example, all the feature extraction means may be omitted. In the following, the feature extraction means may be referred to as the first feature extraction means 31, the second feature extraction means 32, and the third feature extraction means 33 for convenience.

カラー情報拡大器１０は、例えばニューラルネットワークにより構成できる。また、ニューラルネットワークは、例えばＣＮＮ（Convolutional Neural Network）であってもよい。ＣＮＮでは、隠れ層（hidden layer）に、Convolution層（畳み込み層）や、Deconvolution層（逆畳み込み層、または、Transposed Convolution 層）を用いる。よって、ＣＮＮを採用した場合、カラー情報拡大器１０は、各構成要素を、Convolution層またはDeconvolution層を用いて実装可能であり、ＧＰＵを用いて高速に計算できる。 The color information magnifier 10 can be configured by, for example, a neural network. The neural network may be, for example, a CNN (Convolutional Neural Network). A CNN uses convolution layers and deconvolution layers (also called transposed convolution layers) as hidden layers. Therefore, when a CNN is adopted, each component of the color information magnifier 10 can be implemented using convolution layers or deconvolution layers and can be computed at high speed on a GPU.

サイズ拡大手段２１は、入力される低解像度の画像特徴量を拡大する処理を行って高解像度の画像特徴量を生成するものである。ここで、低解像度の画像特徴量とは、例えば、低解像度カラー情報１０５のことをいう。なお、図７に示すように、カラー情報拡大器１０が第２の特徴抽出手段３２を備える場合には、第２の特徴抽出手段３２が低解像度カラー情報１０５から抽出した画像特徴量が低解像度の画像特徴量となる。このサイズ拡大手段２１は、生成した高解像度の画像特徴量を合成手段２２ａに出力する。 The size enlargement means 21 generates a high-resolution image feature amount by performing a process of enlarging the input low-resolution image feature amount. Here, the low-resolution image feature amount means, for example, low-resolution color information 105. As shown in FIG. 7, when the color information magnifier 10 includes the second feature extraction means 32, the image feature amount extracted from the low resolution color information 105 by the second feature extraction means 32 has a low resolution. It becomes the image feature amount of. The size enlargement means 21 outputs the generated high-resolution image feature amount to the synthesis means 22a.

The size enlargement means 21 may be, for example, a Deconvolution layer (an image enlargement layer using a neural network). Alternatively, it may use fixed parameters taken from a general image enlargement algorithm, such as nearest-neighbor interpolation or bilinear interpolation.
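As an illustration of the fixed-parameter option, the following is a minimal sketch of nearest-neighbor enlargement of a single channel by an integer factor. The function name `upscale_nearest` and the tiny input are assumptions made for illustration; the patent does not specify an implementation.

```python
def upscale_nearest(channel, factor):
    """Enlarge a 2-D list of values by an integer factor (nearest neighbor).

    Hypothetical helper: each input value is repeated `factor` times
    horizontally, and each resulting row is repeated `factor` times vertically.
    """
    out = []
    for row in channel:
        wide = [value for value in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(wide))
    return out


# One low-resolution channel of 2 x 2 values, enlarged 2x in each direction.
low_res = [[1, 2],
           [3, 4]]
high_res = upscale_nearest(low_res, 2)
```

With a factor of 2, a 480 × 270 channel would become 960 × 540 in the same way.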

The synthesis means 22a synthesizes, for example, the input high-resolution monochrome image 101 and the high-resolution image feature amount generated by the size enlargement means 21. When the color information magnifier 10 includes the first feature extraction means 31, as shown in FIG. 7, the synthesis means 22a synthesizes the image feature amount extracted from the high-resolution monochrome image 101 and the high-resolution image feature amount generated by the size enlargement means 21. The synthesis means 22a outputs the synthesized high-resolution image feature amount to the high-resolution color information estimation means 23.
The synthesis means 22a simply combines 1ch of monochrome information with 2ch of color information of the same size to generate a high-resolution image feature amount. The synthesis means 22a may be, for example, a Convolution layer of a neural network.

The high-resolution color information estimation means 23 extracts image feature amounts from the high-resolution image feature amount synthesized by the synthesis means 22a, using a parameter group determined in advance by learning to estimate high-resolution color information, and thereby estimates the high-resolution color information 107. Here, learning means the learning performed to create the color information magnifier 10.

The high-resolution color information 107 corresponds to an enlarged version of the low-resolution color information 105 and has a resolution corresponding to that of the high-resolution monochrome image 101. The high-resolution color information 107 is color information for each channel of the color space and refers to, for example, the image feature amounts for the two channels other than the luminance channel.

The high-resolution color information estimation means 23 converts the image feature amounts for a plurality of (three or more) output channels, corresponding to the plurality of (three or more) outputs from the preceding stage, into image feature amounts for two channels of the color space, and thereby estimates the color information.
The high-resolution color information estimation means 23 may be, for example, a Convolution layer of a neural network. There may also be a plurality of Convolution layers (hidden layers); that is, convolution may be applied repeatedly in succession.
The number of output channels from the stage preceding the high-resolution color information estimation means 23 can be set to any desired value. For example, the number of output channels from the synthesis means 22a may be 3ch or more.

As shown in FIG. 7, the color information magnifier 10 may include at least one of the first feature extraction means 31, the second feature extraction means 32, and the third feature extraction means 33.

The first feature extraction means 31 extracts a high-resolution image feature amount from the high-resolution monochrome image 101 using a parameter group determined in advance by learning, and outputs the extracted high-resolution image feature amount to the synthesis means 22a. Here, learning means the learning performed to create the color information magnifier 10. The first feature extraction means 31 converts the 1ch monochrome information input to it into a high-resolution image feature amount for each of its output channels.

The second feature extraction means 32 extracts a low-resolution image feature amount from the low-resolution color information 105 using a parameter group determined in advance by learning, and outputs the extracted low-resolution image feature amount to the size enlargement means 21. The second feature extraction means 32 converts the 2ch color information input to it into a low-resolution image feature amount for each of its output channels.

The third feature extraction means 33 extracts a high-resolution image feature amount from the high-resolution image feature amount generated by the synthesis means 22a, using a parameter group determined in advance by learning, and outputs the extracted high-resolution image feature amount to the high-resolution color information estimation means 23. The third feature extraction means 33 converts the image feature amounts for the plurality of output channels (for example, 3ch) corresponding to the plurality of outputs from the synthesis means 22a into a high-resolution image feature amount for each of its own output channels. The number of output channels of the third feature extraction means 33 is set to, for example, 64ch, 128ch, or 256ch.

Each of the feature extraction means 31 to 33 may be, for example, a Convolution layer of a neural network. There may also be a plurality of Convolution layers (hidden layers). The number of output channels from each feature extraction means can be set to any desired value. In this specification, the image feature amount obtained by convolving, for each output channel, the image feature amount input to a feature extraction means or the like is referred to as a feature obtained from the input. Likewise, converting the input image feature amount by applying convolution to multi-channel input information supplied to a feature extraction means or the like is referred to as extracting a feature.

In FIG. 7, the third feature extraction means 33 is shown separately from the high-resolution color information estimation means 23, but the high-resolution color information estimation means 23 may include the third feature extraction means 33 internally. Before the high-resolution color information estimation means 23 extracts the image feature amount for each channel of the color space, the third feature extraction means 33 generates high-resolution image feature amounts for a plurality of channels (for example, 64ch) from the high-resolution image feature amount produced by the size enlargement means 21 and the synthesis means 22a. In doing so, it uses a parameter group different from the one used to output the image feature amounts for the two channels of the color space.

Because the color information magnifier 10 explicitly uses the high-resolution monochrome image 101 (monochrome information), it can reduce blurring of the estimated color information and enlarge the low-resolution color information 105 accurately. The color information magnifier 10 can be incorporated into a color information estimator 3 that estimates the color information used to automatically color a high-resolution monochrome image 101 of, for example, 4K or 8K. The color information estimator 3 can in turn estimate that color information more accurately.

Digital data of a high-resolution monochrome image can be obtained by, for example, scanning physical film, but conventional coloring techniques cannot color such a high-resolution monochrome image directly. In contrast, the automatic coloring device 1 provided with the color information estimator 3 enables natural coloring of high-resolution monochrome images such as 4K images.

Further, even when data of a monochrome image scanned from a photograph or physical film exists but the photograph or film itself has been lost, the automatic coloring device 1 provided with the color information estimator 3 can estimate the original color information and color the monochrome image.

Further, in a color-photographed image from which the low-resolution color information 105 is derived, a region whose boundary is clear on the monochrome information channel (the luminance channel of the color space) often also has a clear boundary on the color information channels (for example, the two channels other than the luminance channel). Here, a boundary is a portion represented by a line, such as the outline of an object (the border between the object and its background).
Therefore, enlarging the low-resolution color information 105 using the high-resolution monochrome image 101, as the color information magnifier 10 does, reduces blurring of the color information particularly in regions whose boundaries are clear on the high-resolution monochrome information channel (the high-resolution monochrome image 101).

Although the color information has been described above as the two channels of a color space other than the luminance channel, other forms of color information can also be handled. As one example, all three channels of the RGB color space may be used as the color information.

Further, the format of the color information input to the color information magnifier or the color information estimator need not match the format of the color information it outputs. As one example, when the L channel of the Lab color space is input to the color information magnifier 10 as the high-resolution monochrome image 101 and the ab channels of the Lab color space are input as the low-resolution color information 105, the RGB channels of the RGB color space can be output as the high-resolution color information 107.

Further, the color information magnifier 10 is not limited to learning with a neural network and can also be configured using other machine learning techniques. The error calculator 40 may be used not only for the learning of the color information magnifier 10 but also for the learning of the low-resolution color information estimator 7.

To confirm the performance of the error calculator according to the embodiment, an experiment was conducted in which color information was enlarged by a color information magnifier trained using the error calculator 40. FIG. 8 is an explanatory diagram schematically showing the color information magnifier used in the experiment. As shown in FIG. 8, the color information magnifier used in the experiment includes the size enlargement means 21, the synthesis means 22a, the third feature extraction means 33, and the high-resolution color information estimation means 23. In this color information magnifier, components identical to those of the color information magnifier 10 shown in FIG. 7 are given the same reference numerals, and their description is omitted.

The synthesis means 22a synthesizes the high-resolution monochrome image 101 and the high-resolution image information generated by the size enlargement means 21 to generate high-resolution synthesized image information.

The high-resolution monochrome image 101 is 1ch of monochrome information (an image feature amount) corresponding to the L channel of the Lab color space. In FIG. 8, it is shown schematically as one image.
In the experiment, the high-resolution monochrome image 101 was a 960 × 540 pixel image. When the pixel values of the high-resolution monochrome image 101 are expressed as a vector, they are generally expressed by equation (6). The vector x₁ of equation (6) has 518,400 components, equal to the number of pixels of the high-resolution monochrome image 101.

The low-resolution color information 105 is 2ch of color information (image feature amounts) corresponding to the ab channels of the Lab color space. In FIG. 8, it is shown schematically as two small images.
In the experiment, the resolution of the low-resolution color information 105 was 480 × 270 pixels, and the enlargement ratio of the size enlargement means 21 was 2 (2× vertically and 2× horizontally). In FIG. 8, the result is shown schematically as two enlarged images.

When the pixel values of these two enlarged color channels are each expressed as a vector, they are generally expressed by equations (7) and (8). The vectors x₂ and x₃ each have the same number of components as the vector x₁ of equation (6).

The synthesis means 22a takes the vectors x₁, x₂, and x₃ as inputs and arranges their components so that they correspond to each pixel, yielding 3ch of information. In FIG. 8, this is shown schematically as three images. At this point, memory for, for example, 3 × 960 × 540 per-pixel feature amounts is required.
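The arrangement described above can be sketched as pairing up the components of the three vectors pixel by pixel. This is only an illustrative sketch; `stack_channels` and the sample values are hypothetical, and the arithmetic at the end simply evaluates the element counts stated in the text.

```python
WIDTH, HEIGHT = 960, 540
PIXELS = WIDTH * HEIGHT  # number of components of x1, x2, and x3


def stack_channels(x1, x2, x3):
    """Arrange three equal-length vectors so each pixel holds (x1, x2, x3)."""
    if not (len(x1) == len(x2) == len(x3)):
        raise ValueError("all channels must have the same number of pixels")
    return list(zip(x1, x2, x3))


# Tiny two-pixel example: one L value and two color values per pixel.
stacked = stack_channels([50, 60], [1, 2], [-1, -2])

# Number of per-pixel feature amounts held after synthesis (3ch).
values_after_synthesis = 3 * PIXELS
```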

The third feature extraction means 33 is a neural network that performs convolution. In this experiment, 20 Convolution layers were constructed.
Each Convolution layer extracts N features as its output; that is, the number of output channels is N. In this experiment, Nch = 64ch.
In FIG. 8, only three Convolution layers are shown and the rest are omitted. Likewise, only 12 of the 64 channels are drawn as Nch and the rest are omitted.

The first Convolution layer (first pass) has 3 input channels (the three channels of the color space), and the convolution expressed by equation (9) was performed for each of the 64 output channels of this first layer.

In equation (9), ω_i is a weight vector. The weight vector ω_i is a learning parameter determined by error calculation: during learning of this color information magnifier, ω_i is updated using the error. The weight vector ω_i is a one-dimensional vector of many variables and has the same number of components as the number of pixels of the input high-resolution monochrome image 101. b is a bias. The vectors x₁, x₂, and x₃, corresponding to i = 1, 2, 3, are defined by equations (6) to (8).
At this point, memory for, for example, 64 × 960 × 540 per-pixel feature amounts is required.
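Equation (9) itself is not reproduced in this text. As a rough sketch only, the following assumes the form used by typical CNN Convolution layers: for one output channel, a 3 × 3 kernel is applied to each input channel, the results are summed over the input channels, and a bias is added, with zero padding of 1 and stride 1. The function name `conv2d_single_output` and the sample values are hypothetical, and the exact form of equation (9) may differ.

```python
def conv2d_single_output(channels, kernels, bias, padding=1):
    """Compute one output channel from several input channels (stride 1).

    channels: list of 2-D lists, all the same size (one per input channel).
    kernels:  one k x k kernel per input channel (assumed weight layout).
    """
    height, width = len(channels[0]), len(channels[0][0])
    k = len(kernels[0])
    out_h = height + 2 * padding - k + 1
    out_w = width + 2 * padding - k + 1
    out = [[bias] * out_w for _ in range(out_h)]
    for x, w in zip(channels, kernels):
        for r in range(out_h):
            for c in range(out_w):
                for dr in range(k):
                    for dc in range(k):
                        rr, cc = r + dr - padding, c + dc - padding
                        # Positions outside the image count as zero padding.
                        if 0 <= rr < height and 0 <= cc < width:
                            out[r][c] += w[dr][dc] * x[rr][cc]
    return out


# Three 2 x 2 input channels of ones; each kernel passes the center pixel through.
ones = [[1.0, 1.0], [1.0, 1.0]]
center = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
result = conv2d_single_output([ones, ones, ones], [center] * 3, bias=0.5)
```

In the first layer this computation would be repeated 64 times, once per output channel, each time with its own weights and bias.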

The second Convolution layer (second pass) has 64 input channels (the 64 output channels of the preceding first layer), and the convolution expressed by equation (10) was performed for each of the 64 output channels of the second layer.

Equation (10) has the same form as equation (9). Here, x₁ to x₆₄, corresponding to i = 1 to 64, denote the information of each of the 64 output channels of the preceding first layer. They can be defined in the same way as in equations (6) to (8), so the details are omitted.

Similarly, the 3rd to 19th Convolution layers (3rd to 19th passes) each have 64 input channels (the 64 output channels of the previous layer), and the convolution expressed by equation (10) was performed for each of their 64 output channels. In the 3rd to 19th layers as well, x₁ to x₆₄, corresponding to i = 1 to 64, denote the image feature amounts of the 64 output channels of the respective previous layer.

The high-resolution color information estimation means 23 is also a Convolution layer. As its output, it extracts features corresponding to two channels of the color space; that is, it has 2 output channels.
This Convolution layer (the high-resolution color information estimation means 23) has 64 input channels (the 64 output channels of the previous layer), and the convolution expressed by equation (10) was performed for each of the two channels of the color space.

The ω_i in equation (9) and the ω_i in equation (10) are different from each other. The ω_i also differs for each output channel. Furthermore, a different weight vector ω_i was used for each of the 20 Convolution layers described above.

In the experiment, all 1282 convolutions (= 64 + 64 × 19 + 2) were performed, as an example, under the same conditions listed below while changing the weight vector ω_i.
Kernel: 3
Padding: 1
Stride: 1
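With these settings, each convolution keeps the spatial size of its input, which can be checked with the standard output-size relation floor((n + 2p − k) / s) + 1. The sketch below is illustrative only; `conv_output_size` is a hypothetical helper, not part of the patent.

```python
def conv_output_size(size, kernel=3, padding=1, stride=1):
    """Output length along one axis for a convolution with the given settings."""
    return (size + 2 * padding - kernel) // stride + 1


# A 960 x 540 input stays 960 x 540 after every layer.
out_width = conv_output_size(960)
out_height = conv_output_size(540)

# Number of per-output-channel convolutions stated in the text.
convolutions = 64 + 64 * 19 + 2
```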

Therefore, the total number of components of the weight vectors used in the experiment is given by equation (11).
3 × 3 × (3 × 64 + 64 × 64 × 19 + 64 × 2) … Equation (11)
The total number of bias terms equals the number of convolutions, 1282. The sum of the two is the total number of parameters.
That is, in the color information magnifier used in the experiment, the number of parameters determined in advance by learning is 703296 + 1282 = 704578.
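The count above can be reproduced with a few lines of arithmetic. The layer list below follows the terms of equation (11) (one 3→64 layer, nineteen 64→64 layers, and one 64→2 layer); the variable names are chosen here for illustration.

```python
KERNEL = 3  # 3 x 3 kernels throughout

# (input channels, output channels) for each convolution layer in equation (11).
layers = [(3, 64)] + [(64, 64)] * 19 + [(64, 2)]

# Each output channel has one KERNEL x KERNEL kernel per input channel.
weights = sum(KERNEL * KERNEL * c_in * c_out for c_in, c_out in layers)

# One bias per output channel, i.e. one per convolution.
biases = sum(c_out for _, c_out in layers)

total_parameters = weights + biases
```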

As shown in FIG. 4, the high-resolution color information 107 obtained by the above processing was combined with the original high-resolution monochrome image 101 to create a high-resolution color image 109 (hereinafter, Example 1).
In addition, color information enlarged by a prior-art method was combined with the original high-resolution monochrome image 101 to create a high-resolution color image (hereinafter, Comparative Example 1).
Visual inspection confirmed that color blur was reduced in Example 1 compared with Comparative Example 1.
It was also confirmed that the error calculated by equation (1) decreased by about 17% on average, from 7.66 (Comparative Example 1) to 6.53 (Example 1).

The error calculator according to the present embodiment can be used for learning when creating a color information magnifier or a color information estimator.

1 Automatic coloring device
3 Color information estimator
5 Reducer
7 Low-resolution color information estimator
9 Information synthesizer
10 Color information magnifier
21 Size enlargement means
22a Synthesis means
23 High-resolution color information estimation means
31 to 33 Feature extraction means
40 Error calculator
41 Feature amount map creation means
42 First error calculation means
43 Error synthesis means
51 Second error calculation means
52 Minimization means
60 Estimator
S Learning device

Claims

An error calculator that calculates the error between a first image feature amount, which is estimated color information, and a second image feature amount, which is true color information, the error calculator comprising:
a feature amount map creation means for extracting, by a predetermined calculation, feature amounts that characterize the relationship among a plurality of pixels in an image from each of the first image feature amount and the second image feature amount, thereby creating a first feature amount map and a second feature amount map, respectively;
a first error calculation means for calculating an error between the feature amount maps based on the errors of pixel values between corresponding pixels of the first feature amount map and the second feature amount map; and
an error synthesis means that receives an error between the image feature amounts, calculated based on the errors of pixel values between corresponding pixels of the first image feature amount and the second image feature amount, and generates a synthesis error by adding the error between the image feature amounts and the error between the feature amount maps.

The error calculator according to claim 1, wherein the feature amount map creation means filters the first image feature amount or the second image feature amount with a predetermined filter, thereby extracting feature amounts that characterize the relationship among a plurality of adjacent pixels, to create the first feature amount map or the second feature amount map.

The error calculator according to claim 2, wherein the feature amount map creation means creates an edge map as the first feature amount map or the second feature amount map from the first image feature amount or the second image feature amount, using an edge filter that detects edges included in an image.

The error calculator according to any one of claims 1 to 3, further comprising a second error calculation means for calculating the error between the image feature amounts based on the errors of pixel values between corresponding pixels of the first image feature amount and the second image feature amount, and outputting it to the error synthesis means.

The error calculator according to claim 4, wherein the first image feature amount is estimated color information that a learner outputs, according to its internal parameters, for an input value supplied as training data, and
the second image feature amount is true color information prepared as training data for the learner.

The error calculator according to claim 5, wherein sets of the first image feature amount and the second image feature amount are sequentially input to the feature amount map creation means and the second error calculation means, the error calculator further comprising a minimization means for adjusting, by a predetermined calculation, the parameters of the learner so that the synthesis error for each input set of image feature amounts is reduced, and supplying the adjusted parameters to the learner as update parameters.

An error calculation program for operating a computer as an error calculator according to any one of claims 1 to 6.
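The structure recited in claims 1 to 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the Laplacian kernel is one possible edge filter, mean squared error is one possible pixel-wise error, and the names `edge_map`, `mean_squared_error`, and `synthesis_error` are hypothetical. Only the overall flow (create feature amount maps, compute the map error, add it to the image error) follows the claims.

```python
# Assumed edge filter: a 3 x 3 Laplacian kernel.
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def edge_map(image):
    """Feature amount map creation: filter one channel with zero padding."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            for dr in range(3):
                for dc in range(3):
                    rr, cc = r + dr - 1, c + dc - 1
                    if 0 <= rr < h and 0 <= cc < w:
                        out[r][c] += LAPLACIAN[dr][dc] * image[rr][cc]
    return out


def mean_squared_error(a, b):
    """Assumed pixel-wise error between two equally sized channels."""
    n = len(a) * len(a[0])
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n


def synthesis_error(estimated, truth):
    """Add the image error and the feature-amount-map error."""
    image_error = mean_squared_error(estimated, truth)                  # second error
    map_error = mean_squared_error(edge_map(estimated), edge_map(truth))  # first error
    return image_error + map_error


estimated = [[0.0, 0.0], [0.0, 0.0]]
truth = [[1.0, 0.0], [0.0, 0.0]]
error = synthesis_error(estimated, truth)
```

Because the map error compares edge responses rather than raw values, an estimate that blurs a boundary is penalized beyond its plain per-pixel difference.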