CN116402724A

CN116402724A - RYB format RAW image color restoration method

Info

Publication number: CN116402724A
Application number: CN202310671470.XA
Authority: CN
Inventors: 周康; 王彬; 程银; 熊傲然; 杨元浩; 方昊栋
Original assignee: Jiangsu Daoyuan Technology Group Co ltd; Jiangsu Peregrine Microelectronics Co ltd
Current assignee: Jiangsu Daoyuan Technology Group Co ltd; Jiangsu Peregrine Microelectronics Co ltd
Priority date: 2023-06-08
Filing date: 2023-06-08
Publication date: 2023-07-07
Anticipated expiration: 2043-06-08
Also published as: CN116402724B

Abstract

The invention discloses a RAW image color restoration method of RYB format, which comprises a neural network with a two-stage structure, wherein the first-stage structure comprises ten layers of structures which are sequentially connected, and the second-stage structure is a recovery layer for restoring the output of the first-stage structure. According to the invention, the RAW graph in RYB format is processed through the neural network, the neural network is a pixel-level processed neural network, and each pixel value of the image can be modified and transformed in the network application process; by the network training method provided by the invention, the trained neural network can successfully recover the RYB format RAW image into a full-color RGB image.

Description

RYB format RAW image color restoration method

Technical Field

The invention relates to an image processing method, in particular to an image color restoration method.

Background

Currently, most of the mainstream CMOS image sensors output RAW data in Bayer format. The implementation principle is that a layer of optical filter is covered on the surface of the image sensor, so that a certain pixel point only receives light of one color, and the pixel value obtained by the image sensor is the pixel value of the single color. The bayer filter array is arranged using RGB three colors, typically in a set of 2x2 matrices, the number of colors being R: g: b=1: 2:1, each element represents a pixel, so that each pixel point has only one color, and common arrangement forms are RGGB, BGGR, GRBG and the like. If the green filter in the 2x2 matrix is replaced by a yellow filter and is in RYYB arrangement, the spectrum response of Y (yellow) in the sensor is wider, the spectrum coverage is wider, more photons can be sensed, the total light inlet quantity can be improved by about 30-40%, the signal-to-noise ratio of Luminance is obviously improved in a dark light scene, and the night shooting effect is better.

However, the conventional RGB three-color-based algorithm such as the Demosaic algorithm, the AWB algorithm, etc. cannot accurately restore the image colors to the RAW image arranged in RYYB.

Disclosure of Invention

The invention aims to: in view of the above prior art, a RYB format RAW image color reproduction method is proposed to perform color reproduction on a RYB format RAW image captured by an image sensor, and output an RGB format image.

The technical scheme is as follows: a RAW image color reproduction method of RYB format, comprising: inputting RAW images arranged in RYYB into a neural network for color reduction processing, and outputting RGB images;

the neural network comprises a two-stage structure, and the first-stage structure comprises a ten-layer structure which is sequentially connected: the first layer is a preprocessing layer, R, Y, Y, B pixels in the image are respectively extracted into independent channels through convolution, the input size of the first layer is 448 x 3, and the output size of the first layer is 448 x 4; the second to fifth layers are all convolution layers, each layer is formed by linking Conv layers, BN layers and ReLU layers, and the output sizes of the second to fifth layers are respectively as follows: 112 x 24, 56 x 96, 28 x 192, 14 x 384; the sixth to tenth layers are deconvolution layers, each layer is formed by linking an Upsample layer, a BN layer and a ReLU layer, and the output sizes of the sixth to tenth layers are respectively as follows: 14 x 384, 28 x 192, 56 x 96, 112 x 24, 448 x 4; wherein, the output of the seventh layer, the eighth layer and the ninth layer respectively correspond to the output of the fourth layer, the third layer and the second layer for residual operation;

the second stage structure of the neural network is a recovery layer for recovering the output of the first stage structure, the recovery layer is composed of an Upsample layer and 3 Conv layer links with the size of 1*1, and the final output size is 448×448×3.

Further, when training the neural network, marking the first-stage structure as a backbond, and constructing a structure Tophalf parallel to the first-stage structure, wherein the Tophalf is formed by the same structure as the first to fifth layers in the first-stage structure;

training the first stage structure first includes:

step 1: the method comprises the steps that an image sensor which outputs RAW images arranged in RGGB and RAW images arranged in RYYB are adopted to respectively shoot a plurality of scenes in the same mode, and the obtained images are respectively recorded into an image set P1 and an image set P2, so that a training set is formed;

step 2: inputting the images in the training set into the first-stage structure for training;

in the training process, inputting the images in the image set P1 into the Tophalf, inputting the corresponding images in the image set P2 into the backbox, and then respectively calculating the difference Dis of the output images of the sixth layer, the seventh layer, the eighth layer and the ninth layer in the backbox relative to the second layer, the third layer, the fourth layer and the fifth layer in the Tophalf _n ：

In the Backbone _i,j Outputting pixel values at the ith row and the jth column of the image for the corresponding layer in the backlight, tophalf _i,j Outputting pixel values at the ith row and the jth column of the image for the corresponding layer in the Tophalf; wherein n is 6,7,8 and 9, which respectively correspond to a sixth layer, a seventh layer, an eighth layer and a ninth layer in the back bone;

will make each degree of difference Dis _n The values are added according to the weights to obtain a Loss value Loss _Dis ：

Loss _Dis =0.4Dis ₆ +0.6Dis ₇ +0.8Dis ₈ +1.0Dis ₉

Performing back propagation according to the loss value to optimize network parameters of the first-stage structure;

step 3: training the second stage structure after completing the training of the first stage structure, comprising: inputting the images in the image set P2 into the trained first stage structure, inputting the images with the final output size of 448 x 4 of the first stage structure into the second stage structure, and then obtaining the loss value LossR between the images with the final output size of 448 x 3 of the second stage structure and the corresponding images in the image set P1:

wherein P1 _i,j For the pixel value at the ith row and j column of the corresponding image in the image set P1, a discover _i,j Final outputting pixel values at row i and column j of the image for the second stage structure;

and carrying out back propagation according to the loss value LossR to optimize network parameters of the second-stage structure.

The beneficial effects are that: the existing Demosaic algorithm is only applicable to RGB format images, and no RYB format RAW image processing algorithm capable of effectively improving the picture effect exists. According to the invention, the RAW graph in RYB format is processed through the neural network, the neural network is a pixel-level processed neural network, and each pixel value of the image can be modified and transformed in the network application process; by the network training method provided by the invention, the trained neural network can successfully recover the RYB format RAW image into a full-color RGB image.

Drawings

FIG. 1 is a schematic diagram of a first stage structure of a neural network according to the present invention;

FIG. 2 is a schematic diagram of the Tophalf network constructed during the neural network training process of the present invention;

FIG. 3 is a schematic diagram showing the difference values obtained during the training of the first stage structure of the neural network according to the present invention;

fig. 4 is a schematic diagram of a network structure of a second stage structure of the neural network according to the present invention.

Detailed Description

The invention is further explained below with reference to the drawings.

A RAW image color reproduction method of RYB format, comprising: and inputting the RAW images arranged in RYYB into a neural network for color reduction processing, and outputting RGB images.

The neural network includes a two-stage structure. As shown in fig. 1, the first-stage structure includes ten layers of structures connected in sequence: the first layer is a preprocessing layer, R, Y, Y, B pixels in the image are respectively extracted into independent channels through convolution, the input size of the first layer is 448 x 3, and the output size of the first layer is 448 x 4; the second to fifth layers are all convolution layers, each layer is composed of a convolution (Conv) layer, a BN layer and an activation function (ReLU) layer in a linked mode, and the output sizes of the second to fifth layers are respectively as follows: 112 x 24, 56 x 96, 28 x 192, 14 x 384; the sixth to tenth layers are deconvolution layers, each layer is composed of an up sampling (Upsample) layer, a BN layer and a ReLU layer which are linked, and the output sizes of the sixth to tenth layers are respectively as follows: 14 x 384, 28 x 192, 56 x 96, 112 x 24, 448 x 4. Wherein, the outputs of the seventh, eighth and ninth layers respectively correspond to the outputs of the fourth, third and second layers, and residual operations (Res) are performed.

As shown in fig. 4, the second stage structure of the neural network is a recovery layer for recovering the output of the first stage structure, and the recovery layer is composed of an Upsample layer and 3 Conv layer links with a size of 1*1, and the final output size is 448×448×3.

When training the above neural network, the first-stage structure is denoted as a backbond, and a structure Tophalf parallel to the first-stage structure is constructed, which is composed of the same structures as the first to fifth layers in the first-stage structure, as shown in fig. 2.

First training a first stage structure, comprising:

step 1: the image sensors which output RAW images arranged in RGGB and RAW images arranged in RYYYB are adopted to respectively shoot a plurality of scenes in the same way, namely, different image sensors are adopted to respectively shoot the same scene, other shooting parameters and conditions are the same, and the obtained images are respectively recorded into an image set P1 and an image set P2, so that a training set is formed. In the training process of this embodiment, the training set images are not less than 3000.

Step 2: the images in the training set are input into the first stage structure for training.

In the training process, the images in the image set P1 are input into Tophalf, the corresponding images in the image set P2 are input into the backbox, and then, as shown in FIG. 3, the difference degree Dis of the output images of the sixth layer, the seventh layer, the eighth layer and the ninth layer in the backbox relative to the second layer, the third layer, the fourth layer and the fifth layer in Tophalf is calculated respectively _n ：

In the Backbone _i,j Output the pixel value at the ith row and j column of the image for the corresponding layer in the backlight, tophalf _i,j Outputting pixel values at the ith row and the jth column of the image for the corresponding layer in Tophalf; and n is 6,7,8 and 9, and corresponds to a sixth layer, a seventh layer, an eighth layer and a ninth layer in the back bond respectively.

Loss _Dis =0.4Dis ₆ +0.6Dis ₇ +0.8Dis ₈ +1.0Dis ₉

The network parameters of the first phase structure are optimized by back propagation based on the loss values.

Step 3: training the second stage structure after training the first stage structure is completed.

Inputting the images in the image set P2 into a trained first stage structure, inputting the images with the final output size of 448 x 4 of the first stage structure into a second stage structure, and then obtaining loss values LossR between the images with the final output size of 448 x 3 of the second stage structure and the corresponding images in the image set P1:

wherein P1 _i,j For pixel values at the ith row, column, j, of the corresponding image in image set P1, a discover _i,j The pixel values at row j column of the i-th row of the image are finally output for the second stage structure.

The network parameters of the second phase structure are optimized by back propagation according to the loss value LossR.

Inputting the RAW graph of RYYYB into a trained neural network, and when the network is normally used, the Tophalf structure is not started, and the network finally outputs the RGB image with normal color.

The foregoing is merely a preferred embodiment of the present invention and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present invention, which are intended to be comprehended within the scope of the present invention.

Claims

1. A RAW image color reproduction method of RYB format, comprising: inputting RAW images arranged in RYYB into a neural network for color reduction processing, and outputting RGB images;

2. The method for color reproduction of a RAW image in RYB format according to claim 1, wherein the neural network is trained by marking the first-stage structure as a Backbone and constructing a structure Tophalf parallel to the first-stage structure, the Tophalf being composed of the same structure as the first to fifth layers in the first-stage structure;

training the first stage structure first includes:

in the training process, inputting the images in the image set P1 into the Tophalf, inputting the corresponding images in the image set P2 into the backbox, and then respectively calculating the sixth layer, the seventh layer, the eighth layer and the ninth layer in the backbox relative to the second layer in the TophalfThe difference degree Dis of the output images of the third layer, the fourth layer and the fifth layer _n ：

Loss _Dis =0.4Dis ₆ +0.6Dis ₇ +0.8Dis ₈ +1.0Dis ₉