CN113538223A - Noise image generation method, noise image generation device, electronic device, and storage medium - Google Patents


Publication number: CN113538223A
Authority: CN (China)
Prior art keywords: image, noise, target, images, noise index
Legal status: Pending (the status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number: CN202110854815.6A
Other languages: Chinese (zh)
Inventor: 郭桦
Current assignee: Vivo Mobile Communication Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Original assignee: Vivo Mobile Communication Co Ltd
Application filed by Vivo Mobile Communication Co Ltd
Priority: CN202110854815.6A (published as CN113538223A)
Related application: PCT/CN2022/107258 (WO2023005818A1)

Classifications

    • G06T 3/04
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Abstract

The application discloses a noise image generation method and apparatus, an electronic device, and a storage medium, belonging to the technical field of image processing. The noise image generation method includes: acquiring a first image and a second image, wherein the first image is a noise-free image and the second image is a noisy image acquired using a target image sensor; determining a noise index value corresponding to the target image sensor; adding noise to the first image according to the noise index value to obtain a third image; and adjusting the noise distribution of the third image according to the second image to obtain a noise image corresponding to the first image.

Description

Noise image generation method, noise image generation device, electronic device, and storage medium
Technical Field
The application belongs to the technical field of image processing, and particularly relates to a noise image generation method and device, electronic equipment and a storage medium.
Background
As users' image-quality requirements rise, the noisy images captured by electronic devices increasingly fail to meet those requirements, so the captured images need further noise reduction processing.
When denoising images with artificial intelligence algorithms such as traditional machine learning and deep learning, a noisy/noise-free sample image pair generally needs to be acquired or constructed so that an image denoising model can be trained on the pair.
In the prior art, random noise is generally used as the noise in a noise image when constructing training samples. The synthesized noise image therefore cannot reflect the noise that the image sensor of a real electronic device produces during image capture, so the synthesized noise image is not realistic enough, which reduces the accuracy of subsequent training of the image noise reduction model.
Disclosure of Invention
The embodiment of the application aims to provide a noise image generation method, a noise image generation apparatus, an electronic device, and a storage medium, which can solve the prior-art problem that a synthesized noise image is not realistic enough, reducing the accuracy of subsequent training of an image noise reduction model.
In a first aspect, an embodiment of the present application provides a noise image generation method, including:
acquiring a first image and a second image; wherein the first image is a noise-free image and the second image is a noisy image acquired using the target image sensor;
determining a noise index value corresponding to the target image sensor;
adding noise to the first image according to the noise index value to obtain a third image;
and adjusting the noise distribution of the third image according to the second image to obtain a noise image corresponding to the first image.
In a second aspect, an embodiment of the present application provides a noise image generation apparatus, including:
the acquisition module is used for acquiring a first image and a second image; wherein the first image is a noise-free image and the second image is a noisy image acquired using the target image sensor;
a determining module, configured to determine a noise index value corresponding to the target image sensor;
the noise adding module is used for adding noise to the first image according to the noise index value to obtain a third image;
and the adjusting module is used for adjusting the noise distribution of the third image according to the second image to obtain a noise image corresponding to the first image.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a processor, a memory, and a program or instructions stored on the memory and executable on the processor, and when executed by the processor, the program or instructions implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.
In the embodiment of the application, a noise-free first image and a noisy second image acquired by the target image sensor are obtained; noise is then added to the first image according to the noise index value corresponding to the target image sensor to obtain a third image, and the noise distribution of the third image is adjusted according to the second image to obtain a noise image corresponding to the first image. Because the third image is a noisy image generated from the noise index value of the target image sensor, and because it is optimized against a real noisy image acquired by that sensor, i.e., the second image, the finally obtained noise image is strongly specific to the target image sensor: its noise is close to the noise of images actually acquired by the sensor. The accuracy of subsequent training of an image noise reduction model for the target sensor can therefore be improved.
Drawings
FIG. 1 is one of the flow diagrams illustrating a method of noise image generation according to an exemplary embodiment;
FIG. 2 is a flowchart illustrating the operation of an optimized discriminative network according to an exemplary embodiment;
FIG. 3 is a second flowchart illustrating a method of noise image generation according to an exemplary embodiment;
FIG. 4 is a third flowchart illustrating a method of generating a noisy image according to an exemplary embodiment;
FIG. 5 is a block diagram showing the structure of a noise image generation apparatus according to an exemplary embodiment;
FIG. 6 is a block diagram illustrating the structure of an electronic device in accordance with an exemplary embodiment;
FIG. 7 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present disclosure.
The terms "first", "second", and the like in the description and claims of the present application are used to distinguish between similar elements and not necessarily to describe a particular sequential or chronological order. It will be appreciated that data so used may be interchanged under appropriate circumstances, so that embodiments of the application may be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second", and the like are generally of one type, and the number of objects is not limited; for example, the first object may be one or more than one. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The noise image generation method, apparatus, electronic device and storage medium provided in the embodiments of the present application are described in detail with reference to the accompanying drawings through specific embodiments and application scenarios thereof.
Since an image output by a camera generally contains noise, and the higher the camera's sensitivity (ISO), the more severe the noise in the captured image, the image needs noise reduction processing. Here, ISO is an internationally unified index for measuring the light sensitivity of a camera.
In the process of denoising images with artificial intelligence algorithms such as traditional machine learning and deep learning, a noisy/noise-free sample image pair needs to be acquired or constructed so that an image denoising model can be trained on the pair.
The application provides a noise image generation method which can be applied to a scene for constructing a noise image.
In addition, the execution subject of the noise image generation method provided by the embodiment of the application may be the noise image generation apparatus, or a control module in that apparatus for executing the method. In the embodiment of the present application, the method is described taking the noise image generation apparatus executing it as an example.
Fig. 1 is a flow chart illustrating a method of generating a noise image according to an exemplary embodiment.
As shown in fig. 1, the noise image generation method may include the steps of:
step 110, a first image and a second image are acquired.
In the embodiment of the present application, the first image may be a noise-free image, for example a high-definition color (Red, Green, Blue, RGB) image. The second image may be a noisy image captured using a target image sensor in the target electronic device. The target image sensor may be the type of sensor referenced in generating a noise image for embodiments of the application, and may also be the type of sensor targeted by a noise reduction model trained using the noise image; it may be used to optimize the initial noise image. The first image and the second image can both be obtained from an open data set or from a data set shot with the target image sensor. The data can be cleaned during acquisition, retaining a large number of clear RGB images as candidates for the first image. In addition, several RGB images with different brightness can be shot by an electronic device equipped with the target image sensor, and one of them can be selected as the second image.
And step 120, determining a noise index value corresponding to the target image sensor.
Here, since different image sensors have different noise index values, the corresponding noise index value must be determined for each image sensor. The noise index value measures the noise level of the target image sensor and can be expressed as noiseVariance = A·x + B, where noiseVariance is the noise index value, x is the pixel value of each pixel in the RAW image corresponding to the first image, and A and B are parameters: A is a parameter of one type of noise, and B is a parameter of another type of noise. Therefore, by determining A and B, the noise index value to be added at each pixel of the first image can be determined.
And step 130, adding noise to the first image according to the noise index value to obtain a third image.
Here, the third image may be an initial noise image. To add noise to the first image more effectively, noise may be added to the RAW image corresponding to the first image: specifically, the pixel value x of each pixel in the RAW image may be input into the noise model noiseVariance = A·x + B, which outputs the noise index value to be added at that pixel, thereby obtaining the third image, that is, the initial noise image.
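As a minimal sketch of this noising step (not the patent's implementation; the function name and the parameter values A = 0.5 and B = 4.0 are illustrative assumptions), per-pixel Gaussian noise whose variance follows noiseVariance = A·x + B can be added to a RAW array as follows:

```python
import numpy as np

def add_sensor_noise(raw, a, b, rng=None):
    """Add signal-dependent noise to a RAW image.

    The per-pixel noise variance follows the model
    noiseVariance = A*x + B, where x is the clean pixel value,
    A scales the signal-dependent (Poisson-like) component and
    B the signal-independent (Gaussian) component.  `a` and `b`
    stand in for sensor-calibrated parameters.
    """
    rng = np.random.default_rng(rng)
    raw = np.asarray(raw, dtype=np.float64)
    variance = a * raw + b                              # noiseVariance = A*x + B
    noise = rng.normal(0.0, np.sqrt(np.maximum(variance, 0.0)))
    return raw + noise

clean = np.full((64, 64), 100.0)                        # flat patch, pixel value x = 100
noisy = add_sensor_noise(clean, a=0.5, b=4.0, rng=0)
# for this patch the model variance is A*x + B = 0.5*100 + 4 = 54
```

Because the variance is computed per pixel, brighter regions receive proportionally stronger noise, matching the signal-dependent behavior the model describes.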
And step 140, adjusting the noise distribution of the third image according to the second image to obtain a noise image corresponding to the first image.
Here, the noise image corresponding to the first image may be a noise image closer to the actual scene, which may be used for training of a noise reduction model.
For example, the noise distribution of the second image may be determined, and the noise distribution of the third image may then be adjusted according to it, so that the noise distribution of the finally obtained noise image is the same as, or similar to, that of the second image, thereby coming closer to the noise characteristics of an actual image.
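The idea of pulling the synthetic noise distribution toward the measured one can be illustrated with simple moment matching. The embodiment itself performs this adjustment with a learned network, so the following is only an illustrative sketch with hypothetical names:

```python
import numpy as np

def match_noise_variance(third, clean, second_noise_std):
    """Rescale the synthetic noise in `third` so its standard deviation
    matches the one measured from the real noisy image (second image).
    Illustrative only: the embodiment uses a learned network instead."""
    noise = third - clean
    cur_std = noise.std()
    if cur_std == 0:
        return third.copy()
    return clean + noise * (second_noise_std / cur_std)

rng = np.random.default_rng(1)
clean = np.zeros((32, 32))
third = clean + rng.normal(0.0, 2.0, clean.shape)   # synthetic noise, std around 2
# suppose the real noisy image's noise std was measured as 5.0
adjusted = match_noise_variance(third, clean, second_noise_std=5.0)
```

Moment matching only aligns the second-order statistics; the learned network described next can also align higher-order structure of the noise.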
In an alternative embodiment, to further improve the realism of the generated noise image, a neural network model may be used to learn the noise distribution of the second image and then to adjust the noise distribution of the third image, thereby optimizing the third image. Specifically, the third image may be optimized by an optimized discrimination network, which performs domain-adversarial generation between the third image and the second image so that the noise distributions of the two images are kept as consistent as possible, or their similarity exceeds a preset threshold, improving the realism of the generated noise image.
Based on this, the step 140 may specifically include:
inputting the third image into P continuous convolution layers in the first network to obtain P characteristic information output by the P convolution layers;
and inputting the characteristic information output by the P-th convolution layer into P continuous deconvolution layers in the first network, and outputting to obtain a noise image corresponding to the first image.
Here, the first network may be obtained by training on the second image. P may be a positive integer with P ≥ 2, and the P convolutional layers may correspond one-to-one with the P deconvolution layers. The input of a first deconvolution layer may be the first feature information output by a first convolutional layer together with the second feature information output by a second deconvolution layer, where the first convolutional layer may be any one of the P convolutional layers, the first deconvolution layer may be the deconvolution layer corresponding to that convolutional layer among the P deconvolution layers, and the first deconvolution layer may be the next deconvolution layer after the second deconvolution layer.
Specifically, the third image may be input into the P successive convolutional layers in the first network. Each convolution operation performs the matrix operation

Feature_n = w_n(w_{n-1}(…(w_1·x + b_1)…) + b_{n-1}) + b_n

and outputs the corresponding feature vector, where Feature_n denotes the feature vector output by the n-th convolution module (i.e., the n-th convolutional layer), w denotes a weight, and b denotes a bias. The feature vector output by a deconvolution layer and the feature vector output by its corresponding convolutional layer are connected through a skip structure and input into the next deconvolution layer; each deconvolution operation performs an analogous matrix operation (the deconvolution formulas appear only as images in the original publication and are not reproduced here) and finally outputs the optimized third image, that is, the noise image corresponding to the first image. Because the skip structure connects each deconvolution-layer output with the corresponding convolution-layer output, the detail information of the image can be fully retained.
In a specific example, the workflow of the optimization network, i.e., the first network, can be as shown in fig. 2. The optimization network 220 can include an input convolutional layer 221, four convolution modules 2221-2224, four deconvolution modules 2231-2234, and an output deconvolution layer 222. The input convolutional layer 221 corresponds to the output deconvolution layer 222, the first convolution module 2221 corresponds to the fourth deconvolution module 2234, the second convolution module 2222 corresponds to the third deconvolution module 2233, the third convolution module 2223 corresponds to the second deconvolution module 2232, and the fourth convolution module 2224 corresponds to the first deconvolution module 2231. That is, if the first convolutional layer is the input convolutional layer 221, the first deconvolution layer is the output deconvolution layer 222 and the second deconvolution layer is the fourth deconvolution module 2234; if the first convolutional layer is the first convolution module 2221, the first deconvolution layer is the fourth deconvolution module 2234 and the second deconvolution layer is the third deconvolution module 2233, and so on, which will not be described again here.
Illustratively, the synthesis-domain image 210, that is, the third image, is input through the input convolutional layer 221 into the 4 consecutive convolution modules in the optimization network 220 to obtain 4 pieces of feature information: a first feature vector is obtained through the first convolution module 2221, a second feature vector through the second convolution module 2222, a third feature vector through the third convolution module 2223, and a fourth feature vector through the fourth convolution module 2224. The fourth feature vector is then input to the 4 consecutive deconvolution layers in the optimization network 220 to obtain the noise image 230. Specifically, the fourth feature vector and the fifth feature vector obtained by the first deconvolution module 2231 are connected through a first skip structure 22311 and input into the second deconvolution module 2232 to obtain a sixth feature vector; the third feature vector and the sixth feature vector are connected through a second skip structure 22321 and input into the third deconvolution module 2233 to obtain a seventh feature vector; the second feature vector and the seventh feature vector are connected through a third skip structure 22331 and input into the fourth deconvolution module 2234 to obtain an eighth feature vector; and the first feature vector and the eighth feature vector are connected through a fourth skip structure 22341 and input into the output deconvolution layer 222 to obtain the noise image 230.
In this way, the third image is optimized through the P successive convolution and deconvolution layers in the first network and the skip structures, so that a noise image closer to the actual scene than the third image can be obtained, improving the realism of the generated noise image.
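The encoder-decoder-with-skips data flow above can be sketched as a toy example. Average pooling and nearest-neighbor repetition stand in for the learned convolution and deconvolution layers, which is an assumption for illustration only; the real layers carry trained weights:

```python
import numpy as np

def downsample(x):
    """Stand-in for a stride-2 convolution layer (2x2 average pool)."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Stand-in for a deconvolution (transposed-convolution) layer."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def skip_network(x, p=2):
    """Mirror the first network's data flow: P successive 'conv' layers,
    then P 'deconv' layers, each deconv fed the previous decoder output
    fused, via a skip structure, with the matching encoder feature."""
    skips = []
    for _ in range(p):                # P successive convolution layers
        skips.append(x)               # feature saved for the skip structure
        x = downsample(x)
    for skip in reversed(skips):      # P deconvolution layers with skips
        x = (upsample(x) + skip) / 2  # skip connection retains image detail
    return x

img = np.arange(64, dtype=float).reshape(8, 8)
out = skip_network(img, p=2)          # same spatial size as the input
```

The skip fusion is what preserves fine detail: without adding `skip`, the output would contain only the heavily blurred bottleneck signal.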
In addition, since there may be a plurality of noise index values corresponding to the target image sensor, noise may be added to the first image according to the plurality of noise index values, respectively, to obtain a plurality of third images, for example, M third images.
For the above training process of the first network, in an optional implementation manner, when the number of the third images is M, step 140 may specifically include:
inputting the target image into a first network, and outputting to obtain a sixth image;
acquiring a first noise distribution characteristic corresponding to the sixth image and a second noise distribution characteristic corresponding to the second image;
inputting the first noise distribution characteristic and the second noise distribution characteristic into a second network, and outputting to obtain a similarity value between the first noise distribution characteristic and the second noise distribution characteristic;
and under the condition that the similarity value is smaller than the preset threshold value, adjusting the network parameters of the first network according to the similarity value and the corresponding loss value until the first network converges to obtain the trained first network.
Here, the target image may be any one of the M third images, where M may be a positive integer with M ≥ 2. The sixth image may be the image obtained by optimizing the target image with the first network. The M noise images corresponding to the first image may be noise images closer to the true noise distribution than the target images. The first network and the second network may together constitute a generative adversarial network. The second image may be randomly selected from multiple RGB images of different brightness captured by the target image sensor, and may be re-selected at random multiple times during training.
For example, the optimized discrimination network may be a two-stage network model, where the first stage network may be an optimized network, that is, a first network; the second segment of the network may be a discriminating network, i.e. a second network. The first network may be used to optimize the target image to generate a noise image that is closer to the true noise distribution, and the second network may be used to determine a similarity value between a first noise distribution feature corresponding to the sixth image and a second noise distribution feature corresponding to the second image. Here, the first network may be trained, and when the similarity value is smaller than the preset threshold, the network parameter of the first network may be adjusted until the first network converges, so as to obtain a trained first network, where the trained first network may be used to generate a noise image more fitting the actual scene.
In addition, the sixth image output by the optimization network still belongs to the synthesis domain; its noise probability distribution can be denoted Px, and the noise probability distribution of the real-domain image, i.e., the second image, can be denoted Py. The two images are input into the discrimination network, which outputs a feature vector (feature map) through successive convolutional layers; the feature vector is then input into three successive fully connected layers, which finally output a probability value in the interval [0, 1] representing the similarity between the synthesis-domain and real-domain images. The closer the probability value is to 1, the more similar the noise distributions of the two images; the closer it is to 0, the larger the difference between them.
In addition, when the probability value is lower than 0.5, the noise distributions of the sixth image and the second image are considered to differ greatly. The probability value is fed back to the first-stage optimization network, which adjusts its weight coefficients (weights) and regenerates the optimized sixth image; the regenerated sixth image is then input into the discrimination network again for similarity discrimination against the second image. Specifically, the weight adjustment may be determined by taking the partial derivative of the probability loss with respect to the weights, for example updating w_new = Δw + w_old, and training continues until convergence to obtain the final model, i.e., the trained first network.
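This feedback loop can be caricatured with a one-parameter generator and a toy discriminator. The exponential similarity function, the learning rate, and the finite-difference gradient below are illustrative stand-ins, not the patent's actual loss or update rule:

```python
import numpy as np

def discriminator(noise_std, real_std=5.0):
    """Toy discriminator: probability in [0, 1] that the synthetic noise
    matches the real distribution (1 means identical)."""
    return float(np.exp(-abs(noise_std - real_std)))

def train_generator(w_old, steps=300, lr=0.5, threshold=0.5):
    """Feed the probability back to the generator and update the weight
    as w_new = delta_w + w_old, with delta_w taken from a
    finite-difference gradient of the similarity probability."""
    w = w_old
    for _ in range(steps):
        p = discriminator(w)
        if p >= threshold:                            # similar enough: stop
            break
        eps = 1e-4
        grad = (discriminator(w + eps) - p) / eps     # d(prob)/d(w)
        w = lr * grad + w                             # w_new = delta_w + w_old
    return w

w_final = train_generator(w_old=1.0)                  # converges toward real_std
```

Once the discriminator's probability crosses the 0.5 threshold, updates stop, mirroring the convergence criterion described above.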
In a specific example, the workflow of the discrimination network, that is, the second network, may be as shown in fig. 2, first inputting the synthesized domain image 210, that is, the target image, into the optimization network 220, and outputting to obtain the initial noise image 230, that is, the sixth image. Acquiring a first noise distribution characteristic 2301 corresponding to the noise image 230 and a second noise distribution characteristic 2501 corresponding to the real domain image 250, namely the second image, through continuous convolutional layers 241 in the discrimination network 240, inputting the first noise distribution characteristic 2301 and the second noise distribution characteristic 2501 into continuous three-layer full-connection layers 242 in the discrimination network 240, outputting a similarity value between the first noise distribution characteristic 2301 and the second noise distribution characteristic 2501, and training the optimization network 220 according to the similarity to obtain a trained optimization network so as to generate a noise image more fitting the actual scene.
In this way, in the process of training the first network, the second network is used for judging the similarity between the sixth image and the second image, and then the network parameters of the first network are adjusted according to the judgment result, so that the first network has the capability of optimizing the image noise distribution, and the authenticity of the noise image generated after the first network is optimized is further improved.
In addition, in an alternative embodiment, in the case that the first image is a noise-free RGB image, before step 130, the noise image generation method may further include:
converting the first image from an RGB image into an original image file;
based on this, the step 130 may include:
adding noise to the original image file according to the noise index value to obtain a third image;
and converting the third image from the original image file into an RGB image.
Here, the original image file may be a RAW image. The noise distribution in the RGB image is complex and difficult to process, and noise may be added to the RAW image for better processing of the noise distribution.
For example, a high-definition RGB image may be acquired as the first image, and the first image may be converted from an RGB image into a RAW image through inverse Image Signal Processing (ISP) operations. Specifically, the RAW image can be obtained by, for example, inverse tone mapping, inverse gamma correction, and inverse digital gain. Here, the ISP pipeline may include black level compensation, color interpolation (demosaicing), denoising, automatic white balance, color correction, and the like. Inverse tone mapping is a technology for converting Standard Dynamic Range (SDR) source signals into High Dynamic Range (HDR) source signals; it can be applied at the production end or on a terminal device, and to a certain extent realizes HDR "restoration" of existing SDR content and upward compatibility. Gamma correction edits the gamma curve of an image to perform nonlinear tone editing; it can detect the dark and light portions of an image signal and increase their ratio, improving image contrast, and inverse gamma correction undoes this encoding to recover a linear signal.
Based on this, adding noise to the first image according to the noise index value may be adding noise to the RAW image, that is, the original image file according to the noise index value. The third image thus obtained may also be a RAW image, i.e. an original image file, and therefore, before adjusting the noise distribution of the third image, the third image may also be converted from the original image file into an RGB image.
In this way, the RAW image can better reflect the noise distribution situation, so that the effect of noise addition can be better and the noise distribution characteristic of the image can be conveniently extracted by converting the first image from the RGB image into the RAW image and then adding the noise.
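A minimal sketch of the RGB-to-pseudo-RAW round trip follows. The gamma value 2.2 and digital gain 2.0 are assumed for illustration; a real ISP inversion also involves tone mapping, white balance, and demosaicing, which are omitted here:

```python
import numpy as np

def rgb_to_pseudo_raw(rgb, gamma=2.2, digital_gain=2.0):
    """Approximate ISP inversion: inverse gamma correction followed by
    inverse digital gain, yielding a linear pseudo-RAW signal.
    Parameter values are illustrative, not from the patent."""
    linear = np.clip(rgb, 0.0, 1.0) ** gamma          # inverse gamma correction
    return linear / digital_gain                      # inverse digital gain

def pseudo_raw_to_rgb(raw, gamma=2.2, digital_gain=2.0):
    """Forward ISP direction: apply digital gain, then gamma encoding."""
    linear = np.clip(raw * digital_gain, 0.0, 1.0)
    return linear ** (1.0 / gamma)

rgb = np.linspace(0.0, 1.0, 11)       # sample RGB intensities in [0, 1]
raw = rgb_to_pseudo_raw(rgb)          # noise would be added in this domain
roundtrip = pseudo_raw_to_rgb(raw)    # back to RGB after noising
```

Noise added between the two conversions lives in the linear domain, where the signal-dependent variance model holds, before the result is mapped back to RGB.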
Thus, by acquiring a noise-free first image and a noisy second image acquired by the target image sensor, adding noise to the first image according to the noise index value corresponding to the target image sensor to obtain a third image, and adjusting the noise distribution of the third image according to the second image, a noise image corresponding to the first image is obtained. Because the third image is a noisy image generated from the noise index value of the target image sensor, and because it is optimized against the real noisy image acquired by that sensor, i.e., the second image, the finally obtained noise image is strongly specific to the target image sensor, and its noise is close to the noise of images actually acquired by the sensor. The accuracy of subsequent training of an image noise reduction model for the target sensor can therefore be improved.
Based on the above steps 110-140, in a possible embodiment, as shown in fig. 3, step 120 may specifically include the following steps:
step 1201, determining a target poisson noise index value and a target gaussian noise index value corresponding to the target image sensor.
Here, in the noise index value model noiseVariance = A·x + B, A may be the target Poisson noise index value corresponding to the target image sensor, B may be the target Gaussian noise index value corresponding to the target image sensor, and x may be the pixel value of each pixel in the first image.
For example, once the target Poisson noise index value A and the target Gaussian noise index value B are determined, the noise index value noiseVariance of the target image sensor at each pixel can be determined.
Based on this, in an optional implementation, step 1201 may specifically include:
acquiring N fourth images;
traversing N fourth images, and respectively calculating a first pixel average value and a first pixel variance value of pixel points contained in each fourth image;
dividing the first pixel variance value by the first pixel average value to obtain a poisson noise index value corresponding to each fourth image;
determining a first mapping relation between the Poisson noise index values and the light sensitivity according to N light sensitivities corresponding to the N fourth images and the N Poisson noise index values;
and determining M Poisson noise index values corresponding to the M target sensitivities according to the first mapping relation, wherein the M Poisson noise index values are used as target Poisson noise index values corresponding to the target image sensors.
Here, the fourth image may be an image of a standard color chart acquired at different sensitivities using the target image sensor, N may be a positive integer, and N ≧ 2. The M target sensitivities may be M sensitivities determined from a sensitivity interval corresponding to the target image sensor, M may be a positive integer, and M ≧ 2.
In a specific example, different shooting devices have different ISO ranges. In order to make the generated noise image closer to the real scene, ISO values may be selected randomly or at equal intervals within the ISO range corresponding to the target image sensor, where ISO may be calculated from the analog gain and the digital gain set by the target image sensor. For instance, 10 images of a 24-patch color chart may be shot under 10 different ISO conditions; the average value and the variance value of the pixel points contained in each of the 10 color chart images are calculated, and each variance value is divided by the corresponding average value to obtain 10 poisson noise index values corresponding to the 10 color chart images. From the 10 sensitivities and the 10 poisson noise index values respectively corresponding to the 10 color chart images, a mapping relation A = a0·ISO + a1 between the poisson noise index value and the sensitivity is determined, where a0 and a1 may be parameters of the target image sensor. According to this mapping relation, 5 poisson noise index values corresponding to 5 randomly selected target sensitivities can be determined and used as 5 target poisson noise index values A corresponding to the target image sensor.
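A minimal sketch of this calibration, assuming flat color-chart captures and a least-squares fit in place of whatever fitting procedure the embodiment actually uses:

```python
import numpy as np

def poisson_index(chart_image):
    """Variance divided by mean of the pixels in one color-chart capture."""
    x = np.asarray(chart_image, dtype=np.float64)
    return x.var() / x.mean()

def fit_poisson_mapping(isos, indices):
    """Fit A(ISO) = a0*ISO + a1 by least squares over N calibration shots."""
    a0, a1 = np.polyfit(np.asarray(isos, float), np.asarray(indices, float), 1)
    return a0, a1

def target_poisson_indices(a0, a1, target_isos):
    """Evaluate the mapping at the M target sensitivities."""
    return [a0 * iso + a1 for iso in target_isos]
```

With N calibration shots the fit is overdetermined, so measurement noise in the individual variance/mean ratios averages out.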
In this way, by determining the first mapping relationship between the poisson noise index value and the sensitivity, a plurality of target poisson noise index values can be obtained, so that a plurality of third images capable of basically covering a sensitivity interval corresponding to the target image sensor are generated, a plurality of noise images corresponding to the first image are finally obtained, and the noise environment of the target image sensor about poisson noise in different scenes can be fully simulated.
In addition, in an optional implementation, step 1201 may specifically further include:
acquiring K fifth images;
traversing the K fifth images, respectively calculating a second pixel variance value of the pixel points contained in each fifth image, and taking the second pixel variance values corresponding to the K fifth images as K Gaussian noise index values corresponding to the K fifth images;
determining a second mapping relation between Gaussian noise index values and the sensitivity according to K sensitivity values corresponding to the K fifth images and K Gaussian noise index values based on a maximum likelihood estimation algorithm;
and determining M Gaussian noise index values corresponding to the M target sensitivities according to the second mapping relation, wherein the M Gaussian noise index values are used as target Gaussian noise index values corresponding to the target image sensors.
Here, the fifth image may be an image captured at different sensitivities using the target image sensor, the image may be a black image, K may be a positive integer, and K ≧ 2. Specifically, the fifth image may be obtained by capturing a black image using the target image sensor, or may be obtained by capturing an image while blocking a lens of the target image sensor.
In a specific example, 20 fully black images may be shot under 20 different ISO conditions, and the variance value of the pixel points contained in each of the 20 fully black images is calculated; the 20 variance values are used as 20 gaussian noise index values corresponding to the 20 fully black images. Based on a maximum likelihood estimation algorithm, a mapping relation B = b0·ISO + b1 between the gaussian noise index value and the sensitivity is determined, where b0 and b1 may both be parameters of the target image sensor. According to this mapping relation, 10 gaussian noise index values corresponding to 10 target sensitivities can be determined and used as 10 target gaussian noise index values B corresponding to the target image sensor.
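A corresponding sketch for the Gaussian calibration (the least-squares fit is an assumption; note that for zero-mean black frames the sample variance is itself the maximum-likelihood estimate of a Gaussian's variance):

```python
import numpy as np

def gaussian_index(black_frame):
    """Pixel variance of a fully black capture, used as the
    Gaussian noise index value for that ISO."""
    return np.asarray(black_frame, dtype=np.float64).var()

def fit_gaussian_mapping(isos, indices):
    """Fit B(ISO) = b0*ISO + b1 over K black-frame captures."""
    b0, b1 = np.polyfit(np.asarray(isos, float), np.asarray(indices, float), 1)
    return b0, b1
```

Because a black frame carries no signal, any pixel variation is pure read noise, which is why its variance isolates the signal-independent Gaussian component.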
In this way, by determining the second mapping relationship between the gaussian noise index values and the sensitivity, a plurality of target gaussian noise index values can be obtained, so that a plurality of third images capable of basically covering a sensitivity interval corresponding to the target image sensor are generated, a plurality of noise images corresponding to the first image are finally obtained, and the noise environment of the target image sensor about gaussian noise in different scenes can be fully simulated.
And step 1202, calculating a noise index value corresponding to the target image sensor according to the target Poisson noise index value and the target Gaussian noise index value.
In one specific example, based on the multiple target poisson noise index values A and the multiple target gaussian noise index values B, the noise index value corresponding to the target image sensor may be calculated as noiseVariance = A·x + B.
Therefore, through the process, the noise of Gaussian distribution and the noise of Poisson distribution can be synthesized simultaneously, the noise distribution is more diversified, and the reality of the noise image is further improved.
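Combining the two components can be sketched with the common heteroscedastic Gaussian approximation of a Poisson-Gaussian model (an assumption — the embodiment does not specify its sampling scheme), where each pixel's noise standard deviation is √(A·x + B):

```python
import numpy as np

def add_mixed_noise(clean, A, B, seed=None):
    """Add signal-dependent (Poisson-like) plus signal-independent
    (Gaussian) noise to a clean image, sampling each pixel from a
    Gaussian whose variance equals the noise index value A*x + B."""
    rng = np.random.default_rng(seed)
    x = clean.astype(np.float64)
    sigma = np.sqrt(np.clip(A * x + B, 0.0, None))
    return x + rng.normal(size=x.shape) * sigma
```

On a flat gray patch the empirical variance of the added noise should match the calibrated noiseVariance for that pixel value.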
To better describe the whole scheme, based on the above embodiments, as a specific example, as shown in fig. 4, the noise image generation method may include steps 410-450, which are explained in detail below.
At step 410, a first image and a second image are acquired.
Here, a high-definition RGB image, that is, a first image, and a noisy RGB image, that is, a second image, captured using the target image sensor may be acquired.
In step 420, the first image is converted into a RAW image.
Here, the noise distribution in an RGB image is complex and difficult to process, so noise addition can be performed on the basis of a RAW image. However, a high-definition RAW image is difficult to acquire, so the RGB image can be acquired first and then converted into a RAW image; the specific conversion method is not described herein again.
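Since the conversion is left unspecified, the following is a minimal inverse-ISP sketch (the gamma value of 2.2 and the RGGB mosaic layout are assumptions): undo the display gamma to return to linear light, then sample a Bayer mosaic from the linear RGB image.

```python
import numpy as np

def rgb_to_raw(rgb):
    """Approximate a RAW frame from an 8-bit RGB image: inverse gamma
    to linear light, then keep one channel per pixel in an RGGB pattern."""
    lin = (np.asarray(rgb, dtype=np.float64) / 255.0) ** 2.2
    h, w, _ = lin.shape
    raw = np.empty((h, w))
    raw[0::2, 0::2] = lin[0::2, 0::2, 0]  # R
    raw[0::2, 1::2] = lin[0::2, 1::2, 1]  # G
    raw[1::2, 0::2] = lin[1::2, 0::2, 1]  # G
    raw[1::2, 1::2] = lin[1::2, 1::2, 2]  # B
    return raw
```

A real pipeline would also invert white balance and the color correction matrix; this sketch keeps only the two steps that matter most for noise statistics.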
And step 430, calibrating the noise index value of the sensor.
Here, since different sensors have different noise intensities, the noise index values need to be calibrated separately for each sensor. The noise model may be expressed as noiseVariance = A·x + B, where A and B are the noise index values that need to be calibrated: A may be the target poisson noise index value, B may be the target gaussian noise index value, x may be the pixel value of each pixel point in the RAW image, and noiseVariance may be the noise variance value of the target image sensor corresponding to each pixel point, that is, the noise index value. The specific calibration process is not described herein again.
Step 440, generating a third image based on the RAW image.
Here, noise addition is performed based on the RAW image, and it is easier to handle noise distribution.
And step 450, inputting the third image and the second image into an optimization and discrimination network.
Here, the initial noise image, that is, the third image, and the second image are input to the optimization and discrimination network for multiple rounds of optimization and discrimination, so that a noise image closer to the real scene can be obtained. Of course, the first image and the one or more noise images output by the optimization and discrimination network may be combined to form clean-noisy sample image pairs for training an image noise reduction model corresponding to the target image sensor.
Thereby, a noise-free first image and a noisy second image collected by the target image sensor are acquired; noise is then added to the first image according to the noise index value corresponding to the target image sensor to obtain a third image, and the noise distribution of the third image is adjusted according to the second image to obtain a noise image corresponding to the first image. Since the third image is a noisy image generated from the noise index value corresponding to the target image sensor, and the third image is further optimized against the real noisy image collected by the target image sensor, namely the second image, the finally obtained noise image is more specific to the target image sensor, and the noise in the generated noise image is closer to the noise of images actually collected by the target image sensor. Therefore, the accuracy of the subsequent training process of the image noise reduction model for the target image sensor can be improved.
Based on the same inventive concept, the application also provides a noise image generation device. The following describes in detail the noise image generation device according to the embodiment of the present application with reference to fig. 5.
Fig. 5 is a block diagram showing a configuration of a noise image generation apparatus according to an exemplary embodiment.
As shown in fig. 5, the noise image generation apparatus 500 may include:
an obtaining module 501, configured to obtain a first image and a second image; wherein the first image is a noise-free image and the second image is a noisy image acquired using the target image sensor;
a determining module 502, configured to determine a noise index value corresponding to the target image sensor;
a noise adding module 503, configured to add noise to the first image according to the noise index value to obtain a third image;
and an adjusting module 504, configured to adjust the noise distribution of the third image according to the second image, so as to obtain a noise image corresponding to the first image.
The following describes the noise image generation device 500 in detail, specifically as follows:
in one embodiment, the determining module 502 may include:
the determining submodule is used for determining a target Poisson noise index value and a target Gaussian noise index value corresponding to the target image sensor;
and the calculating submodule is used for calculating to obtain a noise index value corresponding to the target image sensor according to the target Poisson noise index value and the target Gaussian noise index value.
In one embodiment, the determining sub-module may include:
a first acquisition unit configured to acquire N fourth images; the fourth image is an image of a standard color chart acquired by using a target image sensor under different photosensitivities, N is a positive integer and is more than or equal to 2;
the first calculation unit is used for traversing the N fourth images and respectively calculating a first pixel average value and a first pixel variance value of pixel points contained in each fourth image;
the second calculation unit is used for dividing the first pixel variance value by the first pixel average value to obtain a Poisson noise index value corresponding to each fourth image;
a first relationship determination unit configured to determine a first mapping relationship between poisson noise index values and sensitivity, based on N sensitivities and N poisson noise index values corresponding to the N fourth images;
a first index determining unit configured to determine, as target poisson noise index values corresponding to the target image sensors, M poisson noise index values corresponding to the M target sensitivities according to the first mapping relationship; the M target sensitivities are M sensitivities determined from sensitivity intervals corresponding to the target image sensor, M is a positive integer and is larger than or equal to 2.
In one embodiment, the determining sub-module may further include:
a second acquisition unit configured to acquire K fifth images; the fifth image is an image acquired by using a target image sensor under different photosensitivities, K is a positive integer and is more than or equal to 2;
the third calculation unit is used for traversing the K fifth images, calculating a second pixel variance value of a pixel point contained in each fifth image respectively, and taking the second pixel variance values corresponding to the K fifth images as K Gaussian noise index values corresponding to the K fifth images;
a second relationship determination unit configured to determine a second mapping relationship between the gaussian noise index values and the sensitivity according to K sensitivities corresponding to the K fifth images and K gaussian noise index values;
and the second index determining unit is used for determining M Gaussian noise index values corresponding to the M target sensitivities according to the second mapping relation, and the M Gaussian noise index values are used as target Gaussian noise index values corresponding to the target image sensor.
In one embodiment, the adjusting module 504 includes:
the image input submodule is used for inputting the target image to the continuous P convolutional layers to obtain P characteristic information output by the P convolutional layers;
the characteristic input submodule is used for inputting the characteristic information output by the P-th convolutional layer into the continuous P deconvolution layers and outputting the characteristic information to obtain a sixth image;
the method comprises the following steps that P is a positive integer, P is larger than or equal to 2, P convolutional layers correspond to P deconvolution layers one by one, input information of a first deconvolution layer is first characteristic information output by the first convolutional layer and second characteristic information output by the second deconvolution layer, the first convolutional layer is any one of the P convolutional layers, the first deconvolution layer is the deconvolution layer corresponding to the first convolutional layer in the P deconvolution layers, and the first deconvolution layer is the next deconvolution layer of the second deconvolution layer.
In one embodiment, in the case that the number of the third images is M, the adjusting module 504 may further include:
the target image processing submodule is used for inputting the target image into the first network and outputting the target image to obtain a sixth image before inputting the third image into the continuous P convolutional layers in the first network to obtain P characteristic information output by the P convolutional layers; wherein the target image is any image in the M third images;
the acquisition submodule is used for acquiring a first noise distribution characteristic corresponding to the sixth image and a second noise distribution characteristic corresponding to the second image;
the characteristic processing submodule is used for inputting the first noise distribution characteristic and the second noise distribution characteristic into a second network and outputting to obtain a similarity value between the first noise distribution characteristic and the second noise distribution characteristic;
and the adjusting submodule is used for adjusting the network parameters of the first network according to the similarity value and the corresponding loss value thereof under the condition that the similarity value is smaller than the preset threshold value until the first network is converged to obtain the trained first network.
In one embodiment, the first image is a noise-free RGB image;
the noise image generation apparatus 500 may further include:
a conversion module 505, configured to convert the first image from an RGB image into an original image file before adding noise to the first image according to the noise index value to obtain a third image;
the noise adding module 503 may include:
the noise adding submodule is used for adding noise to the original image file according to the noise index value to obtain a third image;
and the conversion sub-module is used for converting the third image from the original image file into an RGB image.
Thereby, a noise-free first image and a noisy second image collected by the target image sensor are acquired; noise is then added to the first image according to the noise index value corresponding to the target image sensor to obtain a third image, and the noise distribution of the third image is adjusted according to the second image to obtain a noise image corresponding to the first image. Since the third image is a noisy image generated from the noise index value corresponding to the target image sensor, and the third image is further optimized against the real noisy image collected by the target image sensor, namely the second image, the finally obtained noise image is more specific to the target image sensor, and the noise in the generated noise image is closer to the noise of images actually collected by the target image sensor. Therefore, the accuracy of the subsequent training process of the image noise reduction model for the target image sensor can be improved.
The noise image generation device in the embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The device may be mobile electronic equipment or non-mobile electronic equipment. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and the non-mobile electronic device may be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like; the embodiments of the present application are not particularly limited.
The noise image generation device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android (Android) operating system, an ios operating system, or other possible operating systems, and embodiments of the present application are not limited specifically.
The noise image generation device provided in the embodiment of the present application can implement each process implemented by the method embodiments of fig. 1 to fig. 4, and is not described herein again to avoid repetition.
Optionally, as shown in fig. 6, an electronic device 600 further provided in the embodiment of the present application includes a processor 601, a memory 602, and a program or an instruction stored in the memory 602 and executable on the processor 601, where the program or the instruction is executed by the processor 601 to implement the processes of the above noise image generation method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the details are not repeated here.
It should be noted that the electronic device in the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 7 is a schematic diagram of a hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 700 includes, but is not limited to: a radio frequency unit 701, a network module 702, an audio output unit 703, an input unit 704, a sensor 705, a display unit 706, a user input unit 707, an interface unit 708, a memory 709, and a processor 710.
Those skilled in the art will appreciate that the electronic device 700 may also include a power supply (e.g., a battery) for powering the various components, and the power supply may be logically coupled to the processor 710 via a power management system, such that functions of managing charging, discharging, and power consumption may be performed via the power management system. The electronic device structure shown in fig. 7 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than shown, combine some components, or use a different arrangement of components, and thus, the description is omitted here.
Wherein, the input unit 704 is used for acquiring a first image and a second image; wherein the first image is a noise-free image and the second image is a noisy image acquired using the target image sensor;
a processor 710 for determining a noise indicator value corresponding to a target image sensor; adding noise to the first image according to the noise index value to obtain a third image; and adjusting the noise distribution of the third image according to the second image to obtain a noise image corresponding to the first image.
Therefore, the noise image corresponding to the first image is obtained by acquiring the noise-free first image and the noisy second image collected by the target image sensor, then adding noise to the first image according to the noise index value corresponding to the target image sensor to obtain the third image, and then adjusting the noise distribution of the third image according to the second image.
Optionally, the processor 710 is configured to determine a target poisson noise index value and a target gaussian noise index value corresponding to the target image sensor;
and calculating to obtain a noise index value corresponding to the target image sensor according to the target Poisson noise index value and the target Gaussian noise index value.
Optionally, the input unit 704 is specifically configured to acquire N fourth images; the fourth image is an image of a standard color chart acquired by using a target image sensor under different photosensitivities, N is a positive integer and is more than or equal to 2;
the processor 710 is specifically configured to traverse the N fourth images, and respectively calculate a first pixel average value and a first pixel variance value of a pixel point included in each fourth image; dividing the first pixel variance value by the first pixel average value to obtain a poisson noise index value corresponding to each fourth image; determining a first mapping relation between the Poisson noise index values and the light sensitivity according to N light sensitivities corresponding to the N fourth images and the N Poisson noise index values; determining M Poisson noise index values corresponding to the M target sensitivities according to the first mapping relation, and taking the M Poisson noise index values as target Poisson noise index values corresponding to the target image sensor; the M target sensitivities are M sensitivities determined from sensitivity intervals corresponding to the target image sensor, M is a positive integer and is larger than or equal to 2.
Optionally, the input unit 704 is further specifically configured to acquire K fifth images; the fifth image is an image acquired by using a target image sensor under different photosensitivities, K is a positive integer and is more than or equal to 2;
the processor 710 is further specifically configured to traverse the K fifth images, calculate second pixel variance values of pixel points included in each fifth image, and use the second pixel variance values corresponding to the K fifth images as K gaussian noise index values corresponding to the K fifth images; determining a second mapping relation between the Gaussian noise index values and the sensitivity according to the K sensitivities corresponding to the K fifth images and the K Gaussian noise index values; and determining M Gaussian noise index values corresponding to the M target sensitivities according to the second mapping relation, wherein the M Gaussian noise index values are used as target Gaussian noise index values corresponding to the target image sensors.
Optionally, the processor 710 is further specifically configured to input the third image to P consecutive convolutional layers in the first network, so as to obtain P feature information output by the P convolutional layers; the first network is obtained according to the second image training; inputting the characteristic information output by the P-th convolution layer into P continuous deconvolution layers in a first network, and outputting to obtain a noise image corresponding to the first image; the method comprises the following steps that P is a positive integer, P is larger than or equal to 2, P convolutional layers correspond to P deconvolution layers one by one, input information of a first deconvolution layer is first characteristic information output by the first convolutional layer and second characteristic information output by the second deconvolution layer, the first convolutional layer is any one of the P convolutional layers, the first deconvolution layer is the deconvolution layer corresponding to the first convolutional layer in the P deconvolution layers, and the first deconvolution layer is the next deconvolution layer of the second deconvolution layer.
Optionally, the processor 710 is specifically further configured to, when the number of the third images is M, input the target image to the first network, and output the target image to obtain a sixth image; wherein the target image is any image in the M third images; acquiring a first noise distribution characteristic corresponding to the sixth image and a second noise distribution characteristic corresponding to the second image; inputting the first noise distribution characteristic and the second noise distribution characteristic into a second network, and outputting to obtain a similarity value between the first noise distribution characteristic and the second noise distribution characteristic; and under the condition that the similarity value is smaller than the preset threshold value, adjusting the network parameters of the first network according to the similarity value and the corresponding loss value until the first network converges to obtain the trained first network.
Optionally, the processor 710 is further specifically configured to convert the first image from an RGB image to an original image file; adding noise to the original image file according to the noise index value to obtain a third image; and converting the third image from the original image file into an RGB image.
Therefore, the noise distribution is more diversified by simultaneously synthesizing the noise of the Gaussian distribution and the noise of the Poisson distribution, and the reality of the noise image can be further improved.
It should be understood that in the embodiment of the present application, the input Unit 704 may include a Graphics Processing Unit (GPU) 7041 and a microphone 7042, and the graphics processor 7041 processes image data of still pictures or videos obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The display unit 706 may include a display panel 7061, and the display panel 7061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 707 includes a touch panel 7071 and other input devices 7072. The touch panel 7071 is also referred to as a touch screen. The touch panel 7071 may include two parts: a touch detection device and a touch controller. Other input devices 7072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein. The memory 709 may be used to store software programs as well as various data, including but not limited to applications and an operating system. The processor 710 may integrate an application processor, which primarily handles the operating system, user interfaces, applications, etc., and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 710.
The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored on the readable storage medium, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the above noise image generation method, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and so on.
The embodiment of the present application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement each process of the above noise image generation method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the description is omitted here.
It should be understood that the chips mentioned in the embodiments of the present application may also be referred to as system-on-chip, system-on-chip or system-on-chip, etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order based on the functions involved, e.g., the methods described may be performed in an order different than that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (14)

1. A noise image generation method characterized by comprising:
acquiring a first image and a second image; wherein the first image is a noise-free image and the second image is a noisy image acquired using a target image sensor;
determining a noise index value corresponding to the target image sensor;
adding noise to the first image according to the noise index value to obtain a third image;
and adjusting the noise distribution of the third image according to the second image to obtain a noise image corresponding to the first image.
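By way of illustration only (this is a minimal sketch, not the claimed implementation), the noise-adding step can be modeled with a signal-dependent variance σ²(x) = a·x + b, where `a` plays the role of the Poisson (shot-noise) index and `b` the Gaussian (read-noise) index; the function name, parameter values, and the 0–255 clipping range are all assumptions:

```python
import numpy as np

def add_sensor_noise(clean, a, b, rng=None):
    """Add signal-dependent noise to a noise-free image.

    The Poisson (shot) component is approximated by a Gaussian whose
    variance scales with the signal (a * clean); the sensor's read
    noise is a constant-variance Gaussian term (b).
    """
    rng = np.random.default_rng() if rng is None else rng
    clean = clean.astype(np.float64)
    sigma = np.sqrt(np.clip(a * clean + b, 0.0, None))
    noisy = clean + rng.normal(0.0, 1.0, clean.shape) * sigma
    return np.clip(noisy, 0.0, 255.0)

# Example: a flat gray patch; brighter pixels would get more shot noise.
flat = np.full((64, 64), 128.0)
third_image = add_sensor_noise(flat, a=0.05, b=4.0, rng=np.random.default_rng(0))
```

The resulting "third image" would then still need its noise distribution adjusted toward the real second image, which is what the network-based step above provides.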
2. The method of claim 1, wherein determining a noise index value corresponding to the target image sensor comprises:
determining a target Poisson noise index value and a target Gaussian noise index value corresponding to the target image sensor;
and calculating a noise index value corresponding to the target image sensor according to the target Poisson noise index value and the target Gaussian noise index value.
3. The method of claim 2, wherein determining a target poisson noise index value corresponding to the target image sensor comprises:
acquiring N fourth images; wherein the fourth images are images of a standard color chart acquired using the target image sensor at different sensitivities, N is a positive integer, and N is greater than or equal to 2;
traversing the N fourth images, and respectively calculating a first pixel average value and a first pixel variance value of pixel points contained in each fourth image;
dividing the first pixel variance value by the first pixel average value to obtain a poisson noise index value corresponding to each fourth image;
determining a first mapping relation between the Poisson noise index value and the sensitivity according to N sensitivities corresponding to the N fourth images and N Poisson noise index values;
determining, according to the first mapping relation, M Poisson noise index values corresponding to M target sensitivities as the target Poisson noise index values corresponding to the target image sensor; wherein the M target sensitivities are M sensitivities determined from a sensitivity interval corresponding to the target image sensor, M is a positive integer, and M is greater than or equal to 2.
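The variance-over-mean statistic and the sensitivity mapping of claim 3 can be sketched as follows (synthetic data stands in for the real color-chart captures, and a first-order polynomial is only an assumed form of the claimed "first mapping relation"):

```python
import numpy as np

def poisson_index_per_image(images):
    # Per claim 3: divide each image's pixel variance by its pixel mean.
    return [float(np.var(img) / np.mean(img)) for img in images]

def fit_poisson_index_vs_iso(isos, indices, target_isos):
    # A linear fit is an assumed form of the "first mapping relation"
    # between the Poisson noise index and the sensitivity (ISO).
    slope, intercept = np.polyfit(isos, indices, deg=1)
    return slope * np.asarray(target_isos) + intercept

# N = 3 simulated flat color-chart patches at rising ISO; real fourth
# images would come from the target image sensor.
rng = np.random.default_rng(1)
isos = np.array([100.0, 400.0, 1600.0])
images = [128.0 + rng.normal(0.0, np.sqrt(iso) / 2.0, (32, 32)) for iso in isos]
indices = poisson_index_per_image(images)
targets = fit_poisson_index_vs_iso(isos, indices, [200.0, 800.0])
```

Here the simulated noise variance grows linearly with ISO, so the fitted mapping recovers index values for the M target sensitivities that lie between the measured ones.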
4. The method of claim 2, wherein determining the target gaussian noise index value corresponding to the target image sensor comprises:
acquiring K fifth images; wherein the fifth images are images acquired using the target image sensor at different sensitivities, K is a positive integer, and K is greater than or equal to 2;
traversing the K fifth images, respectively calculating a second pixel variance value of a pixel point contained in each fifth image, and taking the second pixel variance values respectively corresponding to the K fifth images as K Gaussian noise index values corresponding to the K fifth images;
determining a second mapping relation between the Gaussian noise index values and the sensitivity according to K sensitivities corresponding to the K fifth images and the K Gaussian noise index values;
and determining M Gaussian noise index values corresponding to the M target sensitivities according to the second mapping relation, wherein the M Gaussian noise index values are used as target Gaussian noise index values corresponding to the target image sensor.
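The Gaussian-index procedure of claim 4 differs from claim 3 only in using the pixel variance directly; a sketch under the same caveats (synthetic captures, and piecewise-linear interpolation as an assumed form of the "second mapping relation"):

```python
import numpy as np

def gaussian_index_per_image(images):
    # Per claim 4: each image's pixel variance is its Gaussian noise index.
    return [float(np.var(img)) for img in images]

def map_gaussian_index(isos, indices, target_isos):
    # Interpolation stands in for the claimed "second mapping relation";
    # the corresponding apparatus claim mentions a maximum likelihood
    # estimation algorithm, which could be substituted here.
    return np.interp(target_isos, isos, indices)

# K = 2 simulated flat-field captures at different sensitivities.
rng = np.random.default_rng(2)
isos = np.array([100.0, 800.0])
images = [rng.normal(64.0, 2.0 + iso / 400.0, (32, 32)) for iso in isos]
indices = gaussian_index_per_image(images)
targets = map_gaussian_index(isos, indices, [450.0])
```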
5. The method of claim 1, wherein the adjusting the noise distribution of the third image according to the second image to obtain a noise image corresponding to the first image comprises:
inputting the third image into P consecutive convolutional layers in a first network to obtain P pieces of feature information output by the P convolutional layers; wherein the first network is obtained by training according to the second image;
inputting the feature information output by the P-th convolutional layer into P consecutive deconvolution layers in the first network, and outputting a noise image corresponding to the first image;
wherein P is a positive integer greater than or equal to 2, the P convolutional layers correspond to the P deconvolution layers one to one, input information of a first deconvolution layer is first feature information output by a first convolutional layer and second feature information output by a second deconvolution layer, the first convolutional layer is any one of the P convolutional layers, the first deconvolution layer is the deconvolution layer corresponding to the first convolutional layer among the P deconvolution layers, and the first deconvolution layer is the deconvolution layer next to the second deconvolution layer.
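The P-conv / P-deconv structure with one-to-one skip connections described in claim 5 amounts to an encoder-decoder with shape-matched feature reuse. A shape-bookkeeping sketch (the `conv_stub`/`deconv_stub` functions are pooling/upsampling stand-ins, not the learned layers of the first network):

```python
import numpy as np

def conv_stub(x):
    # Stand-in for one convolutional layer (stride-2 average pooling);
    # the real first network would use learned convolutions.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def deconv_stub(x):
    # Stand-in for one deconvolution layer (2x nearest-neighbour upsample).
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def encoder_decoder(image, P=3):
    feats, x = [], image
    for _ in range(P):                 # P consecutive conv layers
        x = conv_stub(x)
        feats.append(x)                # keep all P feature maps
    y = deconv_stub(feats[-1])         # 1st deconv: feature of the P-th conv
    for i in range(P - 2, -1, -1):     # each later deconv also receives the
        y = deconv_stub(y + feats[i])  # skip feature of its matched conv layer
    return y

out = encoder_decoder(np.ones((32, 32)), P=3)
```

Note how each deconvolution layer consumes both the previous deconvolution output and the feature map of its one-to-one matched convolutional layer, and the output recovers the input's spatial size.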
6. The method according to claim 5, wherein, in the case where the number of the third images is M, before the inputting of the third image into the P consecutive convolutional layers in the first network to obtain the P pieces of feature information output by the P convolutional layers, the method further comprises:
inputting a target image into the first network, and outputting a sixth image; wherein the target image is any one of the M third images;
acquiring a first noise distribution characteristic corresponding to the sixth image and a second noise distribution characteristic corresponding to the second image;
inputting the first noise distribution characteristic and the second noise distribution characteristic into a second network, and outputting to obtain a similarity value between the first noise distribution characteristic and the second noise distribution characteristic;
and under the condition that the similarity value is smaller than a preset threshold value, adjusting the network parameters of the first network according to the similarity value and the corresponding loss value thereof until the first network converges to obtain a trained first network.
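The training control flow of claim 6 can be illustrated with simplified stand-ins (the "noise distribution feature" is taken here to be a residual histogram and the "second network" is replaced by a histogram-intersection score; in the claimed method both are learned components):

```python
import numpy as np

def noise_feature(noisy, clean, bins=16):
    # Hypothetical "noise distribution feature": a normalized histogram
    # of the residual between a noisy image and its clean counterpart.
    hist, _ = np.histogram(noisy - clean, bins=bins, range=(-32.0, 32.0))
    return hist / hist.sum()

def similarity(f1, f2):
    # Stand-in for the second network: histogram intersection in [0, 1].
    return float(np.minimum(f1, f2).sum())

# One training-iteration check: compare the first network's output noise
# against the real sensor noise of the second image.
rng = np.random.default_rng(3)
clean = np.full((64, 64), 100.0)
second_image = clean + rng.normal(0.0, 8.0, clean.shape)  # real sensor noise
sixth_image = clean + rng.normal(0.0, 8.0, clean.shape)   # first-network output
s = similarity(noise_feature(sixth_image, clean), noise_feature(second_image, clean))
threshold = 0.9
needs_update = s < threshold  # parameters are adjusted only below the threshold
```

When the similarity value falls below the preset threshold, the first network's parameters would be adjusted from the corresponding loss and the loop repeated until convergence.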
7. A noise image generation device characterized by comprising:
the acquisition module is used for acquiring a first image and a second image; wherein the first image is a noise-free image and the second image is a noisy image acquired using a target image sensor;
a determination module for determining a noise index value corresponding to the target image sensor;
the noise adding module is used for adding noise to the first image according to the noise index value to obtain a third image;
and the adjusting module is used for adjusting the noise distribution of the third image according to the second image to obtain a noise image corresponding to the first image.
8. The apparatus of claim 7, wherein the determining module comprises:
a determining submodule for determining a target poisson noise index value and a target gaussian noise index value corresponding to the target image sensor;
and the calculating submodule is used for calculating a noise index value corresponding to the target image sensor according to the target Poisson noise index value and the target Gaussian noise index value.
9. The apparatus of claim 8, wherein the determination submodule comprises:
a first acquisition unit configured to acquire N fourth images; wherein the fourth images are images of a standard color chart acquired using the target image sensor at different sensitivities, N is a positive integer, and N is greater than or equal to 2;
the first calculation unit is used for traversing the N fourth images and respectively calculating a first pixel average value and a first pixel variance value of pixel points contained in each fourth image;
the second calculation unit is used for dividing the first pixel variance value by the first pixel average value to obtain a Poisson noise index value corresponding to each fourth image;
a first relationship determination unit configured to determine a first mapping relationship between the poisson noise index value and the sensitivity according to N sensitivities and N poisson noise index values corresponding to the N fourth images;
a first index determining unit, configured to determine, according to the first mapping relation, M Poisson noise index values corresponding to M target sensitivities as the target Poisson noise index values corresponding to the target image sensor; wherein the M target sensitivities are M sensitivities determined from a sensitivity interval corresponding to the target image sensor, M is a positive integer, and M is greater than or equal to 2.
10. The apparatus of claim 8, wherein the determination submodule comprises:
a second acquisition unit configured to acquire K fifth images; wherein the fifth images are images acquired using the target image sensor at different sensitivities, K is a positive integer, and K is greater than or equal to 2;
a third calculating unit, configured to traverse the K fifth images, calculate second pixel variance values of pixel points included in each fifth image, and use the second pixel variance values corresponding to the K fifth images as K gaussian noise index values corresponding to the K fifth images;
a second relationship determination unit, configured to determine, based on a maximum likelihood estimation algorithm, a second mapping relation between the Gaussian noise index value and the sensitivity according to the K sensitivities corresponding to the K fifth images and the K Gaussian noise index values;
and the second index determining unit is used for determining M Gaussian noise index values corresponding to the M target sensitivities according to the second mapping relation, and the M Gaussian noise index values are used as target Gaussian noise index values corresponding to the target image sensor.
11. The apparatus of claim 7, wherein the adjustment module comprises:
an image input submodule, configured to input the third image into P consecutive convolutional layers in a first network to obtain P pieces of feature information output by the P convolutional layers; wherein the first network is obtained by training according to the second image;
a feature input submodule, configured to input the feature information output by the P-th convolutional layer into P consecutive deconvolution layers in the first network and output a noise image corresponding to the first image;
wherein P is a positive integer greater than or equal to 2, the P convolutional layers correspond to the P deconvolution layers one to one, input information of a first deconvolution layer is first feature information output by a first convolutional layer and second feature information output by a second deconvolution layer, the first convolutional layer is any one of the P convolutional layers, the first deconvolution layer is the deconvolution layer corresponding to the first convolutional layer among the P deconvolution layers, and the first deconvolution layer is the deconvolution layer next to the second deconvolution layer.
12. The apparatus of claim 11, wherein in the case that the number of the third images is M, the adjusting module further comprises:
a target image processing submodule, configured to input a target image into the first network and output a sixth image before the third image is input into the P consecutive convolutional layers in the first network to obtain the P pieces of feature information output by the P convolutional layers; wherein the target image is any one of the M third images;
the obtaining submodule is used for obtaining a first noise distribution characteristic corresponding to the sixth image and a second noise distribution characteristic corresponding to the second image;
the characteristic processing submodule is used for inputting the first noise distribution characteristic and the second noise distribution characteristic into a second network and outputting to obtain a similarity value between the first noise distribution characteristic and the second noise distribution characteristic;
and the adjusting submodule is used for adjusting the network parameters of the first network according to the similarity value and the corresponding loss value thereof under the condition that the similarity value is smaller than a preset threshold value until the first network is converged to obtain a trained first network.
13. An electronic device comprising a processor, a memory, and a program or instructions stored on the memory and executable on the processor, the program or instructions when executed by the processor implementing the steps of the noise image generation method of any of claims 1-6.
14. A readable storage medium, characterized in that it stores thereon a program or instructions which, when executed by a processor, implement the steps of the noise image generation method according to any one of claims 1 to 6.
CN202110854815.6A 2021-07-28 2021-07-28 Noise image generation method, noise image generation device, electronic device, and storage medium Pending CN113538223A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110854815.6A CN113538223A (en) 2021-07-28 2021-07-28 Noise image generation method, noise image generation device, electronic device, and storage medium
PCT/CN2022/107258 WO2023005818A1 (en) 2021-07-28 2022-07-22 Noise image generation method and apparatus, electronic device, and storage medium


Publications (1)

Publication Number Publication Date
CN113538223A true CN113538223A (en) 2021-10-22

Family ID=78121145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110854815.6A Pending CN113538223A (en) 2021-07-28 2021-07-28 Noise image generation method, noise image generation device, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN113538223A (en)
WO (1) WO2023005818A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114494080A (en) * 2022-03-28 2022-05-13 英特灵达信息技术(深圳)有限公司 Image generation method and device, electronic equipment and storage medium
WO2023005818A1 (en) * 2021-07-28 2023-02-02 维沃移动通信有限公司 Noise image generation method and apparatus, electronic device, and storage medium
CN116051449A (en) * 2022-08-11 2023-05-02 荣耀终端有限公司 Image noise estimation method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636746A (en) * 2018-11-30 2019-04-16 上海皓桦科技股份有限公司 Picture noise removes system, method and apparatus
CN111861930A (en) * 2020-07-27 2020-10-30 京东方科技集团股份有限公司 Image denoising method and device, electronic equipment and image hyper-resolution denoising method
CN112785660A (en) * 2019-11-07 2021-05-11 多普顿股份公司 Method and apparatus for steganographic processing and compression of image data
CN112950503A (en) * 2021-02-26 2021-06-11 北京小米松果电子有限公司 Training sample generation method and device and truth value image generation method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538223A (en) * 2021-07-28 2021-10-22 维沃移动通信有限公司 Noise image generation method, noise image generation device, electronic device, and storage medium



Also Published As

Publication number Publication date
WO2023005818A1 (en) 2023-02-02

Similar Documents

Publication Publication Date Title
CN115442515B (en) Image processing method and apparatus
CN113538223A (en) Noise image generation method, noise image generation device, electronic device, and storage medium
US8780225B2 (en) Use of noise-optimized selection criteria to calculate scene white points
US9344690B2 (en) Image demosaicing
CN105323497A (en) Constant bracket for high dynamic range (cHDR) operations
CN112384946A (en) Image dead pixel detection method and device
CN111429371B (en) Image processing method and device and terminal equipment
CN114022732A (en) Extremely dark light object detection method based on RAW image
CN113132695A (en) Lens shadow correction method and device and electronic equipment
CN111145151A (en) Motion area determination method and electronic equipment
CN110717864A (en) Image enhancement method and device, terminal equipment and computer readable medium
CN113962859A (en) Panorama generation method, device, equipment and medium
CN113132639A (en) Image processing method and device, electronic equipment and storage medium
WO2023011280A1 (en) Image noise degree estimation method and apparatus, and electronic device and storage medium
CN111833262A (en) Image noise reduction method and device and electronic equipment
WO2023001110A1 (en) Neural network training method and apparatus, and electronic device
JP2006140952A (en) Image processor and image processing method
CN111402153B (en) Image processing method and system
CN113822812A (en) Image noise reduction method and electronic equipment
CN113645419A (en) Image processing method and device, electronic equipment and computer readable storage medium
CN114764771A (en) Image quality evaluation method, device, equipment, chip and storage medium
CN111476740A (en) Image processing method, image processing apparatus, storage medium, and electronic device
US20240119573A1 (en) Image processing apparatus, image processing method and computer-readable storage medium for direct memory accesses
CN116051931A (en) Model training method, image classification device and electronic equipment
CN117495746A (en) Generalized image enhancement method based on multi-scale deformable convolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination