CN112598597A - Training method of noise reduction model and related device - Google Patents

Training method of noise reduction model and related device

Info

Publication number
CN112598597A
CN112598597A
Authority
CN
China
Prior art keywords
image
sub
pixels
noise reduction
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011565423.XA
Other languages
Chinese (zh)
Inventor
李松江
黄涛
贾旭
刘健庄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority: CN202011565423.XA
Publication: CN112598597A
PCT application: PCT/CN2021/131656 (published as WO2022134971A1)
Legal status: Pending

Classifications

    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Abstract

The application discloses a training method of a noise reduction model, applied in the field of artificial intelligence and in particular to computer vision. The method comprises the following steps: acquiring an image sample to be denoised from a sample set, where the sample set comprises a plurality of image samples to be denoised; performing a first random down-sampling process and a second random down-sampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image, respectively; inputting the first sub-image into a noise reduction model to obtain a first target image; obtaining a first loss function from the first target image and the second sub-image, where the first loss function indicates the difference between the first target image and the second sub-image; and training the noise reduction model at least according to the first loss function to obtain a target noise reduction model. In this scheme, the noise reduction model can be trained from noisy images alone, without obtaining clean images corresponding to the noisy images, which reduces the difficulty of training the noise reduction model.

Description

Training method of noise reduction model and related device
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a training method of a noise reduction model and a related device.
Background
Artificial intelligence (AI) refers to the theories, methods, technologies, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is the branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence research covers the design principles and implementation methods of various intelligent machines, so that machines acquire the capabilities of perception, reasoning, and decision-making.
At present, during image generation or transmission, images are easily disturbed by the imaging device or by external environmental noise, producing noise that degrades image quality. An image containing such noise is generally referred to as a noisy image or a noise-containing image. To improve the quality of such images, image denoising methods have been developed. Image denoising applies an algorithm to remove the noise from an observed noisy image, retain image details, and reconstruct the corresponding clean image. At present, image denoising has important application value in fields such as mobile-phone photography, high-definition television, surveillance equipment, satellite imagery, and medical imaging.
In the related art, images are mainly denoised by learning-based denoising models (e.g., convolutional neural networks). By training a noise reduction model on a large amount of training data, a model with good noise reduction performance can be obtained, thereby realizing image denoising.
In general, the noise reduction effect of the noise reduction model depends largely on the training data used to train it, i.e., noisy image-clean image pairs. However, in the field of image processing, acquiring noisy image-clean image pairs is often difficult. Therefore, a method is needed that can train a noise reduction model without relying on noisy image-clean image pairs.
Disclosure of Invention
The embodiments of the present application provide a training method for a noise reduction model and a related device. A pair of sub-images is obtained by performing two random down-sampling processes on an image sample to be denoised; one sub-image of the pair is used as the input to the noise reduction model, and the other is used as the expected output of the noise reduction model, thereby training the model. In this way, the noise reduction model can be trained from noisy images alone, without obtaining clean images corresponding to the noisy images, which reduces the difficulty of training the noise reduction model.
The first aspect of the present application provides a training method for a noise reduction model, which can be applied in scenarios such as terminal photographing, medical imaging, or surveillance video, and is used to realize image denoising. The method includes the following steps. The terminal obtains an image sample to be denoised from a sample set, where the sample set includes a plurality of image samples to be denoised. The terminal performs a first random downsampling process and a second random downsampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image, respectively, where the resolution of the first sub-image is the same as that of the second sub-image. That is, the terminal performs two random downsampling processes on the same image sample to be denoised to obtain the first sub-image and the second sub-image. The first sub-image and the second sub-image are different images; they have the same resolution, and both have a lower resolution than the image sample to be denoised. Random downsampling refers to randomly sampling pixels in the image sample to be denoised according to a set sampling mode, and stitching the sampled pixels into a sub-image whose resolution is smaller than that of the image sample. The first and second random downsampling processes sample pixels in the same manner. However, because they are two independent random sampling processes, the first sub-image obtained by the first process and the second sub-image obtained by the second process are, with high probability, different.
The terminal inputs the first sub-image into a noise reduction model to obtain a first target image, where the noise reduction model includes, but is not limited to, a learning-based model such as a convolutional neural network or a noise reduction model based on sparse feature expression. The terminal obtains a first loss function from the first target image and the second sub-image, where the first loss function indicates the difference between the first target image and the second sub-image. Finally, the terminal trains the noise reduction model at least according to the first loss function until the model training condition is met, obtaining a target noise reduction model. The model training condition means that the first loss function obtained by the terminal is smaller than a preset threshold.
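The first loss can be any measure of the difference between the first target image and the second sub-image. A minimal sketch, assuming mean squared error as the distance (the excerpt does not fix the exact form, and `first_loss` is an illustrative name):

```python
def first_loss(first_target, second_sub):
    """Mean squared error between the denoiser's output on the first
    sub-image (the first target image) and the second sub-image.
    Images are plain lists of rows; MSE is an assumed choice of distance."""
    h, w = len(first_target), len(first_target[0])
    return sum((first_target[i][j] - second_sub[i][j]) ** 2
               for i in range(h) for j in range(w)) / (h * w)

# Training would minimise this value until it falls below a preset threshold.
```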
In this scheme, two random down-sampling processes are performed on the image sample to be denoised to obtain a pair of sub-images; one sub-image of the pair is used as the input to the noise reduction model, and the other is used as the expected output of the noise reduction model, thereby training the model. The noise reduction model can thus be trained from noisy images alone, without obtaining clean images corresponding to the noisy images, which reduces the difficulty of training the noise reduction model.
Optionally, in a possible implementation, the terminal's performing of the first random downsampling process and the second random downsampling process on the image sample to be denoised, to obtain the first sub-image and the second sub-image respectively, includes the following. The terminal divides the image sample to be denoised into M image units, where each of the M image units includes n × n pixels. The terminal performs a first random selection of a pixel in each of the M image units, obtaining M first pixels. The terminal then stitches the M first pixels together, according to the positions in the image to be denoised of the image units they were drawn from, to obtain the first sub-image.
Similarly, the terminal performs a second random selection of a pixel in each of the M image units, obtaining M second pixels, and stitches the M second pixels together, according to the positions of their image units in the image to be denoised, to obtain the second sub-image.
Optionally, in a possible implementation, the terminal's performing of the second random selection of pixels in each of the M image units includes the following. The terminal obtains the n × n - 1 target pixels in each of the M image units, where the n × n - 1 target pixels are the pixels that were not selected when the first random selection was performed in that image unit. The terminal then performs a random selection among the n × n - 1 target pixels of each of the M image units to obtain M second pixels, where the M second pixels are different from the M first pixels.
That is, when performing the second random selection of a pixel for each image unit, the terminal first determines the pixels in that unit that were not selected as the first pixel, i.e., the n × n - 1 target pixels. The terminal then randomly selects one pixel from these n × n - 1 target pixels as the second pixel. Thus, the first pixel and the second pixel selected in each image unit are necessarily different pixels. For example, each pixel in the first sub-image and the second sub-image shown in Fig. 5 is a different pixel.
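The paired selection described above can be sketched in plain Python (names are illustrative; a real implementation would operate on image tensors, and the adjacency-constrained variant described later is omitted here):

```python
import random

def paired_downsample(img, n=2, seed=None):
    """Divide img (H x W list of rows, H and W divisible by n) into n x n
    image units and draw two *different* pixels from each unit, assembling
    a first and a second sub-image of resolution (H/n) x (W/n)."""
    rng = random.Random(seed)
    rows, cols = len(img) // n, len(img[0]) // n
    sub1 = [[0] * cols for _ in range(rows)]
    sub2 = [[0] * cols for _ in range(rows)]
    for ci in range(rows):
        for cj in range(cols):
            unit = [(ci * n + di, cj * n + dj)
                    for di in range(n) for dj in range(n)]
            p1 = rng.choice(unit)                   # first random selection
            targets = [p for p in unit if p != p1]  # the n*n - 1 target pixels
            p2 = rng.choice(targets)                # second selection, p2 != p1
            sub1[ci][cj] = img[p1[0]][p1[1]]
            sub2[ci][cj] = img[p2[0]][p2[1]]
    return sub1, sub2
```

Each sub-image keeps the spatial layout of the image units, so it is a downscaled version of the noisy sample, and the two sub-images never share a pixel within any unit.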
By ensuring that the downsampling of the second sub-image never samples the same pixels as the first sub-image, the first and second sub-images obtained by random downsampling are guaranteed to be two completely independent images, i.e., there is no strong correlation between them. This better simulates two images of the same scene with independent random noise, which in turn improves the training effect of the noise reduction model.
Optionally, in a possible implementation, the terminal's performing of the random selection among the n × n - 1 target pixels in each of the M image units includes: for each of the M image units, the terminal randomly selects, from the n × n - 1 target pixels, a second pixel adjacent to the pixel chosen in the first random selection, obtaining M second pixels, where each of the M second pixels is adjacent to its corresponding first pixel.
That is, for the n × n - 1 target pixels in each image unit, the terminal may first determine which of them are adjacent to the first pixel chosen during the first random selection. The terminal then randomly selects one of these adjacent target pixels as the second pixel. Thus, the first pixel and the second pixel selected in each image unit are necessarily adjacent pixels.
By selecting a target pixel adjacent to the first pixel as the second pixel, a higher similarity between the second sub-image and the first sub-image is guaranteed; that is, the pair better simulates two images of the same scene with independent random noise, which further improves the training effect of the noise reduction model.
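The adjacency constraint amounts to filtering the target pixels of each image unit down to those 4-adjacent to the first pixel before the random draw. A sketch, with pixels as (row, col) tuples and `select_adjacent_second` as an illustrative name:

```python
import random

def select_adjacent_second(unit, first, rng=None):
    """From the target pixels of one image unit (all pixels except `first`),
    randomly pick one that is 4-adjacent (Manhattan distance 1) to `first`."""
    rng = rng or random.Random()
    neighbours = [p for p in unit
                  if p != first
                  and abs(p[0] - first[0]) + abs(p[1] - first[1]) == 1]
    return rng.choice(neighbours)
```

For a 2 × 2 image unit this leaves exactly the two pixels sharing an edge with the first pixel, so the diagonal pixel is never chosen.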
Optionally, in a possible implementation, the method further includes the following. The terminal inputs the image sample to be denoised into the noise reduction model to obtain a second target image. The terminal downsamples the second target image based on the pixel sampling positions used in the first random downsampling process to obtain a first sub-target image. That is, each pair of corresponding pixels in the first sub-target image and the first sub-image occupies the same position in its source image before downsampling. Likewise, the terminal downsamples the second target image based on the pixel sampling positions used in the second random downsampling process to obtain a second sub-target image. The terminal obtains a second loss function according to the first target image, the second sub-image, the first sub-target image, and the second sub-target image, and trains the noise reduction model at least according to the first loss function and the second loss function.
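Re-sampling the second target image at the recorded sampling positions can be sketched as follows; it assumes the down-sampling step also recorded the chosen coordinates as a grid of (row, col) tuples, which the excerpt implies but does not spell out:

```python
def downsample_at(img, positions):
    """Sample `img` at previously recorded coordinates.  `positions` is an
    (H/n) x (W/n) grid of (row, col) tuples saved from the first or second
    random down-sampling, so the result aligns pixel-for-pixel with the
    corresponding sub-image."""
    return [[img[r][c] for (r, c) in row] for row in positions]
```

Applying this to the second target image with the first and second position grids yields the first and second sub-target images, respectively.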
In this scheme, introducing the second loss function further constrains the noise reduction model and prevents it from producing an over-smoothed image due to the slight positional mismatch between the pixels of the first sub-image and those of the second sub-image within the image sample to be denoised. In other words, it protects the high-frequency detail information in the noisy image from being removed by the noise reduction model, ensuring that the denoised image still retains high-frequency detail.
Optionally, in a possible implementation manner, the training, by the terminal, of the noise reduction model according to at least the first loss function and the second loss function includes: and the terminal trains the noise reduction model at least according to the first loss function, the first weight coefficient, the second loss function and the second weight coefficient. Wherein the first weight coefficient is used to indicate a weight of the first loss function, and the second weight coefficient is used to indicate a weight of the second loss function.
The larger the second weight coefficient, the weaker the noise reduction, leaving more noise but losing fewer high-frequency image details; the smaller the second weight coefficient, the stronger the noise reduction, leaving less noise but losing more high-frequency details. Therefore, in practical applications, the first weight coefficient and/or the second weight coefficient can be adjusted to balance the noise reduction strength against the loss of high-frequency detail.
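The weighted combination can be sketched as follows. The exact form of the second loss is not given in this excerpt; the regularization term below, which penalizes the gap between the output residual and the sub-target residual, is an assumption modelled on the description, and all function names are illustrative:

```python
def mse(a, b):
    """Mean squared error between two equally sized images (lists of rows)."""
    h, w = len(a), len(a[0])
    return sum((a[i][j] - b[i][j]) ** 2 for i in range(h) for j in range(w)) / (h * w)

def total_loss(first_target, second_sub, sub_target1, sub_target2, w1=1.0, w2=1.0):
    """w1 * first loss + w2 * second loss.  A larger w2 weakens noise
    reduction but preserves more high-frequency detail; a smaller w2
    does the opposite."""
    l1 = mse(first_target, second_sub)
    # Assumed second loss: gap between (output - second sub-image) and
    # (first sub-target - second sub-target).
    residual = [[(first_target[i][j] - second_sub[i][j])
                 - (sub_target1[i][j] - sub_target2[i][j])
                 for j in range(len(first_target[0]))]
                for i in range(len(first_target))]
    l2 = mse(residual, [[0] * len(residual[0]) for _ in residual])
    return w1 * l1 + w2 * l2
```

Setting `w2=0` recovers training on the first loss alone, which makes the trade-off between the two terms easy to explore.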
Optionally, in a possible implementation manner, the noise reduction model includes a learning-based noise reduction model such as a convolutional neural network or a sparse feature expression-based noise reduction model.
A second aspect of the present application provides an image denoising method, including: acquiring an image to be denoised; and inputting the image to be denoised into a target denoising model to obtain a denoised image. The target noise reduction model is obtained by training a noise reduction model at least based on a first loss function, the first loss function is obtained based on a first target image and a second sub-image, the first loss function is used for indicating the difference between the first target image and the second sub-image, the first target image is obtained by inputting a first sub-image into the noise reduction model, the first sub-image and the second sub-image are obtained by respectively performing first random down-sampling processing and second random down-sampling processing on an image sample to be subjected to noise reduction, and the resolutions of the first sub-image and the second sub-image are the same.
Optionally, in a possible implementation manner, the first sub-image is obtained according to M first pixels, where the M first pixels are obtained by performing first random selection of pixels in each image unit of the M image units after dividing the image sample to be denoised into M image units; the second sub-image is obtained according to M second pixels, and the M second pixels are obtained by performing second random selection of pixels in each image unit of the M image units after dividing the image sample to be denoised into the M image units.
Optionally, in a possible implementation manner, the M second pixels are obtained by performing random selection of pixels in n × n-1 target pixels in each of the M image units, where the n × n-1 target pixels are pixels that are not selected in the first random selection of pixels in each image unit, and the M second pixels are different from the M first pixels.
Optionally, in a possible implementation manner, the M second pixels are obtained by randomly selecting one second pixel adjacent to the pixel selected at the time of the first random selection from n × n-1 target pixels in each of the M image units, and each of the M second pixels is adjacent to the corresponding first pixel.
Optionally, in a possible implementation manner, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function, the second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image, the first sub-target image is obtained by performing downsampling processing on the second target image based on the sampling position of the pixel in the first random downsampling processing, the second sub-target image is obtained by performing downsampling processing on the second target image based on the sampling position of the pixel in the second random downsampling processing, and the second target image is obtained by inputting the image sample to be noise reduced into the noise reduction model.
Optionally, in a possible implementation, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function, the first weight coefficient, the second loss function, and the second weight coefficient. The first weight coefficient indicates the weight of the first loss function, and the second weight coefficient indicates the weight of the second loss function.
Optionally, in a possible implementation, the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
A third aspect of the present application provides a model training apparatus, comprising: an acquisition unit and a processing unit. The acquiring unit is used for acquiring image samples to be subjected to noise reduction from a sample set, wherein the sample set comprises a plurality of image samples to be subjected to noise reduction; the processing unit is used for performing first random downsampling processing and second random downsampling processing on the image sample to be denoised to respectively obtain a first sub-image and a second sub-image, and the resolution of the first sub-image is the same as that of the second sub-image; the processing unit is further configured to input the first sub-image into a noise reduction model to obtain a first target image; the processing unit is further configured to obtain a first loss function from the first target image and the second sub-image, where the first loss function is used to indicate a difference between the first target image and the second sub-image; the processing unit is further configured to train the noise reduction model at least according to the first loss function, so as to obtain a target noise reduction model.
Optionally, in a possible implementation manner, the processing unit is further configured to: dividing the image sample to be denoised into M image units, wherein each image unit in the M image units comprises n x n pixels; performing a first random selection of pixels in each of the M image units to obtain M first pixels, and obtaining the first sub-image according to the M first pixels; and performing second random selection of pixels in each of the M image units to obtain M second pixels, and obtaining the second sub-image according to the M second pixels.
Optionally, in a possible implementation manner, the obtaining unit is further configured to obtain n × n-1 target pixels in each of the M image units, where the n × n-1 target pixels are pixels that are not selected in each image unit when performing the first random selection of the pixels; the processing unit is further configured to perform random selection of pixels in n × n-1 target pixels in each of the M image units to obtain M second pixels, where the M second pixels are different from the M first pixels.
Optionally, in a possible implementation manner, the processing unit is further configured to randomly select, from n × n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection to obtain M second pixels, where each of the M second pixels is adjacent to a corresponding first pixel.
Optionally, in a possible implementation manner, the processing unit is further configured to: inputting the image sample to be denoised into the denoising model to obtain a second target image; based on the sampling position of the pixel in the first random downsampling processing, downsampling the second target image to obtain a first sub-target image; based on the sampling position of the pixel in the second random downsampling processing, downsampling the second target image to obtain a second sub-target image; acquiring a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image; training the noise reduction model according to at least the first loss function and the second loss function.
Optionally, in a possible implementation manner, the processing unit is further configured to train the noise reduction model according to at least the first loss function, the first weight coefficient, the second loss function, and the second weight coefficient; wherein the first weight coefficient is used to indicate a weight of the first loss function, and the second weight coefficient is used to indicate a weight of the second loss function.
Optionally, in a possible implementation, the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
The fourth aspect of the present application provides an image noise reduction device, comprising: an acquisition unit and a processing unit. The acquisition unit is used for acquiring an image to be denoised; the processing unit is used for inputting the image to be denoised into a target denoising model to obtain a denoised image; the target noise reduction model is obtained by training a noise reduction model at least based on a first loss function, the first loss function is obtained based on a first target image and a second sub-image, the first loss function is used for indicating the difference between the first target image and the second sub-image, the first target image is obtained by inputting a first sub-image into the noise reduction model, the first sub-image and the second sub-image are obtained by respectively performing first random down-sampling processing and second random down-sampling processing on an image sample to be subjected to noise reduction, and the resolutions of the first sub-image and the second sub-image are the same.
Optionally, in a possible implementation manner, the first sub-image is obtained according to M first pixels, where the M first pixels are obtained by performing first random selection of pixels in each image unit of the M image units after dividing the image sample to be denoised into M image units; the second sub-image is obtained according to M second pixels, and the M second pixels are obtained by performing second random selection of pixels in each image unit of the M image units after dividing the image sample to be denoised into the M image units.
Optionally, in a possible implementation manner, the M second pixels are obtained by performing random selection of pixels in n × n-1 target pixels in each of the M image units, where the n × n-1 target pixels are pixels that are not selected in the first random selection of pixels in each image unit, and the M second pixels are different from the M first pixels.
Optionally, in a possible implementation manner, the M second pixels are obtained by randomly selecting one second pixel adjacent to the pixel selected at the time of the first random selection from n × n-1 target pixels in each of the M image units, and each of the M second pixels is adjacent to the corresponding first pixel.
Optionally, in a possible implementation manner, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function, the second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image, the first sub-target image is obtained by performing downsampling processing on the second target image based on the sampling position of the pixel in the first random downsampling processing, the second sub-target image is obtained by performing downsampling processing on the second target image based on the sampling position of the pixel in the second random downsampling processing, and the second target image is obtained by inputting the image sample to be noise reduced into the noise reduction model.
Optionally, in a possible implementation, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function, the first weight coefficient, the second loss function, and the second weight coefficient; the first weight coefficient indicates the weight of the first loss function, and the second weight coefficient indicates the weight of the second loss function.
Optionally, in a possible implementation, the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
A fifth aspect of the present application provides a model training apparatus, which may include a processor and a memory coupled to the processor, the memory storing program instructions that, when executed by the processor, implement the method of the first aspect. For the steps performed by the processor in each possible implementation of the first aspect, reference may be made to the first aspect; details are not repeated here.
A sixth aspect of the present application provides an image noise reduction apparatus, which may include a processor and a memory coupled to the processor, the memory storing program instructions that, when executed by the processor, implement the method of the second aspect. For the steps performed by the processor in each possible implementation of the second aspect, reference may be made to the second aspect; details are not repeated here.
A seventh aspect of the present application provides a computer-readable storage medium having stored thereon a computer program which, when run on a computer, causes the computer to perform the method of the first aspect described above.
An eighth aspect of the present application provides a computer-readable storage medium having stored thereon a computer program which, when run on a computer, causes the computer to perform the method of the second aspect described above.
A ninth aspect of the present application provides circuitry comprising processing circuitry configured to perform the method of the first or second aspect.
A tenth aspect of the present application provides a computer program which, when run on a computer, causes the computer to perform the method of the first or second aspect described above.
An eleventh aspect of the present application provides a chip system, which includes a processor configured to enable a server or the above-described apparatus to implement the functions involved in the above aspects, for example, transmitting or processing the data and/or information involved in the above methods. In one possible design, the chip system further includes a memory for storing the program instructions and data necessary for the server or the communication device. The chip system may consist of a chip, or may include a chip and other discrete devices.
Drawings
FIG. 1 is a schematic structural diagram of an artificial intelligence body framework according to an embodiment of the present application;
FIG. 2a is a schematic diagram of an image processing system according to an embodiment of the present application;
FIG. 2b is a schematic diagram of another image processing system according to an embodiment of the present application;
FIG. 2c is a schematic diagram of an apparatus related to image processing according to an embodiment of the present application;
FIG. 3a is a schematic diagram of the architecture of a system 100 according to an embodiment of the present application;
FIG. 3b is a schematic diagram of image noise reduction according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of a training method of a noise reduction model according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a random downsampling process according to an embodiment of the present application;
FIG. 6 is a schematic diagram of another random downsampling process according to an embodiment of the present application;
FIG. 7 is a schematic diagram of another random downsampling process according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a training process of a noise reduction model according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a training process of another noise reduction model according to an embodiment of the present application;
FIG. 10 is a schematic diagram of an experimental flow for determining the noise reduction effect of a noise reduction model according to an embodiment of the present application;
FIG. 11 is a schematic diagram comparing the noise reduction effects of various noise reduction methods according to an embodiment of the present application;
FIG. 12 is a schematic diagram of another experimental flow for determining the noise reduction effect of a noise reduction model according to an embodiment of the present application;
FIG. 13 is a schematic diagram comparing the noise reduction indexes of various noise reduction methods according to an embodiment of the present application;
FIG. 14 is a schematic diagram comparing the noise reduction effects of various noise reduction methods according to an embodiment of the present application;
FIG. 15 is a schematic flowchart of an image noise reduction method according to an embodiment of the present application;
FIG. 16 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application;
FIG. 17 is a schematic structural diagram of an image noise reduction apparatus according to an embodiment of the present application;
FIG. 18 is a schematic structural diagram of an execution device according to an embodiment of the present application;
FIG. 19 is a schematic structural diagram of a training apparatus according to an embodiment of the present application;
FIG. 20 is a schematic structural diagram of a chip according to an embodiment of the present application.
Detailed Description
The embodiments of the present application will be described below with reference to the drawings. The terminology used in the description of the embodiments herein is for the purpose of describing particular embodiments only and is not intended to limit the application.
Embodiments of the present application are described below with reference to the accompanying drawings. As those skilled in the art will appreciate, with the development of technology and the emergence of new scenarios, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and are merely descriptive of the various embodiments of the application and how objects of the same nature can be distinguished. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The general workflow of an artificial intelligence system is described first. Referring to FIG. 1, which shows a schematic structural diagram of an artificial intelligence body framework, the framework is explained below from two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis). The "intelligent information chain" reflects a series of processes from data acquisition onward, for example, the general processes of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision making, and intelligent execution and output. In this process, the data undergoes a "data-information-knowledge-wisdom" refinement process. The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure and information (providing and processing technologies) up through the industrial ecology of the system.
(1) Infrastructure
The infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and provides support through a base platform. The infrastructure communicates with the outside through sensors; computing power is provided by intelligent chips (hardware acceleration chips such as CPUs, NPUs, GPUs, ASICs, and FPGAs); the base platform includes related platform guarantees and support such as a distributed computing framework and networks, and may include cloud storage and computing, interconnection networks, and the like. For example, sensors communicate with the outside to acquire data, and the data is provided to intelligent chips in a distributed computing system provided by the base platform for computation.
(2) Data
Data at the upper level of the infrastructure is used to represent the data source for the field of artificial intelligence. The data relates to graphs, images, voice and texts, and also relates to the data of the Internet of things of traditional equipment, including service data of the existing system and sensing data such as force, displacement, liquid level, temperature, humidity and the like.
(3) Data processing
Data processing typically includes data training, machine learning, deep learning, searching, reasoning, decision making, and the like.
The machine learning and the deep learning can perform symbolized and formalized intelligent information modeling, extraction, preprocessing, training and the like on data.
Inference refers to the process of simulating an intelligent human reasoning mode in a computer or intelligent system, in which the machine uses formalized information to think about and solve problems according to an inference control strategy; a typical function is searching and matching.
The decision-making refers to a process of making a decision after reasoning intelligent information, and generally provides functions of classification, sequencing, prediction and the like.
(4) General capabilities
After the above-mentioned data processing, further based on the result of the data processing, some general capabilities may be formed, such as algorithms or a general system, e.g. translation, analysis of text, computer vision processing, speech recognition, recognition of images, etc.
(5) Intelligent product and industrial application
Intelligent products and industry applications refer to the products and applications of the artificial intelligence system in various fields; they are the encapsulation of the overall artificial intelligence solution, commercializing intelligent information decision making and realizing practical applications. The application fields mainly include: intelligent terminals, intelligent transportation, intelligent medical treatment, autonomous driving, safe cities, and the like.
Several application scenarios of the present application are presented next.
Fig. 2a is an image processing system provided in an embodiment of the present application, where the image processing system includes a user device and a data processing device. The user device includes intelligent terminals such as a mobile phone, a personal computer, or an information processing center. The user device is the initiating end of the image processing: as the initiator of an image enhancement request, the user usually initiates the request through the user device.
The data processing device may be a device or a server having a data processing function, such as a cloud server, a network server, an application server, or a management server. The data processing device receives an image enhancement request from the intelligent terminal through an interactive interface, and then performs image processing by means of machine learning, deep learning, searching, reasoning, decision making, and the like, using a memory for storing data and a processor for data processing. The memory in the data processing device may be a general term that includes local storage and a database storing historical data, which may reside on the data processing device or on other network servers.
In the image processing system shown in fig. 2a, the user device may receive an instruction from a user, for example, the user device may obtain an image input/selected by the user, and then initiate a request to the data processing device, so that the data processing device executes an image denoising application on the image obtained by the user device, thereby obtaining a corresponding processing result for the image. For example, the user device may obtain an image input by a user, and then initiate an image denoising request to the data processing device, so that the data processing device performs image denoising on the image, thereby obtaining a denoised image.
In fig. 2a, a data processing device may execute the training method of the noise reduction model according to the embodiment of the present application.
Fig. 2b is another image processing system according to an embodiment of the present application, in fig. 2b, a user device directly serves as a data processing device, and the user device can directly obtain an input from a user and directly perform processing by hardware of the user device itself, and a specific process is similar to that in fig. 2a, and reference may be made to the above description, which is not repeated herein.
In the image processing system shown in fig. 2b, the user device may receive an instruction from the user, for example, the user device may obtain an image selected by the user in the user device, and then execute the image processing application for the image by the user device itself, so as to obtain a corresponding processing result for the image.
In fig. 2b, the user equipment itself may execute the training method of the noise reduction model according to the embodiment of the present application.
Fig. 2c is a schematic diagram of a related apparatus for image processing provided in an embodiment of the present application.
The user device in fig. 2a and fig. 2b may specifically be the local device 301 or the local device 302 in fig. 2c, and the data processing device in fig. 2a may specifically be the execution device 210 in fig. 2c, where the data storage system 250 may store data to be processed of the execution device 210, and the data storage system 250 may be integrated on the execution device 210, or may be disposed on a cloud or other network server.
The processor in fig. 2a and 2b may perform data training/machine learning/deep learning through a neural network model or other models (e.g., models based on a support vector machine), and perform image processing application on the image using the model finally trained or learned by the data, so as to obtain a corresponding processing result.
Fig. 3a is a schematic diagram of an architecture of a system 100 according to an embodiment of the present application, in fig. 3a, an execution device 110 configures an input/output (I/O) interface 112 for data interaction with an external device, and a user may input data to the I/O interface 112 through a client device 140, where the input data may include: each task to be scheduled, the resources that can be invoked, and other parameters.
During the process that the execution device 110 preprocesses the input data or during the process that the calculation module 111 of the execution device 110 performs the calculation (for example, performs the function implementation of the neural network in the present application), the execution device 110 may call the data, the code, and the like in the data storage system 150 for corresponding processing, and may store the data, the instruction, and the like obtained by corresponding processing into the data storage system 150.
Finally, the I/O interface 112 returns the processing results to the client device 140 for presentation to the user.
It should be noted that the training device 120 may generate corresponding target models/rules based on different training data for different targets or different tasks, and the corresponding target models/rules may be used to achieve the targets or complete the tasks, so as to provide the user with the required results. Wherein the training data may be stored in the database 130 and derived from training samples collected by the data collection device 160.
In the case shown in fig. 3a, the user may manually give the input data, which may be operated through an interface provided by the I/O interface 112. Alternatively, the client device 140 may automatically send the input data to the I/O interface 112, and if the client device 140 is required to automatically send the input data to obtain authorization from the user, the user may set the corresponding permissions in the client device 140. The user can view the result output by the execution device 110 at the client device 140, and the specific presentation form can be display, sound, action, and the like. The client device 140 may also serve as a data collection terminal, collecting input data of the input I/O interface 112 and output results of the output I/O interface 112 as new sample data, and storing the new sample data in the database 130. Of course, the input data inputted to the I/O interface 112 and the output result outputted from the I/O interface 112 as shown in the figure may be directly stored in the database 130 as new sample data by the I/O interface 112 without being collected by the client device 140.
It should be noted that fig. 3a is only a schematic diagram of a system architecture provided in this embodiment of the present application, and the position relationship between the devices, modules, etc. shown in the diagram does not constitute any limitation, for example, in fig. 3a, the data storage system 150 is an external memory with respect to the execution device 110, and in other cases, the data storage system 150 may also be disposed in the execution device 110. As shown in fig. 3a, a neural network may be trained from the training device 120.
The embodiment of the application also provides a chip, which comprises the NPU. The chip may be provided in an execution device 110 as shown in fig. 3a to perform the calculation work of the calculation module 111. The chip may also be disposed in the training apparatus 120 as shown in fig. 3a to complete the training work of the training apparatus 120 and output the target model/rule.
The neural-network processing unit (NPU) is mounted as a coprocessor on a host central processing unit (CPU), which distributes tasks. The core portion of the NPU is an arithmetic circuit; a controller controls the arithmetic circuit to fetch data from a memory (weight memory or input memory) and perform operations.
In some implementations, the arithmetic circuit includes a plurality of processing units (PEs). In some implementations, the arithmetic circuit is a two-dimensional systolic array. The arithmetic circuit may also be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit is a general-purpose matrix processor.
For example, assume that there is an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit fetches the data corresponding to the matrix B from the weight memory and buffers the data on each PE in the arithmetic circuit. The arithmetic circuit takes the matrix A data from the input memory and carries out matrix operation with the matrix B, and partial results or final results of the obtained matrix are stored in an accumulator (accumulator).
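The accumulating matrix operation described above can be sketched in software as follows. This is an illustrative model only, not the actual PE-array hardware; the function name `matmul_accumulate` is assumed for illustration:

```python
def matmul_accumulate(A, B):
    """Multiply matrices A (m x k) and B (k x n); the inner loop
    accumulates partial products the way the accumulator in the text
    collects partial or final results from the arithmetic circuit."""
    m, k, n = len(A), len(B), len(B[0])
    C = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0  # accumulator for partial results
            for t in range(k):
                acc += A[i][t] * B[t][j]
            C[i][j] = acc
    return C
```

In the hardware described above, the inner accumulation would be distributed across the PEs rather than executed serially as here.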
The vector calculation unit may further process the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, magnitude comparison, and the like. For example, the vector computation unit may be used for network computation of the non-convolution/non-FC layer in a neural network, such as pooling (pooling), batch normalization (batch normalization), local response normalization (local response normalization), and the like.
In some implementations, the vector calculation unit can store the processed output vector to a unified buffer. For example, the vector calculation unit may apply a non-linear function to the output of the arithmetic circuit, such as a vector of accumulated values, to generate the activation value. In some implementations, the vector calculation unit generates normalized values, combined values, or both. In some implementations, the vector of processed outputs can be used as activation inputs to arithmetic circuitry, e.g., for use in subsequent layers in a neural network.
The unified memory is used for storing input data and output data.
A direct memory access controller (DMAC) carries input data from the external memory to the input memory and/or the unified memory, stores the weight data from the external memory in the weight memory, and stores data from the unified memory in the external memory.
And the Bus Interface Unit (BIU) is used for realizing interaction among the main CPU, the DMAC and the instruction fetch memory through a bus.
An instruction fetch buffer connected to the controller stores instructions used by the controller. The controller invokes the instructions cached in the instruction fetch buffer to control the working process of the operation accelerator.
Generally, the unified memory, the input memory, the weight memory, and the instruction fetch memory are On-Chip (On-Chip) memories, the external memory is a memory outside the NPU, and the external memory may be a double data rate synchronous dynamic random access memory (DDR SDRAM), a High Bandwidth Memory (HBM), or other readable and writable memories.
Since the embodiments of the present application relate to the application of a large number of neural networks, for the convenience of understanding, the related terms and related concepts such as neural networks related to the embodiments of the present application will be described below.
(1) Neural network
The neural network may be composed of neural units. A neural unit may be an operation unit that takes inputs xs and an intercept of 1, and whose output may be:

h_{W,b}(x) = f(W^T x + b) = f(∑_{s=1}^{n} W_s·x_s + b)

where s = 1, 2, ..., n, n is a natural number greater than 1, Ws is the weight of xs, and b is the bias of the neural unit. f is the activation function of the neural unit, which introduces a nonlinear characteristic into the neural network to convert the input signal of the neural unit into an output signal. The output signal of the activation function may serve as the input of the next convolutional layer. The activation function may be a sigmoid function. A neural network is a network formed by joining many such single neural units together, i.e., the output of one neural unit may be the input of another. The input of each neural unit may be connected with a local receptive field of the previous layer to extract the features of that local receptive field, and the local receptive field may be a region composed of several neural units.
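As a minimal sketch of the neural unit just described, assuming a sigmoid activation f as the text mentions (the function name is illustrative):

```python
import math

def neural_unit(xs, ws, b):
    """One neural unit: weighted sum of inputs xs with weights ws plus
    bias b, passed through a sigmoid activation function f."""
    s = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid: maps s to (0, 1)
```

With all weights and the bias at zero, the weighted sum is 0 and the sigmoid output is 0.5, illustrating how f converts the input signal into a bounded output signal.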
The operation of each layer in the neural network can be described by the mathematical expression y = a(W·x + b). From the physical level, the work of each layer in the neural network can be understood as completing the transformation from input space to output space (i.e., from the row space to the column space of the matrix) through five operations on the input space (the set of input vectors): 1. dimension raising/lowering; 2. zooming in/out; 3. rotation; 4. translation; 5. "bending". Operations 1, 2, and 3 are completed by W·x, operation 4 by +b, and operation 5 by a(). The word "space" is used here because the object being classified is not a single thing but a class of things, and space refers to the collection of all individuals of that class. W is a weight vector, each value in the vector representing the weight of one neuron in that layer of the neural network. The vector W determines the spatial transformation from the input space to the output space described above, i.e., the weight W of each layer controls how the space is transformed. The purpose of training the neural network is to finally obtain the weight matrices of all layers of the trained neural network (the weight matrices formed by the vectors W of many layers). Therefore, the training process of the neural network is essentially learning how to control the spatial transformation, and more specifically, learning the weight matrices.
Because it is desirable that the output of the neural network be as close as possible to the value actually desired to be predicted, the weight vector of each layer of the neural network can be updated by comparing the predicted value of the current network with the desired value and adjusting the weight vectors according to the difference between them (there is usually an initialization process before the first update, in which parameters are preconfigured for each layer of the neural network). It is therefore necessary to define in advance "how to compare the difference between the predicted value and the target value"; this is the role of loss functions or objective functions, which are important equations for measuring that difference. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, so the training of the neural network becomes the process of reducing this loss as much as possible.
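As a small illustration, the mean squared error is one common choice of such a loss function (a sketch; the name `mse_loss` is assumed here, and this is not the only loss the methods in this application could use):

```python
def mse_loss(predicted, target):
    """Mean squared error: the average of the squared differences
    between predicted values and target values. A loss of 0 means
    the prediction matches the target exactly; larger values mean
    a larger difference."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
```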
(2) Back propagation algorithm
During training, the neural network can adopt a back propagation (BP) algorithm to correct the parameters of the initial neural network model so that the reconstruction error loss of the model becomes smaller and smaller. Specifically, the input signal is propagated forward until the output produces an error loss, and the parameters of the initial neural network model are updated by back-propagating the error loss information so that the error loss converges. The back propagation algorithm is a back propagation movement dominated by the error loss, aiming to obtain the optimal parameters of the neural network model, such as the weight matrices.
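A minimal numeric sketch of this forward-then-backward cycle for a one-weight model y = w·x (illustrative only; the learning rate and step count are arbitrary assumptions, and real networks propagate gradients through many layers):

```python
def train_single_weight(x, y_true, w=0.0, lr=0.1, steps=100):
    """Fit y = w*x by gradient descent on the squared error.
    The forward pass computes the prediction; the backward pass
    propagates the error gradient dL/dw = 2*(w*x - y)*x back to
    the weight, which is then updated to reduce the loss."""
    for _ in range(steps):
        y_pred = w * x                     # forward pass
        grad = 2 * (y_pred - y_true) * x   # backward pass (dL/dw)
        w -= lr * grad                     # parameter update
    return w
```

After repeated updates the error loss converges and w approaches y_true / x, the optimal parameter for this toy model.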
(3) Image enhancement
Image enhancement refers to processing the brightness, color, contrast, saturation, dynamic range, etc. of an image to meet certain specific criteria. In brief, in the process of image processing, by purposefully emphasizing the overall or local characteristics of an image, an original unclear image is made clear or certain interesting characteristics are emphasized, the difference between different object characteristics in the image is enlarged, and the uninteresting characteristics are inhibited, so that the effects of improving the image quality and enriching the image information quantity are achieved, the image interpretation and identification effects can be enhanced, and the requirements of certain special analysis are met. Exemplary image enhancements may include, but are not limited to, image super-resolution reconstruction, image noise reduction, image defogging, image deblurring, and image contrast enhancement.
(4) Image noise reduction
An image noise reduction method applies an algorithm to remove noise from an observed noise image, retains the more important detail information in the image, and reconstructs a corresponding clean image. The reconstructed image looks clear and clean; image noise reduction can improve image quality and facilitate subsequent image processing procedures performed on the image, such as image classification or object recognition. In the image processing pipeline, image noise reduction is an important step. At present, many image noise reduction methods are applied in academia and industry, and image noise reduction is one of the research hotspots in the current image processing field. Referring to fig. 3b, fig. 3b is a schematic diagram of image noise reduction according to an embodiment of the present application. As shown in fig. 3b, noise in the image can be eliminated as much as possible through image noise reduction, thereby improving image quality.
The method provided by the present application is described below from the training side of the neural network and the application side of the neural network.
The training method of the neural network provided by the embodiment of the application relates to image processing, and particularly can be applied to data processing methods such as data training, machine learning and deep learning, and the training data (such as the image in the application) is subjected to symbolized and formalized intelligent information modeling, extraction, preprocessing, training and the like, and a trained image processing model is finally obtained; in addition, the image noise reduction method provided in the embodiment of the present application may use the trained noise reduction model to input data (e.g., an image to be processed in the present application) into the trained image processing model, so as to obtain output data (e.g., a target image in the present application). It should be noted that the training method of the noise reduction model and the image noise reduction method provided in the embodiment of the present application are inventions based on the same concept, and may also be understood as two parts in a system or two stages of an overall process: such as a model training phase and a model application phase.
At present, in the process of generating or transmitting images, images are often subject to interference from the imaging device or external environmental noise, producing and carrying noise that affects image quality. An image that contains noise due to such interference is generally referred to as a noise image or noisy image. To improve the quality of such images, image noise reduction methods have been developed. An image noise reduction method applies an algorithm to remove noise from an observed noise image, retain image details, and reconstruct a corresponding clean image. Extracting features from the noise image, removing the noise, and filling in details by means of image prior knowledge, image self-similarity, complementary information from multiple frames, and the like, so as to generate a corresponding high-quality image, is a common line of image noise reduction research. In industry, image noise reduction technology has important application value in fields such as mobile phone photography, high-definition television, monitoring equipment, satellite images, and medical images.
Generally, image noise reduction algorithms are largely classified into conventional filtering methods and learning-based methods. The conventional filtering methods include mean filtering, Gaussian filtering, bilateral filtering, and block-matching and 3D filtering (BM3D). Conventional filtering methods mainly exploit the similarity within images, reducing random noise through filtering and smoothing while preserving the high-frequency signals of the image. Generally, most image noise reduction methods with good effect combine multiple methods, which can both preserve edge information well and remove noise from the image. For example, a median filtering method and a wavelet filtering method may be combined to perform image filtering, achieving a better image noise reduction effect. In general, conventional noise reduction algorithms find rules in the noise image and then perform corresponding noise reduction processing. When no rule can be found in the noise image, conventional filtering methods perform poorly and can hardly achieve a good image noise reduction effect, which limits further improvement of image noise reduction performance.
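As an illustration of the simplest conventional filtering method mentioned above, a mean filter can be sketched as follows (a toy implementation for clarity, not an optimized filter; the boundary handling here is one possible choice):

```python
def mean_filter(image, radius=1):
    """Mean filtering: each output pixel is the average of its
    (2*radius+1) x (2*radius+1) neighborhood, clipped at the image
    border. This smooths random noise but also blurs edges — the
    basic trade-off of conventional filtering methods."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [image[ii][jj]
                    for ii in range(max(0, i - radius), min(h, i + radius + 1))
                    for jj in range(max(0, j - radius), min(w, j + radius + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out
```

A single bright noise pixel in a flat region is spread over its neighborhood and strongly attenuated, which is exactly the smoothing behavior the text describes.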
In this context, learning-based image noise reduction methods emerged. A learning-based image noise reduction method is a data-driven method: a noise reduction model learns the regularities of large-scale noise image-clean image pairs so as to achieve a good image noise reduction effect. In recent years, deep neural networks (DNNs) have rapidly surpassed conventional image noise reduction methods by virtue of their powerful learning capabilities, achieving great success. Image noise reduction methods based on deep neural networks can generate cleaner, clearer images with fewer artifacts, further promoting the development of image noise reduction technology.
Due to the continuous improvement of the computational power of the processor and the rapid development of the deep convolutional neural network, the effect of the image noise reduction network is greatly improved, and the application of the image noise reduction network based on deep learning is further promoted. The noise reduction effect of current Supervised Learning (Supervised Learning) based image noise reduction networks depends to a large extent on the training data used to train the network, i.e. the noise image-clean image pair. The noise image-clean image pair refers to a noise image and a clean noise-free image corresponding to the noise image. Briefly, a noise image-clean image pair is a pair of images in the same scene, and scene information included in the noise image and the clean image is the same except that the noise is not included in the clean image.
However, in the field of image processing, the acquisition of noisy image-clean image pairs tends to be difficult. For example, in the field of photography, images taken by cameras tend to be noisy due to interference from imaging equipment and external environmental noise. Although relatively clean images (i.e., images containing less noise) can be obtained by increasing the exposure time, multi-frame smoothing, and other techniques, the limitations are also apparent. Especially for dynamic scenes, the content of the images taken at different times changes, so that it is often difficult to obtain a noisy-clean image pair in a dynamic scene.
For another example, in the field of medical imaging, medical images are generated by instruments that emit radiation or electromagnetic waves acting on the human body. Due to the instrument itself, the resulting images often contain a large amount of random noise; that is, it is often difficult to obtain a clean, noise-free image from the instrument. In addition, because of the particularity of medical images, the imaging process can have adverse effects on the human body, so in practice it is often difficult to obtain multiple medical images of the same body part.
To address the difficulty of obtaining noise image-clean image pairs for noise reduction model training, the present embodiment proposes a method for training a noise reduction model using only noise images. The method performs random downsampling twice on an image sample to be denoised to obtain a pair of sub-image samples, takes one sub-image of the pair as the input of the noise reduction model, and takes the other sub-image of the pair as the expected output of the noise reduction model, thereby training the model. Training can thus be carried out on noise images alone, without obtaining the clean images corresponding to them, which reduces the difficulty of training the noise reduction model.
For ease of understanding, the following describes a device and a scenario to which the training method of the noise reduction model provided in the present embodiment is applied.
The training method of the noise reduction model provided by the embodiment of the application can be applied to a terminal, where the terminal is a device capable of performing model training. Illustratively, the terminal may be, for example, a personal computer (PC), a notebook computer, a server, a mobile phone, a tablet computer, a mobile internet device (MID), a wearable device, a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a wireless terminal in self driving, a wireless terminal in remote surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and the like. The terminal may be a device running the Android system, the iOS system, the Windows system, or other systems.
The training method of the noise reduction model provided by the embodiment can be applied to scenes such as terminal equipment photographing, video monitoring, medical image processing and the like which need to convert a noise image into a clean image.
With the popularization of portable terminal devices such as smart phones and tablet computers, photographing has become an indispensable function of smart devices. As the photographing function grows in importance, users place ever higher demands on the photographing quality of terminal devices. At present, the main weakness of terminal-device photography lies in image quality in dark scenes. In dark or nighttime scenes, because ambient light is too weak, photos taken by the terminal device exhibit very obvious noise, which greatly degrades image quality. It should be understood that in such scenes the noise in the captured images is caused by the poor ambient light itself, so it is difficult to obtain a clean image of the scene through the terminal device, that is, it is difficult to obtain a noise image-clean image pair.
A medical image is an image of a body part, such as a human organ, obtained by an imaging instrument; examples include Magnetic Resonance Imaging (MRI), Computed Tomography (CT), and the like. Noise is introduced during the imaging process of the instrument, and larger noise reduces the signal-to-noise ratio, seriously affecting subsequent image processing. Therefore, in medical image processing, performing noise reduction on the noise-containing images generated by the imaging instrument is an essential step.
Due to the particularity of medical images, often only a single image of a given part can be taken; it is difficult to capture multiple noise-containing images of the same scene at the same time, and acquiring a clean image is likewise difficult. Therefore, in the field of medical image denoising, noise image-clean image pairs are also hard to obtain, which easily hampers normal training of a noise reduction model.
Referring to fig. 4, fig. 4 is a schematic flowchart of a training method of a noise reduction model according to an embodiment of the present application. As shown in fig. 4, the training method of the noise reduction model provided in the embodiment of the present application includes the following steps 401 to 405.
Step 401, obtaining an image sample to be denoised from a sample set, where the sample set includes a plurality of image samples to be denoised.
When the training method of the noise reduction model provided in this embodiment is applied to a terminal photographing scene, that is, when a model for reducing noise of a photographed image needs to be obtained through training, the terminal may obtain a sample set including a plurality of image samples to be subjected to noise reduction. The image samples to be denoised in the sample set may be images acquired by the terminal in different scenes, and the images acquired by the terminal include noise.
When the training method of the noise reduction model provided in this embodiment is applied to a medical image scene, the image samples to be noise reduced included in the sample set are medical images of the same type, for example, a plurality of image samples to be noise reduced in the sample set are all MRI images, or a plurality of image samples to be noise reduced in the sample set are all CT images.
In general, a plurality of image samples to be denoised included in a sample set are images of the same type, and each of the image samples to be denoised includes noise. The terminal can obtain a corresponding sample set according to a scene to which the training method of the noise reduction model is applied, and train the noise reduction model based on the image samples to be noise reduced in the sample set.
Step 402, performing a first random down-sampling process and a second random down-sampling process on the image sample to be denoised to obtain a first sub-image and a second sub-image respectively, wherein the resolution of the first sub-image is the same as that of the second sub-image.
After obtaining the image sample to be denoised, the terminal respectively executes two times of random downsampling processing on the same image sample to be denoised to obtain a first sub-image and a second sub-image. The first sub-image and the second sub-image are different images, the resolution of the first sub-image is the same as that of the second sub-image, and the resolution of the first sub-image and the resolution of the second sub-image are lower than that of the image sample to be denoised.
The random downsampling process refers to randomly sampling pixels in the image sample to be denoised according to a set sampling scheme, and stitching the sampled pixels into a sub-image whose resolution is smaller than that of the image sample to be denoised. The first random downsampling process and the second random downsampling process perform pixel sampling in the same manner. However, since they are two independent random pixel sampling processes, the first sub-image obtained by the first process and the second sub-image obtained by the second process are, with high probability, different.
For ease of understanding, two ways of performing the random downsampling process provided in the present embodiment will be described below. It should be understood that, in practical applications, other random down-sampling processing manners may also be adopted, and the random down-sampling processing manner is not specifically limited herein.
The method comprises the steps of dividing an image sample to be denoised into a plurality of image units, and randomly sampling one pixel in each image unit to obtain a sub-image formed by the pixels obtained by sampling.
Illustratively, the terminal equally divides the image sample to be denoised into M image units, each of which includes n × n pixels. Then, the terminal performs a first random selection of pixels in each of said M image cells, i.e. randomly selects one pixel as a first pixel among n x n pixels of each image cell, thereby obtaining M first pixels. And according to the M first pixels obtained by random sampling, the terminal performs splicing on the M first pixels according to the positions of the image units corresponding to the M first pixels in the image to be denoised to obtain the first sub-image.
Similarly, the terminal performs a second random selection of pixels in each of said M image cells, i.e. again randomly selects one pixel as second pixel in the n × n pixels of each image cell, resulting in M second pixels. And according to the M second pixels obtained by random sampling, the terminal performs splicing on the M second pixels according to the positions of the image units corresponding to the M second pixels in the image to be denoised to obtain the second sub-image.
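To make the cell-based sampling concrete, the following is a minimal sketch of method one in Python/NumPy. The function name, the 2 × 2 cell size, and the toy 4 × 4 image are illustrative assumptions, not taken from the embodiment:

```python
import numpy as np

def random_downsample(img: np.ndarray, rng: np.random.Generator, n: int = 2) -> np.ndarray:
    """Split img (H, W) into n x n image units and keep one random pixel per unit.

    A sketch of "method one": the output has resolution (H/n, W/n), and each
    output pixel keeps the grid position of the unit it was sampled from.
    """
    h, w = img.shape
    assert h % n == 0 and w % n == 0, "image must divide evenly into units"
    # View the image as a (h//n, w//n) grid of units, each flattened to n*n pixels.
    cells = img.reshape(h // n, n, w // n, n).transpose(0, 2, 1, 3).reshape(h // n, w // n, n * n)
    # Independently pick one of the n*n pixels in every unit.
    idx = rng.integers(0, n * n, size=(h // n, w // n))
    return np.take_along_axis(cells, idx[..., None], axis=2)[..., 0]

rng = np.random.default_rng(0)
y = np.arange(16, dtype=np.float32).reshape(4, 4)  # toy 4x4 "noise image"
g1 = random_downsample(y, rng)                     # first sub-image
g2 = random_downsample(y, rng)                     # second sub-image (independent draw)
```

Because the two calls draw their per-unit choices independently, g1 and g2 may occasionally share pixels; the optional constraints described below remove that possibility.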
Referring to fig. 5, fig. 5 is a schematic diagram of a random downsampling process according to an embodiment of the present disclosure. As shown in fig. 5, it is assumed that the resolution of the image to be noise-reduced is 4 × 4, i.e., the image to be noise-reduced is composed of 4 × 4 pixels. In fig. 5, the image to be denoised is divided into 4 image units, each image unit comprising 2 × 2 pixels.
In the process of performing the first random downsampling process on the image to be denoised, for the first image unit (i.e. the image unit at the upper left corner of the image to be denoised), the pixel at the upper left corner of the first image unit (i.e. pixel 1A) is randomly sampled as the first pixel. For the second image unit (i.e. the image unit at the upper right corner in the image to be denoised), randomly sampling the pixel at the upper right corner in the second image unit (i.e. pixel 1B) as the first pixel; for the third image unit (i.e. the image unit at the lower left corner of the image to be denoised), randomly sampling the pixel at the lower right corner of the third image unit (i.e. pixel 1C) as the first pixel; for the fourth image unit (i.e., the image unit in the lower right corner of the image to be denoised), the pixel in the lower right corner of the fourth image unit (i.e., pixel 1D) is randomly sampled as the first pixel. Based on the pixels obtained by sampling the four image units, the positions of the image units corresponding to the sampled pixels (i.e., pixel 1A, pixel 1B, pixel 1C, and pixel 1D) in the image to be denoised are pieced together to obtain a first sub-image.
In the process of performing the second random downsampling process on the image to be denoised, for the first image unit, the pixel at the lower right corner in the first image unit (i.e. pixel 2A) is randomly sampled as the second pixel. For the second image unit, randomly sampling the pixel at the lower left corner of the second image unit (i.e., pixel 2B) as the second pixel; for the third image unit, the pixel at the upper left corner in the third image unit (i.e., pixel 2C) is randomly sampled as the second pixel; for the fourth image cell, the pixel in the upper right corner of the fourth image cell (i.e., pixel 2D) is randomly sampled as the second pixel. Based on the pixels obtained by sampling the four image units, the positions of the image units corresponding to the sampled pixels (i.e., pixel 2A, pixel 2B, pixel 2C, and pixel 2D) in the image to be denoised are pieced together to obtain a second sub-image.
Optionally, when the terminal performs the second random selection of pixels in each of the M image cells, the terminal may obtain n × n-1 target pixels in each of the M image cells, where the n × n-1 target pixels are pixels in each of the image cells that were not selected when the first random selection of pixels was performed. Then, the terminal performs random selection of pixels in n × n-1 target pixels in each of the M image units to obtain M second pixels, where the M second pixels are different from the M first pixels.
That is, when the terminal performs the second random selection of pixels for each image cell, the terminal needs to first determine the pixels in each image cell that are not selected as the first pixels, i.e., n × n-1 target pixels in each image cell. Then, the terminal randomly selects one pixel from the n × n-1 target pixels as a second pixel. Thus, the first pixel and the second pixel selected by the terminal in each image unit are necessarily different pixels. For example, each pixel in the first sub-image and the second sub-image as shown in fig. 5 is a different pixel.
By ensuring that the pixels sampled for the second sub-image differ from those sampled for the first sub-image, the two sub-images obtained by random downsampling are guaranteed to be two completely independent images, i.e., the first sub-image and the second sub-image have no strong correlation. They can therefore better simulate two images of the same scene with independent random noise, which improves the training effect of the noise reduction model. Conversely, if this were not ensured, then in some extreme cases every second pixel could coincide with the corresponding first pixel, i.e., the second sub-image would be identical to the first sub-image. In that case, it is difficult to obtain a good training effect by training the noise reduction model on two identical sub-images.
Optionally, when the terminal performs random selection of pixels in n × n-1 target pixels in each of the M image units, the terminal may randomly select, among the n × n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected at the time of the first random selection to obtain M second pixels, where each of the M second pixels is adjacent to a corresponding first pixel.
That is, for n x n-1 target pixels in each image cell, the terminal may first determine the target pixels of the n x n-1 target pixels that are adjacent to the first pixel selected when the first random selection of pixels was performed. Then, the terminal randomly selects one pixel as a second pixel among the determined target pixels adjacent to the first pixel. Thus, the first pixel and the second pixel selected by the terminal in each image unit are necessarily adjacent pixels.
For example, referring to fig. 6, fig. 6 is a schematic diagram of another random down-sampling process provided in an embodiment of the present application. As shown in fig. 6, each second pixel (i.e., pixel 2A, pixel 2B, pixel 2C, and pixel 2D) in the second sub-image is adjacent to its corresponding first pixel (i.e., pixel 1A, pixel 1B, pixel 1C, and pixel 1D), and there is no case where the second pixel is located at a diagonal of the first pixel.
In the present embodiment, for each image unit, it can be considered that the adjacent pixels in the image unit have higher similarity. Therefore, by selecting the target pixel adjacent to the first pixel as the second pixel, higher similarity between the obtained second sub-image and the first sub-image can be ensured, namely two images with random noise under the same scene are better simulated based on the first sub-image and the second sub-image, and further the training effect of the noise reduction model is improved.
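The adjacent-pixel constraint can be sketched as follows, again assuming 2 × 2 image units. The NEIGHBOURS table lists, for each position in a unit, the two non-diagonal positions in the same unit; the function name and toy image are illustrative assumptions:

```python
import numpy as np

# Positions inside a 2x2 unit, indexed 0..3 row-major. For each position, the
# two positions in the same unit that are orthogonally adjacent (never diagonal).
NEIGHBOURS = {0: (1, 2), 1: (0, 3), 2: (0, 3), 3: (1, 2)}

def paired_downsample(img, rng):
    """Draw (g1, g2) so that every second pixel is adjacent to its first pixel."""
    h, w = img.shape
    cells = img.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3).reshape(h // 2, w // 2, 4)
    g1 = np.empty((h // 2, w // 2), img.dtype)
    g2 = np.empty_like(g1)
    for i in range(h // 2):
        for j in range(w // 2):
            first = rng.integers(0, 4)                       # first random selection
            second = NEIGHBOURS[int(first)][rng.integers(0, 2)]  # adjacent, never the same
            g1[i, j] = cells[i, j, first]
            g2[i, j] = cells[i, j, second]
    return g1, g2

rng = np.random.default_rng(1)
y = np.arange(36, dtype=np.float32).reshape(6, 6)  # toy 6x6 noise image
g1, g2 = paired_downsample(y, rng)
```

Restricting the second pixel to the non-diagonal neighbours of the first pixel realizes both optional constraints at once: the two pixels are always different and always adjacent.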
In the second method, the pixels in the image sample to be denoised that need to undergo sampling processing are determined based on the downsampling ratio, and for each such pixel, one pixel is randomly sampled from the pixels around it, yielding a sub-image composed of the sampled pixels.
For example, for an image to be denoised with resolution W × H, it may be predetermined to downsample it into sub-images with resolution (W-x) × (H-x); that is, the resolutions of the first sub-image and the second sub-image are both (W-x) × (H-x). Based on the downsampling ratios, i.e., W/(W-x) and H/(H-x), the terminal determines which pixels in the image to be denoised are subjected to sampling processing; each such pixel corresponds to one pixel of the resulting sub-image. The terminal then performs sampling for each of these pixels, i.e., randomly samples one pixel from the pixels surrounding it (or from the surrounding pixels together with the pixel itself), and the sampled pixel becomes a pixel of the sub-image; a sub-image composed of the sampled pixels is finally obtained. For example, the terminal may randomly select one of rows 1 to x as the starting row and perform sampling on W-x consecutive rows of pixels from that row; similarly, the terminal may randomly select one of columns 1 to x as the starting column and perform sampling on H-x consecutive columns of pixels from that column.
Referring to fig. 7, fig. 7 is a schematic diagram of another random downsampling process provided in the embodiment of the present application. As shown in fig. 7, assume the image to be denoised has a resolution of 4 × 4 and comprises 16 pixels, pixel 1 to pixel 16. The image to be denoised needs to be downsampled into sub-images with resolution (4-2) × (4-2), i.e., the resolution of the sub-images is 2 × 2. Based on the downsampling ratio, it is determined that random downsampling is performed targeting the four pixels pixel 6, pixel 7, pixel 10, and pixel 11 in the image to be denoised.
In the process of performing the random downsampling process on the pixel 6, a plurality of pixels around the pixel 6, that is, the pixel 1, the pixel 2, the pixel 3, the pixel 5, the pixel 7, the pixel 9, the pixel 10, and the pixel 11 are determined. Then, one pixel is randomly selected as a pixel in the sub-image among a plurality of pixels around the pixel 6. For example, as shown in fig. 7, when the pixel 6 is the target of performing the random downsampling process, the pixel 2 is selected as the pixel in the sub-image. Alternatively, a plurality of pixels around the pixel 6 and one of the pixels 6 may be randomly selected as the pixel in the sub-image, that is, the pixel 6 itself may be selected as the pixel in the sub-image.
In the process of performing the random downsampling process on the pixel 7, a plurality of pixels around the pixel 7, that is, the pixel 2, the pixel 3, the pixel 4, the pixel 6, the pixel 8, the pixel 10, the pixel 11, and the pixel 12 are determined. Then, one pixel is randomly selected as a pixel in the sub-image among a plurality of pixels around the pixel 7. For example, as shown in fig. 7, when the pixel 7 is the target of performing the random downsampling process, the pixel 8 is selected as the pixel in the sub-image.
In the process of performing the random downsampling process on the pixel 10, a plurality of pixels around the pixel 10, that is, the pixel 5, the pixel 6, the pixel 7, the pixel 9, the pixel 11, the pixel 13, the pixel 14, and the pixel 15 are determined. Then, one pixel is randomly selected as a pixel in the sub-image among a plurality of pixels around the pixel 10. For example, as shown in fig. 7, when the pixel 10 is the target of performing the random downsampling process, the pixel 9 is selected as the pixel in the sub-image.
In the process of performing the random downsampling process on the pixel 11, a plurality of pixels around the pixel 11, that is, the pixel 6, the pixel 7, the pixel 8, the pixel 10, the pixel 12, the pixel 14, the pixel 15, and the pixel 16 are determined. Then, one pixel is randomly selected as a pixel in the sub-image among a plurality of pixels around the pixel 11. For example, as shown in fig. 7, when the pixel 11 is the target of performing the random downsampling process, the pixel 12 is selected as the pixel in the sub-image.
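A hedged sketch of method two, using the 4 × 4 example above (pixels numbered 1 to 16, with pixels 6, 7, 10, and 11 as the sampling targets). The helper name, the flat target list, and the `include_self` flag are assumptions for illustration:

```python
import numpy as np

def neighbourhood_downsample(img, targets, rng, include_self=False):
    """Sketch of "method two": for each target pixel, emit one randomly chosen
    pixel from its 8-neighbourhood (optionally including the target itself).

    `targets` is a list of (row, col) positions forming the sub-image grid;
    the caller is assumed to derive them from the down-sampling ratio.
    """
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if include_self or (dr, dc) != (0, 0)]
    out = []
    for r, c in targets:
        dr, dc = offsets[rng.integers(0, len(offsets))]
        out.append(img[r + dr, c + dc])
    side = int(len(targets) ** 0.5)
    return np.array(out).reshape(side, side)

rng = np.random.default_rng(2)
y = np.arange(1, 17, dtype=np.float32).reshape(4, 4)  # pixels 1..16 as in fig. 7
targets = [(1, 1), (1, 2), (2, 1), (2, 2)]            # pixels 6, 7, 10, 11
sub = neighbourhood_downsample(y, targets, rng)        # one 2x2 sub-image
```

Calling the function twice with the same targets but fresh random draws yields the first and second sub-images of this method.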
Step 403, inputting the first sub-image into a noise reduction model to obtain a first target image.
After obtaining the first sub-image, the terminal may input the first sub-image into the noise reduction model to obtain a first target image output by the noise reduction model. The noise reduction model is used for carrying out noise reduction processing on an input image and outputting a noise-reduced clean image. The noise reduction model is a model which can be learned, and the noise reduction capability of the noise reduction model is poor before the noise reduction model is trained; in the training process, a noise reduction module in the noise reduction model is continuously optimized, and the noise reduction capability is continuously enhanced; after the training is finished, the noise reduction model can be used for realizing the noise reduction of the image.
The noise reduction model includes, but is not limited to, learning-based models such as convolutional neural networks or noise reduction models based on sparse feature representation; the specific structure of the noise reduction model is not specifically limited in this embodiment.
Step 404, obtaining a first loss function according to the first target image and the second sub-image, wherein the first loss function is used for indicating a difference between the first target image and the second sub-image.
In this embodiment, after the random downsampling processing is performed on the image to be denoised to obtain the first sub-image and the second sub-image, the first sub-image and the second sub-image may be used as a sample pair for training. The first sub-image is used as the input value of the noise reduction model, and the second sub-image is used as the expected output value of the noise reduction model. Based on the actual output value of the noise reduction model (i.e. the first target image) and the desired output value of the noise reduction model (i.e. the second sub-image), a first loss function may be obtained, which is indicative of the difference between the first target image and the second sub-image. Based on the first loss function, the noise reduction model may be directed to learn the noise reduction capability.
Illustratively, assume the first sub-image is g1(y), the second sub-image is g2(y), and the noise reduction model is f. After the first sub-image g1(y) is input into the noise reduction model f, the image output by the model is f(g1(y)). One possible example of the first loss function is shown in Equation 1.

loss1 = ||f(g1(y)) - g2(y)||_p    (Equation 1)

Where loss1 is the first loss function, f(g1(y)) is the image output by the noise reduction model f after the first sub-image g1(y) is input, g2(y) is the second sub-image, and p is the power, whose value may be 1 or 2, etc. It is understood that f(g1(y)) - g2(y) means subtracting the value of each pixel in g2(y) from the value of the corresponding pixel in f(g1(y)). In general, the value of a pixel in an image may range from 0 to 255 or from 0 to 4095, with different values representing different colors.
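Equation 1 can be computed directly once the two arrays are available. The sketch below uses a pixel-wise mean rather than a strict p-norm, which is a common reduction choice and an assumption here; the toy pixel values are also illustrative:

```python
import numpy as np

def loss1(f_g1, g2, p=2):
    """Equation 1: ||f(g1(y)) - g2(y)||_p, reduced by a pixel-wise mean.

    f_g1 is the denoiser output for the first sub-image, g2 the second
    sub-image; p = 1 or 2 as in the text. The network f is left abstract.
    """
    return float(np.mean(np.abs(f_g1 - g2) ** p))

pred = np.array([[10.0, 20.0], [30.0, 40.0]])    # stand-in for f(g1(y))
target = np.array([[12.0, 20.0], [30.0, 44.0]])  # stand-in for g2(y)
print(loss1(pred, target, p=2))  # prints 5.0 (mean of squared differences)
```

With p = 2 this is the usual mean-squared-error loss between the model output and the second sub-image.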
Step 405, training the noise reduction model at least according to the first loss function to obtain a target noise reduction model.
After obtaining the first loss function, the terminal may train the noise reduction model based on the value of the first loss function. The process is as follows: the terminal adjusts the parameters of the noise reduction model based on the value of the first loss function and repeatedly executes steps 401 to 405, continuously adjusting the parameters until the obtained first loss function is smaller than a preset threshold, at which point the model training condition is considered satisfied and the target noise reduction model is obtained. The target noise reduction model is the trained noise reduction model and can be used for subsequent image noise reduction.
Optionally, in a possible embodiment, the terminal may further train the noise reduction model based on the first loss function and the second loss function.
For example, before training the noise reduction model, the terminal may input the to-be-noise-reduced image sample into the noise reduction model to obtain a second target image output by the noise reduction model. And based on the sampling position of the pixel in the first random downsampling processing, the terminal performs downsampling processing on the second target image to obtain a first sub-target image. And based on the sampling position of the pixel in the second random downsampling processing, the terminal performs downsampling processing on the second target image to obtain a second sub-target image. The terminal performs the downsampling processing on the second target image based on the sampling position of the pixel in the first random downsampling processing, namely the terminal performs the downsampling processing in the same mode as the first random downsampling processing when performing the downsampling processing on the second target image.
Taking fig. 5 as an example, the first random downsampling processing method is as follows: pixels are collected in the upper left corner of the first image unit, pixels are collected in the upper right corner of the second image unit, pixels are collected in the lower right corner of the third image unit, and pixels are collected in the lower left corner of the fourth image unit. Then, the terminal may divide the second target image into four image units based on the same manner as the first random downsampling process, and acquire corresponding pixels in the four image units of the second target image according to the positions of the pixels acquired in the first random downsampling process in each image unit, thereby obtaining the first sub-target image. That is, the positions of each set of corresponding pixels in the first sub-target image and the first sub-image in the source image before down-sampling are the same.
In practical applications, the terminal may record the position of each first pixel acquired when performing the first random down-sampling process after performing the first random down-sampling process on the image sample to be noise-reduced. Then, the terminal determines a corresponding position in the second target image based on the position of each first pixel, and collects pixels at the corresponding position in the second target image as pixels on the first sub-target image. In addition, the terminal may generate a first sampler for performing a first random down-sampling process and a second sampler for performing a second random down-sampling process, respectively, wherein the first sampler and the second sampler may perform down-sampling in a fixed manner. In this way, the terminal can perform downsampling processing on the second target image based on the first sampler to obtain a first sub-target image; and the terminal performs downsampling processing on the second target image based on the second sampler to obtain a second sub-target image.
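One way to realize such fixed samplers, as a sketch: draw the per-unit choices once, store them, and reuse them for any image of the same size. The class name and the 2 × 2 unit size are illustrative assumptions:

```python
import numpy as np

class FixedSampler:
    """A sampler that draws its per-unit pixel choice once, then applies the
    identical down-sampling to any image, e.g. first to the noise sample y and
    later to the full-resolution denoised output f(y). 2x2 units assumed."""

    def __init__(self, h, w, rng):
        # One fixed choice among the 4 pixels of every 2x2 unit.
        self.idx = rng.integers(0, 4, size=(h // 2, w // 2))

    def __call__(self, img):
        h, w = img.shape
        cells = img.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3).reshape(h // 2, w // 2, 4)
        return np.take_along_axis(cells, self.idx[..., None], axis=2)[..., 0]

rng = np.random.default_rng(3)
y = np.arange(16, dtype=np.float32).reshape(4, 4)
s1 = FixedSampler(4, 4, rng)
g1_y = s1(y)         # first sub-image g1(y)
g1_fy = s1(y * 0.5)  # same positions applied to a stand-in for f(y)
```

Because the stored indices are reused, g1(y) and g1(f(y)) sample exactly the same positions, which is what the second loss function requires.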
And the terminal acquires a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image. The second loss function is primarily used to indicate a difference between the first sub-target image and the second sub-target image.
Illustratively, assume the first sub-image is g1(y), the second sub-image is g2(y), and the noise reduction model is f. After the first sub-image g1(y) is input into the noise reduction model f, the image output by the model is f(g1(y)). After the image sample to be denoised y is input into f, the second target image output by the model is f(y); the first sub-target image is then g1(f(y)) and the second sub-target image is g2(f(y)). One possible example of the second loss function is shown in Equation 2.

loss2 = ||f(g1(y)) - g2(y) - (g1(f(y)) - g2(f(y)))||_p    (Equation 2)

Where loss2 is the second loss function, f(g1(y)) is the image output by the noise reduction model f after the first sub-image g1(y) is input, g2(y) is the second sub-image, g1(f(y)) is the first sub-target image, g2(f(y)) is the second sub-target image, and p is the power.
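Equation 2 in the same NumPy sketch style, again using a mean reduction as an assumption; the toy values below are illustrative:

```python
import numpy as np

def loss2(f_g1_y, g2_y, g1_f_y, g2_f_y, p=2):
    """Equation 2: ||f(g1(y)) - g2(y) - (g1(f(y)) - g2(f(y)))||_p.

    The correction term g1(f(y)) - g2(f(y)) estimates how much the clean
    contents of the two sub-images differ, so that difference is no longer
    penalised as if it were noise.
    """
    residual = f_g1_y - g2_y - (g1_f_y - g2_f_y)
    return float(np.mean(np.abs(residual) ** p))

# If the denoised images already explain the sub-image offset exactly,
# the correction term cancels the difference and the loss is zero.
f_g1_y = np.array([[1.0, 2.0]])  # denoiser output on the first sub-image
g2_y   = np.array([[0.5, 2.5]])  # second sub-image
g1_f_y = np.array([[1.0, 2.0]])  # first sub-target image
g2_f_y = np.array([[0.5, 2.5]])  # second sub-target image
print(loss2(f_g1_y, g2_y, g1_f_y, g2_f_y))  # prints 0.0
```

The example shows the intended behaviour: a residual difference between sub-images that is matched by the same difference in the denoised outputs contributes nothing to the loss.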
It can be understood that the second loss function is a correction term for the positional inconsistency of the two sub-images introduced by the downsampling process; its purpose is to constrain the noise reduction model so that it does not produce an over-smoothed image merely because the pixels of the first sub-image and the second sub-image occupy different positions in the image sample to be denoised. In short, the clean images corresponding to the first sub-image and the second sub-image obtained by the two random downsampling processes are not exactly the same; they are merely neighbor-similar. If the loss function were constructed from the two sub-images alone, the noise reduction capability learned by the model would erase not only the noise in the image but also the difference between the ground-truth (i.e., clean) images corresponding to the two sub-images. That is, the noise reduction capability would be too strong, and high-frequency detail in the noise image would be treated as noise. Introducing the second loss function on top of the first loss function therefore protects high-frequency detail in the noise image from being removed by the noise reduction model and preserves the resolution of the denoised image.
And finally, after a first loss function and a second loss function are obtained through calculation, the terminal trains the noise reduction model at least according to the first loss function and the second loss function until the noise reduction model meets model training conditions. For example, the terminal may add the first loss function and the second loss function to obtain an overall loss function, and then train the noise reduction model based on the overall loss function.
Optionally, the first loss function and the second loss function further have corresponding weight coefficients, that is, the terminal calculates the total loss function based on the weight coefficients corresponding to the first loss function and the second loss function. That is, the terminal trains the noise reduction model according to at least the first loss function, the first weight coefficient, the second loss function and the second weight coefficient; wherein the first weight coefficient is used to indicate a weight of the first loss function, and the second weight coefficient is used to indicate a weight of the second loss function.
For example, a process of calculating the total loss function based on the first loss function, the first weight coefficient, the second loss function, and the second weight coefficient may be as shown in equation 3.
loss = a × loss1 + b × loss2    (equation 3)
Here, loss is the total loss function, loss1 is the first loss function, loss2 is the second loss function, a is the first weight coefficient, and b is the second weight coefficient. The ratio between a and b may be, for example, 1 or 1/2. In practical applications, the first weight coefficient and/or the second weight coefficient can be adjusted according to the actual noise reduction requirements: the larger the second weight coefficient, the weaker the noise reduction (more residual noise, but less loss of high-frequency image detail); the smaller the second weight coefficient, the stronger the noise reduction (less residual noise, but more loss of high-frequency detail). Adjusting the first and/or second weight coefficient therefore balances noise reduction strength against the loss of high-frequency detail.
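The weighted combination of equation 3 can be sketched in plain Python (the example loss values and the a:b ratio of 1:2 are illustrative, not from the patent):

```python
def total_loss(loss1, loss2, a=1.0, b=1.0):
    # Weighted sum of the two loss terms (equation 3): a scales the
    # first (reconstruction) loss, b scales the second (correction)
    # loss; a larger b preserves more high-frequency detail at the
    # cost of weaker noise reduction.
    return a * loss1 + b * loss2

# Example with an a:b ratio of 1:2, as suggested in the text.
print(total_loss(0.8, 0.3, a=1.0, b=2.0))  # ≈ 1.4
```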
The above describes the process of training the noise reduction model. For ease of understanding, the principle of training a noise reduction model based on pairs of sub-images obtained by a random downsampling process will be described below.
When a sample pair is formed from two noisy images whose corresponding clean images are identical, a noise reduction model with genuine noise reduction capability can be trained from a large number of such pairs. It can be appreciated that when only a few samples are used, the model in effect learns a mapping between the two noise patterns. When sufficiently many samples are used, however, the noise in the samples is random and fluctuates around the true value, so from the perspective of minimizing the loss function the model learns to output the clean, noise-free image: because the noise is always random, the model cannot minimize the loss by learning any particular noise-to-noise mapping. Instead, it minimizes the loss by mapping the randomly fluctuating noise in the samples to its expected value, and over a large number of samples this expected value is exactly the true value underlying the noise. Therefore, a noise reduction model with noise reduction capability can be trained from sample pairs composed of noisy images.
For ease of understanding, a specific derivation procedure will be described below.
Consider the indoor temperature estimation problem: assume a series of observed temperatures (y1, y2, y3, y4, …) is obtained in some manner. Estimating the true temperature z from these observations can then be modeled as equation 4.
z = argmin_z E_y{ L(z, y) }    (equation 4)
Here, argmin_z denotes the value of z minimizing the expected loss L, which is a function of z; the loss is to be minimized over all observed temperatures y. L can therefore be evaluated under a probability distribution with y as the random variable: minimizing the loss over all samples means minimizing the mean of the per-sample losses. If the distance metric is the L2 norm, the optimal z is exactly the mean of y. The individual observation y(i) at any given time is thus unimportant; the objective of the optimization depends only on the mean of all observations.
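The claim that the L2-optimal estimate equals the mean of the observations can be checked numerically (an illustrative sketch; the temperature values are hypothetical):

```python
# Numerically verify equation 4 with the L2 norm: the z minimizing the
# average squared error over observed temperatures y_i is the mean of y.
def squared_loss(z, ys):
    return sum((z - y) ** 2 for y in ys) / len(ys)

ys = [19.2, 20.1, 20.8, 19.5, 20.4]  # hypothetical observed temperatures
mean = sum(ys) / len(ys)

# Scan candidate z values on a fine grid; the minimizer lands on the mean.
candidates = [15.0 + 0.001 * k for k in range(10_000)]
best = min(candidates, key=lambda z: squared_loss(z, ys))
print(abs(best - mean) < 0.002)  # True: optimum equals the mean (up to grid step)
```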
Then, for the image noise reduction problem, assume the input noisy image is x(i) and the output sharp image is y(i). The image noise reduction problem can then be modeled as equation 5.
θ = argmin_θ E_(x,y){ L(f_θ(x), y) }    (equation 5)
In equation 5, θ is the weight parameter of the noise reduction model f_θ. Moreover, x and y are not independent of each other, so equation 5 can be converted into equation 6.
θ = argmin_θ E_x{ E_(y|x){ L(f_θ(x), y) } }    (equation 6)
Similarly, if the distribution of p (y/x) is changed, the finally obtained θ will not be changed as long as the conditions are not changed. Therefore, if a gaussian noise with an average value of 0 is added to the label y as a perturbation, y' can be regarded as an expected value with noise. That is, a noise reduction model having noise reduction capability can be trained by using a noise image as an output expectation value of the noise reduction model.
Based on a similar idea, this embodiment performs the random downsampling process twice on the same noisy image to obtain two sub-images. The two sub-images simulate two independently noisy captures of the same scene, so a noise reduction model can be trained on the sub-images obtained by downsampling. Compared with training a noise reduction model on a sample pair of two noisy images corresponding to the same clean image, obtaining the sample pair by twice randomly downsampling a single noisy image is far easier, because in some particular scenes or fields two noisy images corresponding to the same clean image are difficult to acquire in practice. For example, in the field of terminal photographing, objects in a dynamic scene are continuously moving, so it is very difficult to consecutively capture two noisy images whose underlying clean images are identical. Likewise, in the field of medical images, acquiring a medical image of a body part with a medical instrument can have a certain negative effect on the human body, so consecutively acquiring two noisy images is very difficult.
For convenience of understanding, the training method of the noise reduction model provided in the present embodiment will be described below with reference to specific examples.
Referring to fig. 8 and 9, fig. 8 is a schematic diagram of a training process of a noise reduction model according to an embodiment of the present application; fig. 9 is a schematic diagram of a training process of another noise reduction model according to an embodiment of the present application. As shown in fig. 8, the training process of the noise reduction model includes steps 801 to 809.
Step 801, select a noise image y from the sample set.
First, the terminal acquires a sample set including a large number of noise images and selects, from the sample set, a noise image y that has not yet been used for training.
Step 802, construct sampler g1 and sampler g2.
In this embodiment, the terminal may construct sampler g1 and sampler g2 in advance. Sampler g1 performs the first random downsampling of the noise image y, and sampler g2 performs the second random downsampling of the noise image y.
In step 803, the noise image y is downsampled by sampler g1 and sampler g2 to obtain noise sub-image g1(y) and noise sub-image g2(y), respectively.
Specifically, for the process of performing random downsampling on the noise image y with sampler g1 and sampler g2, refer to step 402 above; details are not repeated here.
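The paired random downsampling of steps 802–803 can be sketched roughly as follows (a minimal numpy sketch assuming 2×2 image units and a neighbor-constrained second pick; the function name `neighbor_subsample` and the index-map return values are illustrative, not from the patent):

```python
import numpy as np

def neighbor_subsample(img, rng):
    # Split img into 2x2 cells; in each cell pick one random pixel for
    # sub-image 1 and a horizontally/vertically adjacent pixel for
    # sub-image 2. Returns the sub-images plus the index maps so the
    # same sampling positions can be reused on the denoised full image.
    h, w = img.shape[:2]
    h2, w2 = h // 2, w // 2
    cells = img[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2).transpose(0, 2, 1, 3)
    cells = cells.reshape(h2, w2, 4)  # cell pixel order: TL, TR, BL, BR
    idx1 = rng.integers(0, 4, size=(h2, w2))
    # non-diagonal neighbors within a 2x2 cell, by flat index
    neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
    idx2 = np.array([[rng.choice(neighbors[int(i)]) for i in row] for row in idx1])
    rows, cols = np.indices((h2, w2))
    return cells[rows, cols, idx1], cells[rows, cols, idx2], idx1, idx2

rng = np.random.default_rng(0)
y = rng.normal(size=(8, 8))        # stand-in noisy image
g1y, g2y, idx1, idx2 = neighbor_subsample(y, rng)
print(g1y.shape, g2y.shape)        # (4, 4) (4, 4) — same resolution
```

Returning `idx1`/`idx2` lets the same sampling positions be reused in step 806 on the noise-reduced image f(y).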
Step 804, denoise the noise sub-image g1(y) with the noise reduction module to obtain the noise-reduced image f(g1(y)).
After the noise sub-image g1(y) is obtained, it may be input into the noise reduction module, which performs noise reduction on g1(y) to obtain the noise-reduced image f(g1(y)). The noise reduction module may be the learning-based noise reduction model mentioned above, such as a convolutional neural network or a noise reduction model based on sparse feature expression.
Step 805, denoise the noise image y with the noise reduction module to obtain the noise-reduced image f(y).
In step 806, the noise-reduced image f(y) is downsampled by sampler g1 and sampler g2 to obtain noise-reduced sub-image g1(f(y)) and noise-reduced sub-image g2(f(y)).
After obtaining the noise-reduced image f (y), a sampler g1 and a sampler g2 are used to perform down-sampling on the noise-reduced image f (y), so as to obtain a noise-reduced sub-image g1(f (y)) and a noise-reduced sub-image g2(f (y)), respectively.
In step 807, a loss function is calculated.
The loss function is obtained from the first loss function, the second loss function, and their corresponding weight coefficients. The first loss function may be calculated based on the noise-reduced image f(g1(y)) and the noise sub-image g2(y); the second loss function may be obtained based on the noise-reduced image f(g1(y)), the noise sub-image g2(y), the noise-reduced sub-image g1(f(y)), and the noise-reduced sub-image g2(f(y)). For the calculation of the first and second loss functions, refer to step 404 above; details are not repeated here.
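The loss of step 807 can be sketched as follows (numpy; the exact algebraic form of the second loss — the gap between the sub-image residual and the residual of the sampled denoised full image — is one plausible reading of the inputs listed above, not a verbatim formula from the patent):

```python
import numpy as np

def first_loss(f_g1y, g2y):
    # Reconstruction term: L2 distance between the denoised first
    # sub-image f(g1(y)) and the noisy second sub-image g2(y).
    return np.mean((f_g1y - g2y) ** 2)

def second_loss(f_g1y, g2y, g1_fy, g2_fy):
    # Correction term: penalizes the gap between the sub-image residual
    # f(g1(y)) - g2(y) and the residual g1(f(y)) - g2(f(y)) of the
    # denoised full image sampled at the same positions.
    return np.mean((f_g1y - g2y - (g1_fy - g2_fy)) ** 2)

rng = np.random.default_rng(1)
f_g1y, g2y, g1_fy, g2_fy = (rng.normal(size=(4, 4)) for _ in range(4))
a, b = 1.0, 2.0  # weight coefficients, as in the synthetic-noise experiment
loss = a * first_loss(f_g1y, g2y) + b * second_loss(f_g1y, g2y, g1_fy, g2_fy)
print(float(loss) >= 0.0)  # True: both terms are mean squared errors
```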
Step 808, updating the parameters of the noise reduction module.
After the loss function is calculated, parameters of the noise reduction module may be adaptively updated based on values of the loss function.
And step 809, judging whether the noise reduction module converges.
Whether the noise reduction module has converged, i.e., whether it meets the training condition, is judged based on the value of the loss function. If the module has converged, training is considered complete and the module can be output as the trained noise reduction module; if not, return to steps 801 to 808 and continue training until the module converges.
To verify the effect of the training method of the noise reduction model provided in this embodiment, several experiments are provided to evaluate the noise reduction effect of a noise reduction model trained with this method.
Referring to fig. 10, fig. 10 is a schematic diagram of an experimental flow for determining the noise reduction effect of a noise reduction model according to an embodiment of the present application. As shown in fig. 10, a batch of clean, noise-free images is first acquired, and noise is randomly synthesized on the clean images to obtain a batch of noisy images. The noisy images are then input into a noise reduction network trained with the above training method to obtain denoised images. Finally, metrics between each denoised image and its corresponding clean image are calculated to determine the difference between them, and thus the noise reduction effect of the network.
Noise synthesis is a commonly used technique for evaluating noise reduction: images containing noise are synthesized by artificially adding Gaussian noise, Poisson noise, and the like. In this embodiment, the ImageNet dataset, comprising fifty thousand high-definition pictures covering most scenes of daily life, is used as training data. Meanwhile, the Kodak image (Kodak) dataset, the low-complexity single-image super-resolution with non-negative neighborhood embedding (Set14) dataset, and the Berkeley image segmentation (BSD300) dataset are selected as test sets for lateral comparison with other noise reduction methods. For the noise reduction network, a U-Net is built on the PyTorch platform. In addition, to evaluate the quality of the output, the noise-free image is taken as the clean reference, the peak signal-to-noise ratio (PSNR) of each test image is calculated, and finally the average PSNR over the whole test set is computed.
The specific implementation steps are as follows.
1. Construct the training set, i.e., extract fifty thousand high-definition images from the ImageNet validation set. Construct the test sets, i.e., add noise to the Kodak dataset, the Set14 dataset, and the BSD300 dataset.
2. Construct the noise reduction network; this embodiment adopts a U-Net as the noise reduction network.
3. Construct the downsampler, the first loss function, and the second loss function; this embodiment uses a random downsampler, and the pixels of the two sampled sub-images at the same sub-image position correspond to neighboring positions in the original image. The weight coefficient of the first loss function is set to 1 and that of the second loss function to 2.
4. Train the network: based on the constructed training set, noise reduction network, sampler, and loss function, train the noise reduction network to convergence using the training method of the noise reduction model provided in this embodiment.
5. Denoise the test sets with the trained noise reduction network to generate denoised images.
6. Calculate the PSNR between each denoised image and the corresponding clean image in the test set.
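The PSNR metric used in step 6 can be sketched as follows (the standard definition; the helper name and 8-bit peak value are the usual conventions):

```python
import numpy as np

def psnr(clean, denoised, max_val=255.0):
    # Peak signal-to-noise ratio in dB between a clean reference and a
    # denoised image; higher values indicate a closer match.
    mse = np.mean((clean.astype(np.float64) - denoised.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

clean = np.full((8, 8), 128.0)
denoised = clean + 5.0  # a constant error of 5 gray levels -> MSE = 25
print(round(psnr(clean, denoised), 2))  # 34.15
```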
To verify the effectiveness of the training method of the noise reduction model provided in this embodiment of the application, it is compared laterally with several current mainstream image noise reduction methods. Multiple models were trained on the training set and then tested on the Kodak, BSD300, and Set14 datasets to calculate their PSNR. The comparison results of the various methods are shown in Table 1 and fig. 11; a higher PSNR indicates a better noise reduction effect. Fig. 11 is a schematic diagram comparing the noise reduction effects of various noise reduction methods according to an embodiment of the present application.
TABLE 1
[Table 1 is provided as an image in the original publication: PSNR of the compared noise reduction methods on the Kodak, BSD300, and Set14 test sets.]
As can be seen from Table 1, the noise reduction effect of the model trained with the method provided in this embodiment of the application is only slightly lower than that of the Noise2Noise method and higher than that of the other methods. The Noise2Noise method is the method described above that must be trained on multiple noisy images sharing the same clean image. In practical applications, however, obtaining multiple noisy images with the same clean image is very difficult. Therefore, compared with the training method of the noise reduction model provided in this embodiment of the application, Noise2Noise is hard to implement in most scenarios.
In the above experiment, noisy images and their corresponding clean images were obtained by synthesizing noise on clean images. The following evaluates the training method of the noise reduction model provided in this embodiment of the application using a real terminal photographing scenario as an example.
Referring to fig. 12, fig. 12 is a schematic diagram of another experimental flow for determining the noise reduction effect of a noise reduction model according to an embodiment of the present application. As shown in fig. 12, a plurality of noisy images of the same scene, i.e., noise image 1, noise image 2, …, noise image N in fig. 12, are first acquired by a camera sensor. These images may be captured in poor lighting and therefore contain significant noise. One noisy image is then randomly selected from the plurality of noisy images and input into a noise reduction network trained with the above training method to obtain a denoised image. Finally, metrics between the denoised image and the corresponding clean image are calculated to determine the difference between them, and thus the noise reduction effect of the network. The clean image may be obtained by averaging the plurality of noisy images: since the camera sensor cannot capture a truly clean image, averaging multiple noisy captures yields an image as close as possible to the real clean image.
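The averaging step used to build the pseudo-clean reference can be illustrated with synthetic data (a numpy sketch; the scene, noise level, and number of captures are arbitrary):

```python
import numpy as np

def pseudo_clean(noisy_stack):
    # Approximate a clean reference by averaging N noisy captures of the
    # same static scene; zero-mean noise averages toward zero.
    return np.mean(noisy_stack, axis=0)

rng = np.random.default_rng(2)
truth = rng.uniform(0.0, 255.0, size=(16, 16))            # unknown clean scene
stack = truth + rng.normal(0.0, 10.0, size=(64, 16, 16))  # 64 noisy captures
ref = pseudo_clean(stack)
err_single = np.abs(stack[0] - truth).mean()
err_avg = np.abs(ref - truth).mean()
print(err_avg < err_single)  # True: the average is closer to the truth
```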
The specific implementation steps are as follows.
1. Construct the training set and test set. This embodiment uses the Smartphone Image Denoising Dataset (SIDD), built from the sensors of real phones. SIDD is a real-scene dataset collected with mobile phone camera sensors in dark scenes; the corresponding clean images are obtained as the weighted average of multiple noisy captures.
2. Construct the noise reduction network; this embodiment adopts a U-Net model as the noise reduction network.
3. Construct the downsampler, the first loss function, and the second loss function; this embodiment uses a random downsampler, and the pixels of the two sampled sub-images at the same sub-image position correspond to neighboring positions in the original image. The weight coefficients of the first and second loss functions are both set to 1.
4. Train the network: based on the constructed training set, noise reduction network, sampler, and loss function, train the noise reduction network to convergence using the training method of the noise reduction model provided in this embodiment.
5. Denoise the test set with the trained noise reduction network to generate denoised images.
6. Calculate the PSNR between each denoised image and the corresponding clean image in the test set.
To verify the effectiveness of the training method of the noise reduction model provided in this embodiment of the application, it is compared laterally with several current mainstream image noise reduction methods. Multiple models were trained on the SIDD dataset, and the average PSNR and structural similarity (SSIM) between the SIDD validation data and the reference data were calculated; these indexes were then compared alongside the actual noise reduction results. The comparison results of the various methods are shown in fig. 13 and fig. 14; higher PSNR and SSIM indicate a better noise reduction effect. Fig. 13 is a schematic diagram comparing the noise reduction indexes of multiple noise reduction methods provided in the embodiment of the present application; fig. 14 is a schematic diagram comparing the noise reduction effects of various noise reduction methods according to an embodiment of the present application. As can be seen from fig. 13 and fig. 14, the PSNR and SSIM achieved by the training method of the noise reduction model provided in the embodiment of the present application are higher than those of the other methods, that is, its noise reduction effect is better.
Referring to fig. 15, fig. 15 is a schematic flowchart of an image denoising method according to an embodiment of the present disclosure. As shown in fig. 15, the image denoising method includes steps 1501 and 1502.
Step 1501, acquiring an image to be denoised.
The image to be denoised is a noisy image that contains noise in practical application and needs noise reduction. For example, it may be an image captured by a terminal camera, a medical image, or a surveillance image; this embodiment does not specifically limit the type of the image to be denoised.
Step 1502, inputting the image to be denoised into a target denoising model to obtain a denoised image.
The target noise reduction model is obtained by training a noise reduction model at least based on a first loss function, the first loss function is obtained based on a first target image and a second sub-image, the first loss function is used for indicating the difference between the first target image and the second sub-image, the first target image is obtained by inputting a first sub-image into the noise reduction model, the first sub-image and the second sub-image are obtained by respectively performing first random down-sampling processing and second random down-sampling processing on an image sample to be subjected to noise reduction, and the resolutions of the first sub-image and the second sub-image are the same. Briefly, the target noise reduction model is obtained by training based on the training method of the noise reduction model described in the foregoing embodiment, and the specific training process may refer to the description of step 401 and step 405, which are not described herein again.
Optionally, the first sub-image is obtained according to M first pixels, where the M first pixels are obtained by performing first random selection of pixels in each image unit of the M image units after dividing the image sample to be denoised into the M image units;
the second sub-image is obtained according to M second pixels, and the M second pixels are obtained by performing second random selection of pixels in each image unit of the M image units after dividing the image sample to be denoised into the M image units.
Optionally, the M second pixels are obtained by performing random selection of pixels in n × n-1 target pixels in each of the M image units, where the n × n-1 target pixels are pixels that are not selected in each image unit when performing the first random selection of pixels, and the M second pixels are different from the M first pixels.
Optionally, the M second pixels are obtained by randomly selecting one second pixel adjacent to the pixel selected at the time of the first random selection from n × n-1 target pixels in each of the M image units, and each of the M second pixels is adjacent to the corresponding first pixel.
Optionally, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function, the second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image, the first sub-target image is obtained by performing downsampling processing on the second target image based on the sampling position of the pixel in the first random downsampling processing, the second sub-target image is obtained by performing downsampling processing on the second target image based on the sampling position of the pixel in the second random downsampling processing, and the second target image is obtained by inputting the image sample to be noise reduced into the noise reduction model.
Optionally, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function, the first weight coefficient, the second loss function, and the second weight coefficient; wherein the first weight coefficient is used to indicate a weight of the first loss function, and the second weight coefficient is used to indicate a weight of the second loss function.
Optionally, the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
Referring to fig. 16, fig. 16 is a schematic structural diagram of a model training device according to an embodiment of the present disclosure. As shown in fig. 16, the model training apparatus includes: an acquisition unit 1601 and a processing unit 1602. The obtaining unit 1601 is configured to obtain an image sample to be noise-reduced from a sample set, where the sample set includes a plurality of image samples to be noise-reduced; the processing unit 1602 is configured to perform a first random downsampling process and a second random downsampling process on the to-be-denoised image sample to obtain a first sub-image and a second sub-image, where resolutions of the first sub-image and the second sub-image are the same; the processing unit 1602 is further configured to input the first sub-image into a noise reduction model to obtain a first target image; the processing unit 1602, further configured to obtain a first loss function according to the first target image and the second sub-image, where the first loss function is used to indicate a difference between the first target image and the second sub-image; the processing unit 1602 is further configured to train the noise reduction model at least according to the first loss function, so as to obtain a target noise reduction model.
Optionally, in a possible implementation manner, the processing unit 1602 is further configured to: dividing the image sample to be denoised into M image units, wherein each image unit in the M image units comprises n x n pixels; performing a first random selection of pixels in each of the M image units to obtain M first pixels, and obtaining the first sub-image according to the M first pixels; and performing second random selection of pixels in each of the M image units to obtain M second pixels, and obtaining the second sub-image according to the M second pixels.
Optionally, in a possible implementation manner, the obtaining unit 1601 is further configured to obtain n × n-1 target pixels in each of the M image units, where the n × n-1 target pixels are pixels that are not selected in each of the image units when performing the first random selection of the pixels; the processing unit 1602 is further configured to perform random selection of pixels in n × n-1 target pixels in each of the M image units to obtain M second pixels, where the M second pixels are different from the M first pixels.
Optionally, in a possible implementation manner, the processing unit 1602 is further configured to randomly select, from n × n-1 target pixels in each of the M image units, a second pixel adjacent to the pixel selected in the first random selection to obtain M second pixels, where each of the M second pixels is adjacent to a corresponding first pixel.
Optionally, in a possible implementation manner, the processing unit 1602 is further configured to: inputting the image sample to be denoised into the denoising model to obtain a second target image; based on the sampling position of the pixel in the first random downsampling processing, downsampling the second target image to obtain a first sub-target image; based on the sampling position of the pixel in the second random downsampling processing, downsampling the second target image to obtain a second sub-target image; acquiring a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image; training the noise reduction model according to at least the first loss function and the second loss function.
Optionally, in a possible implementation manner, the processing unit 1602 is further configured to train the noise reduction model according to at least the first loss function, the first weight coefficient, the second loss function, and the second weight coefficient; wherein the first weight coefficient is used to indicate a weight of the first loss function, and the second weight coefficient is used to indicate a weight of the second loss function.
Optionally, in a possible implementation, the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
Referring to fig. 17, fig. 17 is a schematic structural diagram of an image noise reduction apparatus according to an embodiment of the present disclosure. As shown in fig. 17, the image noise reduction apparatus includes: an acquisition unit 1701 and a processing unit 1702. The acquiring unit 1701 is configured to acquire an image to be denoised; the processing unit 1702 is configured to input the image to be denoised into a target denoising model to obtain a denoised image; the target noise reduction model is obtained by training a noise reduction model at least based on a first loss function, the first loss function is obtained based on a first target image and a second sub-image, the first loss function is used for indicating the difference between the first target image and the second sub-image, the first target image is obtained by inputting a first sub-image into the noise reduction model, the first sub-image and the second sub-image are obtained by respectively performing first random down-sampling processing and second random down-sampling processing on an image sample to be subjected to noise reduction, and the resolutions of the first sub-image and the second sub-image are the same.
Optionally, in a possible implementation manner, the first sub-image is obtained according to M first pixels, where the M first pixels are obtained by performing first random selection of pixels in each image unit of the M image units after dividing the image sample to be denoised into M image units; the second sub-image is obtained according to M second pixels, and the M second pixels are obtained by performing second random selection of pixels in each image unit of the M image units after dividing the image sample to be denoised into the M image units.
Optionally, in a possible implementation manner, the M second pixels are obtained by performing random selection of pixels in n × n-1 target pixels in each of the M image units, where the n × n-1 target pixels are pixels that are not selected in the first random selection of pixels in each image unit, and the M second pixels are different from the M first pixels.
Optionally, in a possible implementation manner, the M second pixels are obtained by randomly selecting one second pixel adjacent to the pixel selected at the time of the first random selection from n × n-1 target pixels in each of the M image units, and each of the M second pixels is adjacent to the corresponding first pixel.
Optionally, in a possible implementation manner, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function, the second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image, the first sub-target image is obtained by performing downsampling processing on the second target image based on the sampling position of the pixel in the first random downsampling processing, the second sub-target image is obtained by performing downsampling processing on the second target image based on the sampling position of the pixel in the second random downsampling processing, and the second target image is obtained by inputting the image sample to be noise reduced into the noise reduction model.
Optionally, in a possible implementation manner, the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function, the first weight coefficient, the second loss function, and the second weight coefficient; wherein the first weight coefficient is used to indicate a weight of the first loss function, and the second weight coefficient is used to indicate a weight of the second loss function.
Optionally, in a possible implementation, the noise reduction model includes a convolutional neural network or a noise reduction model based on sparse feature expression.
Referring to fig. 18, fig. 18 is a schematic structural diagram of an execution device according to an embodiment of the present application. The execution device 1800 may be embodied as a mobile phone, a tablet, a notebook computer, an intelligent wearable device, a server, and the like, which is not limited herein. The execution device 1800 may be provided with the data processing apparatus described in the foregoing embodiment, and is configured to implement the corresponding data processing functions. Specifically, the execution device 1800 includes a receiver 1801, a transmitter 1802, a processor 1803, and a memory 1804 (the execution device 1800 may include one or more processors 1803; one processor is taken as an example in fig. 18), where the processor 1803 may include an application processor 18031 and a communication processor 18032. In some embodiments of the present application, the receiver 1801, the transmitter 1802, the processor 1803, and the memory 1804 may be connected by a bus or in another manner.
The memory 1804 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1803. A portion of the memory 1804 may also include a non-volatile random access memory (NVRAM). The memory 1804 stores a processor-executable program and operating instructions, executable modules or data structures, or a subset thereof, or an extended set thereof, where the operating instructions may include various operating instructions for implementing various operations.
The processor 1803 controls the operation of the execution apparatus. In a particular application, the various components of the execution device are coupled together by a bus system that may include a power bus, a control bus, a status signal bus, etc., in addition to a data bus. For clarity of illustration, the various buses are referred to in the figures as a bus system.
The method disclosed in the embodiments of the present application may be applied to the processor 1803, or implemented by the processor 1803. The processor 1803 may be an integrated circuit chip having a signal processing capability. In an implementation process, the steps of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 1803 or by instructions in the form of software. The processor 1803 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 1803 may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed with reference to the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as a RAM, a flash memory, a ROM, a PROM, an EPROM, or a register. The storage medium is located in the memory 1804, and the processor 1803 reads the information in the memory 1804 and completes the steps of the foregoing method in combination with its hardware.
The receiver 1801 may be used to receive input numeric or character information and generate signal inputs related to performing device related settings and function control. The transmitter 1802 may be used to output numeric or character information through a first interface; the transmitter 1802 is further operable to send instructions to the disk groups via the first interface to modify data in the disk groups; the transmitter 1802 may also include a display device such as a display screen.
In this embodiment of the present application, in one case, the processor 1803 is configured to execute a training method of a noise reduction model executed by an execution device in the embodiment corresponding to fig. 4.
Referring to fig. 19, fig. 19 is a schematic structural diagram of a training device provided in an embodiment of the present application. Specifically, the training device 1900 is implemented by one or more servers. The training device 1900 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1919 (for example, one or more processors), a memory 1932, and one or more storage media 1930 (for example, one or more mass storage devices) for storing an application program 1942 or data 1944. The memory 1932 and the storage medium 1930 may be transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the training device. Still further, the central processing unit 1919 may be configured to communicate with the storage medium 1930, to perform, on the training device 1900, the series of instruction operations in the storage medium 1930.
The training device 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
In particular, the training device may perform the steps in the embodiment corresponding to fig. 4.
Embodiments of the present application also provide a computer program product, which when executed on a computer causes the computer to perform the steps performed by the aforementioned execution device, or causes the computer to perform the steps performed by the aforementioned training device.
Also provided in an embodiment of the present application is a computer-readable storage medium, in which a program for signal processing is stored, and when the program is run on a computer, the program causes the computer to execute the steps executed by the aforementioned execution device, or causes the computer to execute the steps executed by the aforementioned training device.
The execution device, the training device, or the terminal device provided in the embodiment of the present application may specifically be a chip, where the chip includes: a processing unit, which may be for example a processor, and a communication unit, which may be for example an input/output interface, a pin or a circuit, etc. The processing unit may execute the computer execution instructions stored by the storage unit to cause the chip in the execution device to execute the data processing method described in the above embodiment, or to cause the chip in the training device to execute the data processing method described in the above embodiment. Optionally, the storage unit is a storage unit in the chip, such as a register, a cache, and the like, and the storage unit may also be a storage unit located outside the chip in the wireless access device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a Random Access Memory (RAM), and the like.
Specifically, referring to fig. 20, fig. 20 is a schematic structural diagram of a chip provided in an embodiment of the present application. The chip may be implemented as a neural-network processing unit (NPU) 2000. The NPU 2000 is mounted, as a coprocessor, on a host CPU, and the host CPU allocates tasks to it. The core part of the NPU is the arithmetic circuit 2003, and the controller 2004 controls the arithmetic circuit 2003 to fetch matrix data from memory and perform multiplication.
In some implementations, the arithmetic circuit 2003 internally includes a plurality of processing units (PEs). In some implementations, the arithmetic circuitry 2003 is a two-dimensional systolic array. The arithmetic circuit 2003 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 2003 is a general purpose matrix processor.
For example, assume that there are an input matrix A, a weight matrix B, and an output matrix C. The arithmetic circuit fetches the data corresponding to the matrix B from the weight memory 2002 and buffers it in each PE in the arithmetic circuit. The arithmetic circuit fetches the matrix A data from the input memory 2001, performs a matrix operation with the matrix B, and stores the obtained partial or final matrix result in the accumulator 2008.
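The data flow above can be mimicked in software as follows. This is only an illustrative analogue of the accumulate-as-you-go matrix computation, not the hardware behavior: weights are held fixed (as in the PEs), input data streams in one step at a time, and partial products are summed into an accumulator.

```python
import numpy as np

def matmul_accumulate(A, B):
    """Compute C = A @ B by accumulating rank-1 partial products,
    mirroring how partial results build up in the accumulator."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"
    C = np.zeros((M, N))                      # accumulator (cf. 2008)
    for k in range(K):                        # one streaming step per k
        C += np.outer(A[:, k], B[k, :])       # partial result accumulated
    return C
```

After the final step, the accumulator holds the complete output matrix C, equal to the ordinary matrix product.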
The unified memory 2006 is used to store input data and output data. The weight data is transferred to the weight memory 2002 through a direct memory access controller (DMAC) 2005. Input data is also carried into the unified memory 2006 by the DMAC.
The bus interface unit (BIU) 2010 is used for interaction among the AXI bus, the DMAC, and the instruction fetch buffer (IFB) 2009. Specifically, the BIU 2010 is used by the instruction fetch buffer 2009 to obtain instructions from the external memory, and is further used by the storage unit access controller 2005 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly used to transfer input data in the external memory DDR to the unified memory 2006 or to transfer weight data to the weight memory 2002 or to transfer input data to the input memory 2001.
The vector calculation unit 2007 includes a plurality of arithmetic processing units, and further processes the output of the arithmetic circuit 2003 when necessary, for example, performing vector multiplication, vector addition, an exponential operation, a logarithmic operation, or a magnitude comparison. The vector calculation unit 2007 is mainly used for network calculation at non-convolution/non-fully-connected layers of a neural network, such as batch normalization, pixel-level summation, and upsampling of a feature plane.
In some implementations, the vector calculation unit 2007 can store a vector of processed outputs to the unified memory 2006. For example, the vector calculation unit 2007 may apply a linear or nonlinear function to the output of the arithmetic circuit 2003, such as linear interpolation of the feature planes extracted by the convolutional layers, or to a vector of accumulated values, to generate activation values. In some implementations, the vector calculation unit 2007 generates normalized values, pixel-level summed values, or both. In some implementations, the vector of processed outputs can be used as activation inputs to the arithmetic circuit 2003, for example, for use in a subsequent layer of the neural network.
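Two of the post-processing operations named above, batch normalization and upsampling of a feature plane, can be illustrated as follows. NumPy stands in for the vector lanes; normalizing over the whole plane and using nearest-neighbor 2× upsampling are assumed simplifications, not the chip's specification.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize a feature plane to zero mean and unit variance
    (a simplified, whole-plane form of batch normalization)."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def upsample2x(x):
    """Nearest-neighbor 2x upsampling of a feature plane: each value
    is replicated into a 2 x 2 block."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
```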
The instruction fetch buffer 2009 is connected to the controller 2004 and stores instructions used by the controller 2004. The unified memory 2006, the input memory 2001, the weight memory 2002, and the instruction fetch buffer 2009 are all on-chip memories; the external memory is private to the NPU hardware architecture.
The processor mentioned in any of the above may be a general purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the above programs.
It should be noted that the above-described embodiments of the apparatus are merely schematic, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiments of the apparatus provided in the present application, the connection relationship between the modules indicates that there is a communication connection therebetween, and may be implemented as one or more communication buses or signal lines.
Through the foregoing description of the embodiments, a person skilled in the art can clearly understand that the present application may be implemented by software plus necessary general-purpose hardware, and certainly may also be implemented by dedicated hardware, including an application-specific integrated circuit, a dedicated CPU, a dedicated memory, dedicated components, and the like. Generally, any function performed by a computer program can easily be implemented by corresponding hardware, and the specific hardware structure used to implement the same function may take various forms, such as an analog circuit, a digital circuit, or a dedicated circuit. However, for the present application, implementation by a software program is generally preferable. Based on such an understanding, the technical solutions of the present application may essentially be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a training device, or a network device) to perform the methods described in the embodiments of the present application.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center in a wired manner (for example, over a coaxial cable, an optical fiber, or a digital subscriber line (DSL)) or a wireless manner (for example, over infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a training device or a data center, integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive (SSD)), among others.

Claims (17)

1. A method for training a noise reduction model, comprising:
acquiring image samples to be subjected to noise reduction from a sample set, wherein the sample set comprises a plurality of image samples to be subjected to noise reduction;
performing first random downsampling processing and second random downsampling processing on the image sample to be denoised to obtain a first sub-image and a second sub-image respectively, wherein the first sub-image and the second sub-image have the same resolution;
inputting the first sub-image into a noise reduction model to obtain a first target image;
obtaining a first loss function from the first target image and the second sub-image, the first loss function being indicative of a difference between the first target image and the second sub-image;
and training the noise reduction model at least according to the first loss function to obtain a target noise reduction model.
2. The method according to claim 1, wherein the performing a first random downsampling process and a second random downsampling process on the image samples to be denoised to obtain a first sub-image and a second sub-image respectively comprises:
dividing the image sample to be denoised into M image units, wherein each image unit in the M image units comprises n x n pixels;
performing a first random selection of pixels in each of the M image units to obtain M first pixels, and obtaining the first sub-image according to the M first pixels;
and performing second random selection of pixels in each of the M image units to obtain M second pixels, and obtaining the second sub-image according to the M second pixels.
3. The method of claim 2, wherein said performing a second random selection of pixels in each of said M image cells comprises:
acquiring n x n-1 target pixels in each of the M image cells, the n x n-1 target pixels being pixels in each image cell that have not been selected when performing a first random selection of pixels;
and performing random selection of pixels in n x n-1 target pixels in each of the M image units to obtain M second pixels, wherein the M second pixels are different from the M first pixels.
4. The method of claim 3, wherein performing a random selection of pixels among n x n-1 target pixels in each of the M image cells comprises:
and randomly selecting a second pixel adjacent to the pixel selected in the first random selection from n x n-1 target pixels in each of the M image units to obtain M second pixels, wherein each of the M second pixels is adjacent to the corresponding first pixel.
5. The method according to any one of claims 1-4, further comprising:
inputting the image sample to be denoised into the denoising model to obtain a second target image;
based on the sampling position of the pixel in the first random downsampling processing, downsampling the second target image to obtain a first sub-target image;
based on the sampling position of the pixel in the second random downsampling processing, downsampling the second target image to obtain a second sub-target image;
acquiring a second loss function according to the first target image, the second sub-image, the first sub-target image and the second sub-target image;
the training of the noise reduction model according to at least the first loss function comprises:
training the noise reduction model according to at least the first loss function and the second loss function.
6. The method of claim 5, wherein training the noise reduction model based on at least the first and second loss functions comprises:
training the noise reduction model according to at least the first loss function, the first weight coefficient, the second loss function and the second weight coefficient;
wherein the first weight coefficient is used to indicate a weight of the first loss function, and the second weight coefficient is used to indicate a weight of the second loss function.
7. The method of any one of claims 1-6, wherein the noise reduction model comprises a convolutional neural network or a sparse feature expression based noise reduction model.
8. An image noise reduction method, comprising:
acquiring an image to be denoised;
inputting the image to be denoised into a target denoising model to obtain a denoised image;
the target noise reduction model is obtained by training a noise reduction model at least based on a first loss function, the first loss function is obtained based on a first target image and a second sub-image, the first loss function is used for indicating the difference between the first target image and the second sub-image, the first target image is obtained by inputting a first sub-image into the noise reduction model, the first sub-image and the second sub-image are obtained by respectively performing first random down-sampling processing and second random down-sampling processing on an image sample to be subjected to noise reduction, and the resolutions of the first sub-image and the second sub-image are the same.
9. The method according to claim 8, wherein the first sub-image is derived from M first pixels, which are derived by performing a first random selection of pixels in each of the M image units after dividing the image sample to be denoised into M image units;
the second sub-image is obtained according to M second pixels, and the M second pixels are obtained by performing second random selection of pixels in each image unit of the M image units after dividing the image sample to be denoised into the M image units.
10. The method of claim 9, wherein the M second pixels are derived by performing a random selection of pixels from n x n-1 target pixels in each of the M image cells, the n x n-1 target pixels being pixels in each image cell that were not selected when the first random selection of pixels was performed, the M second pixels being different from the M first pixels.
11. The method according to claim 10, wherein the M second pixels are obtained by randomly selecting one second pixel adjacent to the selected pixel at the first random selection among n x n-1 target pixels in each of the M image units, each of the M second pixels being adjacent to the corresponding first pixel.
12. The method according to any one of claims 8 to 11, wherein the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function and a second loss function, the second loss function is obtained according to the first target image, the second sub-image, a first sub-target image and a second sub-target image, the first sub-target image is obtained by down-sampling a second target image based on the sampling position of the pixel in the first random down-sampling process, the second sub-target image is obtained by down-sampling the second target image based on the sampling position of the pixel in the second random down-sampling process, and the second target image is obtained by inputting the sample of the image to be noise reduced into the noise reduction model.
13. The method of claim 12, wherein the target noise reduction model is obtained by training the noise reduction model based on at least the first loss function, a first weight coefficient, the second loss function, and a second weight coefficient;
wherein the first weight coefficient is used to indicate a weight of the first loss function, and the second weight coefficient is used to indicate a weight of the second loss function.
14. The method according to any one of claims 8-13, wherein the noise reduction model comprises a convolutional neural network or a sparse feature expression based noise reduction model.
15. A terminal comprising a memory and a processor; the memory stores code, the processor is configured to execute the code, and when executed, the terminal performs the method of any of claims 1 to 14.
16. A computer readable storage medium comprising computer readable instructions which, when run on a computer, cause the computer to perform the method of any of claims 1 to 14.
17. A computer program product comprising computer readable instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1 to 14.
CN202011565423.XA 2020-12-25 2020-12-25 Training method of noise reduction model and related device Pending CN112598597A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011565423.XA CN112598597A (en) 2020-12-25 2020-12-25 Training method of noise reduction model and related device
PCT/CN2021/131656 WO2022134971A1 (en) 2020-12-25 2021-11-19 Noise reduction model training method and related apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011565423.XA CN112598597A (en) 2020-12-25 2020-12-25 Training method of noise reduction model and related device

Publications (1)

Publication Number Publication Date
CN112598597A true CN112598597A (en) 2021-04-02

Family

ID=75202186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011565423.XA Pending CN112598597A (en) 2020-12-25 2020-12-25 Training method of noise reduction model and related device

Country Status (2)

Country Link
CN (1) CN112598597A (en)
WO (1) WO2022134971A1 (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110288547A (en) * 2019-06-27 2019-09-27 北京字节跳动网络技术有限公司 Method and apparatus for generating image denoising model
CN110782421A (en) * 2019-09-19 2020-02-11 平安科技(深圳)有限公司 Image processing method, image processing device, computer equipment and storage medium
CN111310903A (en) * 2020-02-24 2020-06-19 清华大学 Three-dimensional single molecule positioning system based on convolution neural network
CN111598808A (en) * 2020-05-18 2020-08-28 腾讯科技(深圳)有限公司 Image processing method, device and equipment and training method thereof
CN111768349A (en) * 2020-06-09 2020-10-13 山东师范大学 ESPI image noise reduction method and system based on deep learning
CN111882503A (en) * 2020-08-04 2020-11-03 深圳高性能医疗器械国家研究院有限公司 Image noise reduction method and application thereof
CN111951195A (en) * 2020-07-08 2020-11-17 华为技术有限公司 Image enhancement method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11234666B2 (en) * 2018-05-31 2022-02-01 Canon Medical Systems Corporation Apparatus and method for medical image reconstruction using deep learning to improve image quality in position emission tomography (PET)
CN111598804B (en) * 2020-05-12 2022-03-22 西安电子科技大学 Deep learning-based image multi-level denoising method
CN111968058B (en) * 2020-08-25 2023-08-04 北京交通大学 Low-dose CT image noise reduction method
CN112598597A (en) * 2020-12-25 2021-04-02 Huawei Technologies Co Ltd Training method of noise reduction model and related device

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022134971A1 (en) * 2020-12-25 2022-06-30 华为技术有限公司 Noise reduction model training method and related apparatus
CN113177497A (en) * 2021-05-10 2021-07-27 百度在线网络技术(北京)有限公司 Visual model training method, vehicle identification method and device
CN113177497B (en) * 2021-05-10 2024-04-12 百度在线网络技术(北京)有限公司 Training method of visual model, vehicle identification method and device
CN113611318A (en) * 2021-06-29 2021-11-05 华为技术有限公司 Audio data enhancement method and related equipment
CN113362259A (en) * 2021-07-13 2021-09-07 商汤集团有限公司 Image noise reduction processing method and device, electronic equipment and storage medium
CN113362259B (en) * 2021-07-13 2024-01-09 商汤集团有限公司 Image noise reduction processing method and device, electronic equipment and storage medium
CN113610731A (en) * 2021-08-06 2021-11-05 北京百度网讯科技有限公司 Method, apparatus and computer program product for generating an image quality enhancement model
CN113610731B (en) * 2021-08-06 2023-08-08 北京百度网讯科技有限公司 Method, apparatus and computer program product for generating image quality improvement model
CN115565212A (en) * 2022-01-20 2023-01-03 荣耀终端有限公司 Image processing method, neural network model training method and device
CN117274109A (en) * 2023-11-14 2023-12-22 荣耀终端有限公司 Image processing method, noise reduction model training method and electronic equipment
CN117274109B (en) * 2023-11-14 2024-04-23 荣耀终端有限公司 Image processing method, noise reduction model training method and electronic equipment

Also Published As

Publication number Publication date
WO2022134971A1 (en) 2022-06-30

Similar Documents

Publication Publication Date Title
CN112598597A (en) Training method of noise reduction model and related device
CN113066017B (en) Image enhancement method, model training method and equipment
CN111402130B (en) Data processing method and data processing device
CN112446270A (en) Training method of pedestrian re-identification network, and pedestrian re-identification method and device
CN111914997B (en) Method for training neural network, image processing method and device
CN113284054A (en) Image enhancement method and image enhancement device
CN113705769A (en) Neural network training method and device
CN112529150A (en) Model structure, model training method, image enhancement method and device
CN113191489B (en) Training method of binary neural network model, image processing method and device
CN113065635A (en) Model training method, image enhancement method and device
CN113065645B (en) Twin attention network, image processing method and device
US11915383B2 (en) Methods and systems for high definition image manipulation with neural networks
CN114359289A (en) Image processing method and related device
CN111951195A (en) Image enhancement method and device
WO2022111387A1 (en) Data processing method and related apparatus
CN112561028A (en) Method for training neural network model, and method and device for data processing
CN113011562A (en) Model training method and device
CN113066018A (en) Image enhancement method and related device
CN111950700A (en) Neural network optimization method and related equipment
CN115131256A (en) Image processing model, and training method and device of image processing model
CN113284055A (en) Image processing method and device
WO2021042774A1 (en) Image recovery method, image recovery network training method, device, and storage medium
CN113066125A (en) Augmented reality method and related equipment thereof
CN115623242A (en) Video processing method and related equipment thereof
CN114897728A (en) Image enhancement method and device, terminal equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination