CN110807741A - Training method of image processing network, image denoising method and device - Google Patents


Info

Publication number
CN110807741A
CN110807741A
Authority
CN
China
Prior art keywords
image
noise
predicted
processing network
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910979627.9A
Other languages
Chinese (zh)
Inventor
卓志勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910979627.9A
Publication of CN110807741A
Legal status: Pending

Classifications

    • G06T5/70
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06T3/4038 Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G06T5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T2200/32 Indexing scheme for image data processing or generation, in general, involving image mosaicing
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20221 Image fusion; Image merging

Abstract

The present disclosure provides a training method for an image processing network, an image denoising method, a training apparatus for an image processing network, an image denoising apparatus, a computer-readable storage medium, and an electronic device, and relates to the technical field of image denoising. The method includes the following steps: generating a noise image from reference noise and an original image, and performing noise extraction on the noise image to obtain predicted noise; adjusting a first network parameter of the image processing network according to a first loss function of the predicted noise and the reference noise; splicing the predicted noise and the noise image to obtain a target noise image, performing image restoration on the target noise image, and outputting a predicted image; and adjusting a second network parameter of the image processing network according to a second loss function of the predicted image and the original image. The disclosed method can, to a certain extent, overcome the problem of low training efficiency of image processing networks: it trains the image processing network on synthesized noise samples, thereby improving training efficiency.

Description

Training method of image processing network, image denoising method and device
Technical Field
The present disclosure relates to the field of image denoising technology, and in particular, to a training method and an image denoising method for an image processing network, a training device and an image denoising device for an image processing network, a computer-readable storage medium, and an electronic device.
Background
An image is a carrier of visual information from which people obtain information. During image generation and transmission, noise is often introduced, which increases the difficulty of subsequent computer image processing. The usual solution is to train an image processing network on paired samples so that the network learns the noise relationship between the samples and can strip the noise from an input, facilitating subsequent image processing. However, this training approach requires a large number of paired samples, and the network acquires denoising capability only after lengthy learning, which leads to the problem of low training efficiency.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure is directed to a training method for an image processing network, an image denoising method, a training apparatus for an image processing network, an image denoising apparatus, a computer-readable storage medium, and an electronic device, which overcome, to a certain extent, the problem of low training efficiency of an image processing network by training the network on noise samples, thereby improving training efficiency.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to a first aspect of the present disclosure, there is provided a training method of an image processing network, including:
generating a noise image according to the reference noise and the original image, and performing noise extraction on the noise image to obtain predicted noise;
calculating a first loss function of the predicted noise and the reference noise, and adjusting a first network parameter of the image processing network according to the first loss function;
splicing the predicted noise and the noise image to obtain a target noise image, performing image restoration on the target noise image, and outputting a predicted image;
and calculating a second loss function of the predicted image and the original image, and adjusting a second network parameter of the image processing network according to the second loss function.
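The four steps above can be sketched as a two-stage training step. The sketch below is illustrative only and uses assumed details: PyTorch, placeholder single-convolution subnetworks for the noise-extraction and image-restoration halves, simple additive noise synthesis (the disclosure blends via a transparency parameter), and the mean-squared-error losses named later in the disclosure.

```python
import torch
import torch.nn as nn

class NoiseExtractor(nn.Module):
    """Placeholder for the noise-extraction half of the network."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, noise_image):
        return self.net(noise_image)

class ImageRestorer(nn.Module):
    """Placeholder for the image-restoration half; its input is the
    noise image spliced (concatenated) with the predicted noise."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, target_noise_image):
        return self.net(target_noise_image)

def training_step(extractor, restorer, opt1, opt2, original, reference_noise):
    mse = nn.MSELoss()
    # Generate the noise image (additive synthesis, for brevity).
    noise_image = original + reference_noise

    # Stage 1: first loss between predicted noise and reference noise,
    # adjusting the first network parameters.
    predicted_noise = extractor(noise_image)
    loss1 = mse(predicted_noise, reference_noise)
    opt1.zero_grad()
    loss1.backward()
    opt1.step()

    # Stage 2: splice predicted noise with the noise image, restore,
    # and compute the second loss against the original image.
    target = torch.cat([noise_image, predicted_noise.detach()], dim=1)
    predicted_image = restorer(target)
    loss2 = mse(predicted_image, original)
    opt2.zero_grad()
    loss2.backward()
    opt2.step()
    return loss1.item(), loss2.item()
```

Keeping two optimizers makes the "first network parameter" and "second network parameter" adjustments explicit; in practice both halves could share one optimizer.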
In an exemplary embodiment of the present disclosure, the training method of the image processing network may further include the steps of:
and overlapping the original noise and the random noise to obtain the reference noise.
In an exemplary embodiment of the present disclosure, the original noise is watermark noise, and superposing the original noise and the random noise to obtain the reference noise may specifically include:
evenly distributing the watermark noise over a canvas of the same size as the original image to obtain an image to be synthesized;
and superposing the image to be synthesized and the random noise to obtain the reference noise.
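A minimal NumPy sketch of this synthesis, under stated assumptions: the watermark is tiled at a fixed spacing over the canvas, and the "random noise" component is Gaussian (the disclosure permits several noise types); the function name and spacing are hypothetical.

```python
import numpy as np

def make_reference_noise(watermark, canvas_shape, rng=None, sigma=0.05):
    """Tile a watermark patch evenly over a canvas the size of the
    original image (the image to be synthesized), then superpose
    random noise on it to obtain the reference noise."""
    rng = np.random.default_rng() if rng is None else rng
    canvas = np.zeros(canvas_shape, dtype=np.float32)
    ph, pw = watermark.shape[:2]
    # Distribute the watermark evenly across the canvas.
    for top in range(0, canvas_shape[0] - ph + 1, 2 * ph):
        for left in range(0, canvas_shape[1] - pw + 1, 2 * pw):
            canvas[top:top + ph, left:left + pw] = watermark
    random_noise = sigma * rng.standard_normal(canvas_shape).astype(np.float32)
    return canvas + random_noise
```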
In an exemplary embodiment of the present disclosure, the random noise includes at least one of Poisson noise, multiplicative Bernoulli noise, random-valued impulse noise, image-text watermark noise, Monte Carlo noise, and text noise.
In an exemplary embodiment of the present disclosure, the noise extraction is performed on the noise image, and the manner of obtaining the predicted noise may specifically be:
performing image convolution and nonlinear activation on the noise image to obtain a first image characteristic;
performing residual calculation on the first image characteristic a first preset number of times to obtain a second image characteristic, wherein the first preset number of times is equal to a preset number;
and performing image convolution and normalization on the second image characteristics to obtain prediction noise.
In an exemplary embodiment of the disclosure, in performing residual calculation on the first image feature for a first preset number of times to obtain the second image feature, a manner of performing residual calculation on the first image feature for one time may specifically be:
inputting the first image characteristic into a current residual error window, and performing image convolution on the first image characteristic through a first convolution layer in the current residual error window to obtain a first convolution result;
normalizing the first convolution result, and performing nonlinear activation on the normalized result;
performing image convolution on the nonlinear activation result through a second convolution layer in the current residual error window to obtain a second convolution result;
normalizing the second convolution result to obtain the output of the current residual error window;
the output of the current residual window is combined with the output of the previous residual window as the input of the next residual window.
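One residual window as described above can be sketched as follows. This is a hedged reading of the disclosure: the channel count, 3×3 kernel, batch normalization as the normalization, ReLU as the nonlinear activation, and additive combination with the window's input are all assumptions not fixed by the text.

```python
import torch
import torch.nn as nn

class ResidualWindow(nn.Module):
    """One residual window: conv -> norm -> ReLU -> conv -> norm,
    with the window's input added back to its output (the combination
    that feeds the next residual window)."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.norm1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.norm2 = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.norm1(self.conv1(x))   # first convolution + normalization
        out = self.act(out)               # nonlinear activation
        out = self.norm2(self.conv2(out)) # second convolution + normalization
        return out + x                    # combine with the window's input
```

Stacking several such windows implements the "first preset number of times" of residual calculation.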
In an exemplary embodiment of the present disclosure, the image restoration is performed on the target noise image, and the manner of outputting the predicted image may specifically be:
performing convolution and nonlinear activation on the target noise image to obtain a third image characteristic;
performing residual calculation on the target noise image a second preset number of times to obtain a fourth image characteristic, wherein the second preset number of times is greater than the first preset number of times;
performing image convolution and normalization on the fourth image characteristic, and splicing the obtained normalization result with the third image characteristic;
and performing image convolution on the splicing result to output a prediction image.
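The restoration branch described above can be sketched as follows. Everything concrete here is an assumption: PyTorch, channel widths, plain convolution blocks standing in for the residual windows, and a 6-channel input (the noise image spliced with the predicted noise). Only the dataflow mirrors the text: conv + activation gives the third feature, the residual stack gives the fourth feature, conv + normalization follows, the result is spliced with the third feature, and a final convolution outputs the predicted image.

```python
import torch
import torch.nn as nn

class Restorer(nn.Module):
    """Sketch of the image-restoration branch (all sizes hypothetical)."""
    def __init__(self, in_ch=6, feat=32, out_ch=3, num_blocks=4):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(inplace=True))
        # Plain conv blocks standing in for the residual windows.
        self.body = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(feat, feat, 3, padding=1),
                          nn.ReLU(inplace=True))
            for _ in range(num_blocks)])
        self.mid = nn.Sequential(
            nn.Conv2d(feat, feat, 3, padding=1), nn.BatchNorm2d(feat))
        self.tail = nn.Conv2d(2 * feat, out_ch, 3, padding=1)

    def forward(self, target_noise_image):
        third = self.head(target_noise_image)      # third image feature
        fourth = self.body(third)                  # residual calculations
        normed = self.mid(fourth)                  # convolution + normalization
        fused = torch.cat([normed, third], dim=1)  # splice with the third feature
        return self.tail(fused)                    # predicted image
```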
In an exemplary embodiment of the present disclosure, the noise extraction is performed on the noise image, and the manner of obtaining the predicted noise may specifically be:
superposing the original noise and the random noise to obtain standby noise;
splicing the noise image and the standby noise to obtain a target noise image;
and carrying out noise extraction on the target noise image to obtain predicted noise.
In an exemplary embodiment of the present disclosure, the manner of generating the noise image according to the reference noise and the original image may specifically be:
and adjusting the transparency parameter of the reference noise to a preset transparency range, and generating a noise image according to the adjusted reference noise and the original image.
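A one-line alpha-compositing sketch of this step; the [0, 1] transparency range and the linear blending formula are assumptions, since the disclosure only says the transparency parameter is adjusted to a preset range.

```python
import numpy as np

def blend_noise(original, reference_noise, alpha=0.3):
    """Blend the reference noise over the original image at the given
    transparency to generate the noise image (linear blend assumed)."""
    alpha = float(np.clip(alpha, 0.0, 1.0))  # keep alpha in a preset range
    return (1.0 - alpha) * original + alpha * reference_noise
```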
In an exemplary embodiment of the present disclosure, the first loss function is the mean squared error between the predicted noise and the reference noise, and the second loss function is the mean squared error between the predicted image and the original image.
In an exemplary embodiment of the present disclosure, the manner of stitching the predicted noise and the noise image to obtain the target noise image may specifically be:
and splicing the character string corresponding to the predicted noise and the character string corresponding to the noise image to obtain the target noise image.
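The "character string" wording above appears to be a literal rendering of the splicing operation; in practice it likely corresponds to concatenating the underlying pixel data. A hedged sketch, assuming channel-wise concatenation (which matches the doubled input width of the restoration stage):

```python
import numpy as np

def splice_to_target(predicted_noise, noise_image):
    """Splice the predicted noise and the noise image into the target
    noise image. Channel-wise concatenation is an assumption."""
    assert predicted_noise.shape == noise_image.shape
    return np.concatenate([noise_image, predicted_noise], axis=-1)
```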
According to a second aspect of the present disclosure, there is provided an image denoising method, including:
generating a noise image according to the reference noise and the sample image, and performing noise extraction on the noise image to obtain predicted noise;
calculating a first loss function of the predicted noise and the reference noise, and adjusting a first network parameter of the image processing network according to the first loss function;
splicing the predicted noise and the noise image to obtain a target noise image, performing image restoration on the target noise image, and outputting a predicted image;
calculating a second loss function of the predicted image and the sample image, and adjusting a second network parameter of the image processing network according to the second loss function;
and carrying out image processing according to the image processing network of the adjusted first network parameter and second network parameter.
According to a third aspect of the present disclosure, there is provided a training apparatus for an image processing network, including a noise extraction unit, a parameter adjustment unit, and an image restoration unit, wherein:
the noise extraction unit is used for generating a noise image according to the reference noise and the original image and extracting noise of the noise image to obtain predicted noise;
the parameter adjusting unit is used for calculating a first loss function of the prediction noise and the reference noise and adjusting a first network parameter of the image processing network according to the first loss function;
the image restoration unit is used for splicing the predicted noise and the noise image to obtain a target noise image, restoring the target noise image and outputting a predicted image;
and the parameter adjusting unit is also used for calculating a second loss function of the predicted image and the original image and adjusting a second network parameter of the image processing network according to the second loss function.
In an exemplary embodiment of the present disclosure, the training apparatus of the image processing network may further include a noise superposition unit, wherein:
the noise superposition unit is used for superposing the original noise and the random noise to obtain the reference noise.
In an exemplary embodiment of the present disclosure, the original noise is watermark noise, and the manner in which the noise superposition unit superposes the original noise and the random noise to obtain the reference noise may specifically be:
the noise superposition unit evenly distributes the watermark noise over a canvas of the same size as the original image to obtain an image to be synthesized;
and the noise superposition unit superposes the image to be synthesized and the random noise to obtain the reference noise.
In an exemplary embodiment of the present disclosure, the random noise includes at least one of Poisson noise, multiplicative Bernoulli noise, random-valued impulse noise, image-text watermark noise, Monte Carlo noise, and text noise.
In an exemplary embodiment of the disclosure, the noise extraction unit performs noise extraction on the noise image, and the manner of obtaining the predicted noise may specifically be:
the noise extraction unit performs image convolution and nonlinear activation on the noise image to obtain a first image characteristic;
the noise extraction unit carries out residual calculation on the first image characteristic for a first preset number of times to obtain a second image characteristic; wherein the first preset times are equal to the preset number;
and the noise extraction unit performs image convolution and normalization on the second image characteristic to obtain predicted noise.
In an exemplary embodiment of the disclosure, in the performing, by the noise extraction unit, residual calculation for a first preset number of times on the first image feature to obtain the second image feature, a manner of performing residual calculation for the first image feature once may specifically be:
the noise extraction unit inputs the first image characteristic into a current residual error window, and performs image convolution on the first image characteristic through a first convolution layer in the current residual error window to obtain a first convolution result;
the noise extraction unit normalizes the first convolution result and nonlinearly activates the normalized result;
the noise extraction unit performs image convolution on the nonlinear activation result through a second convolution layer in the current residual error window to obtain a second convolution result;
the noise extraction unit normalizes the second convolution result to obtain the output of the current residual window;
the noise extraction unit combines the output of the current residual window with the output of the previous residual window as the input of the next residual window.
In an exemplary embodiment of the present disclosure, the image restoration unit performs image restoration on the target noise image, and the manner of outputting the predicted image may specifically be:
the image restoration unit performs convolution and nonlinear activation on the target noise image to obtain a third image characteristic;
the image restoration unit performs residual calculation on the target noise image a second preset number of times to obtain a fourth image characteristic, wherein the second preset number of times is greater than the first preset number of times;
the image restoration unit performs image convolution and normalization on the fourth image characteristic, and splices the obtained normalization result with the third image characteristic;
and the image restoration unit performs image convolution on the splicing result to output a prediction image.
In an exemplary embodiment of the disclosure, the noise extraction unit performs noise extraction on the noise image, and the manner of obtaining the predicted noise may specifically be:
the noise extraction unit superposes the original noise and the random noise to obtain standby noise;
the noise extraction unit splices the noise image and the standby noise to obtain a target noise image;
the noise extraction unit extracts noise from the target noise image to obtain predicted noise.
In an exemplary embodiment of the disclosure, the way in which the noise extraction unit generates the noise image according to the reference noise and the original image may specifically be:
the noise extraction unit adjusts the transparency parameter of the reference noise to be within a preset transparency range, and generates a noise image according to the adjusted reference noise and the original image.
In an exemplary embodiment of the present disclosure, the first loss function is the mean squared error between the predicted noise and the reference noise, and the second loss function is the mean squared error between the predicted image and the original image.
In an exemplary embodiment of the disclosure, the manner in which the image restoration unit splices the predicted noise and the noise image to obtain the target noise image may specifically be:
and the image restoration unit splices the character string corresponding to the predicted noise and the character string corresponding to the noise image to obtain the target noise image.
According to a fourth aspect of the present disclosure, there is provided an image denoising apparatus including a noise extraction unit, a parameter adjustment unit, an image restoration unit, and an image denoising unit, wherein:
the noise extraction unit is used for generating a noise image according to the reference noise and the sample image and extracting noise of the noise image to obtain predicted noise;
the parameter adjusting unit is used for calculating a first loss function of the prediction noise and the reference noise and adjusting a first network parameter of the image processing network according to the first loss function;
the image restoration unit is used for splicing the predicted noise and the noise image to obtain a target noise image, restoring the target noise image and outputting a predicted image;
the parameter adjusting unit is also used for calculating a second loss function of the predicted image and the sample image and adjusting a second network parameter of the image processing network according to the second loss function;
and the image denoising unit is used for carrying out image processing according to the image processing network of the adjusted first network parameter and the second network parameter.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the method of any one of the above via execution of the executable instructions.
According to a sixth aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any one of the above.
Exemplary embodiments of the present disclosure may have some or all of the following benefits:
in the training method of the image processing network provided in an example embodiment of the present disclosure, a noise image may be generated from reference noise (e.g., watermark noise with random noise added) and an original image, and noise extraction may be performed on the noise image to obtain predicted noise; a first loss function of the predicted noise and the reference noise may then be calculated, and a first network parameter of the image processing network adjusted accordingly; the predicted noise and the noise image may further be spliced to obtain a target noise image, which is restored to output a predicted image; finally, a second loss function of the predicted image and the original image may be calculated, and a second network parameter of the image processing network adjusted accordingly. In one respect, this scheme overcomes, to a certain extent, the problem of low training efficiency: training the image processing network on generated noise images improves training efficiency. In another respect, generating the noise images and adjusting the first network parameter from them improves data-preparation efficiency before the image restoration training, and training on noise images reduces the network's dependence on clean samples. In yet another respect, an image processing network trained on noise samples can improve the image restoration effect.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
FIG. 1 is a diagram illustrating an exemplary system architecture of a training method, an image denoising method, a training apparatus and an image denoising apparatus of an image processing network to which embodiments of the present disclosure may be applied;
FIG. 2 illustrates a schematic structural diagram of a computer system suitable for use with the electronic device used to implement embodiments of the present disclosure;
FIG. 3 schematically shows a flow diagram of a training method of an image processing network according to one embodiment of the present disclosure;
FIG. 4 schematically shows a schematic diagram of images to be synthesized according to one embodiment of the present disclosure;
FIG. 5 schematically illustrates a diagram of reference noise according to one embodiment of the present disclosure;
FIG. 6 schematically shows a schematic diagram of an original image according to one embodiment of the present disclosure;
FIG. 7 schematically shows a schematic diagram of a noise image according to one embodiment of the present disclosure;
FIG. 8 schematically shows an architecture diagram of performing noise extraction on a noise image to obtain predicted noise according to one embodiment of the present disclosure;
FIG. 9 schematically shows a structural diagram of a residual window according to an embodiment of the present disclosure;
FIG. 10 schematically shows an architecture diagram of performing image restoration on a target noise image to output a predicted image according to one embodiment of the present disclosure;
FIG. 11 schematically shows an architecture diagram of an image processing network according to one embodiment of the present disclosure;
FIG. 12 schematically illustrates a flow diagram of an image denoising method according to one embodiment of the present disclosure;
FIG. 13 schematically shows a block diagram of a training apparatus of an image processing network in an embodiment in accordance with the present disclosure;
FIG. 14 schematically shows a block diagram of an image denoising apparatus according to an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 1 is a schematic diagram illustrating an example system architecture of a training method, an image denoising method, a training device of an image processing network, and an image denoising device, to which an embodiment of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few. The terminal devices 101, 102, 103 may be various electronic devices having a display screen, including but not limited to desktop computers, portable computers, smart phones, tablet computers, and the like. It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
The training method and the image denoising method of the image processing network provided by the embodiment of the present disclosure are generally performed by the server 105, and accordingly, the training device and the image denoising device of the image processing network are generally disposed in the server 105. However, it is easily understood by those skilled in the art that the training method and the image denoising method of the image processing network provided in the embodiment of the present disclosure may also be executed by the terminal devices 101, 102, and 103, and accordingly, the training apparatus and the image denoising apparatus of the image processing network may also be disposed in the terminal devices 101, 102, and 103, which is not particularly limited in the present exemplary embodiment. For example, in an exemplary embodiment, the server 105 may generate a noise image according to the reference noise and the original image, and perform noise extraction on the noise image to obtain a predicted noise; and calculating a first loss function of the predicted noise and the reference noise, and adjusting a first network parameter of the image processing network according to the first loss function; the predicted noise and the noise image can be spliced to obtain a target noise image, the target noise image is subjected to image restoration, and a predicted image is output; and a second loss function of the predicted image and the original image can be calculated, and a second network parameter of the image processing network can be adjusted according to the second loss function. In addition, the server 105 may perform image processing according to the image processing network of the adjusted first network parameter and second network parameter.
FIG. 2 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present disclosure.
It should be noted that the computer system 200 of the electronic device shown in fig. 2 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in fig. 2, the computer system 200 includes a Central Processing Unit (CPU)201 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)202 or a program loaded from a storage section 208 into a Random Access Memory (RAM) 203. In the RAM 203, various programs and data necessary for system operation are also stored. The CPU201, ROM 202, and RAM 203 are connected to each other via a bus 204. An input/output (I/O) interface 205 is also connected to bus 204.
The following components are connected to the I/O interface 205: an input portion 206 including a keyboard, a mouse, and the like; an output section 207 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 208 including a hard disk and the like; and a communication section 209 including a network interface card such as a LAN card, a modem, or the like. The communication section 209 performs communication processing via a network such as the internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 210 as necessary, so that a computer program read out therefrom is mounted into the storage section 208 as necessary.
In particular, the processes described below with reference to the flowcharts may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 209 and/or installed from the removable medium 211. The computer program, when executed by a Central Processing Unit (CPU)201, performs various functions defined in the methods and apparatus of the present application. In some embodiments, the computer system 200 may further include an AI (artificial intelligence) processor for processing computing operations related to machine learning.
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence research covers the design principles and implementation methods of various intelligent machines, giving machines the ability to perceive, reason, and make decisions.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big-data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how a computer can simulate or implement human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence, the fundamental way to endow computers with intelligence, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
In the era of traditional machine learning, practitioners needed to carefully design how to extract useful features from data, design an objective function for a specific task, and then build a machine learning system using general-purpose optimization algorithms. After the rise of deep learning, features are largely no longer hand-crafted; instead, neural networks learn useful features automatically. With generative adversarial networks, a carefully designed objective function is no longer needed in many scenarios.
The technical solution of the embodiment of the present disclosure is explained in detail below:
The purpose of image denoising is to recover a noise-free original image from a noisy image while preserving as much of the image's detail as possible. Existing image denoising methods usually rely on clean samples to guarantee the denoising effect. In practical applications, however, noise-free samples are rarely available for training, and producing high-quality clean samples is expensive, which increases the difficulty of network training and noise removal.
Based on one or more of the problems described above, the present example embodiment provides a training method of an image processing network. The training method of the image processing network may be applied to the server 105, and may also be applied to one or more of the terminal devices 101, 102, and 103, which is not particularly limited in this exemplary embodiment. Referring to fig. 3, the training method of the image processing network may include the following steps S310 to S340:
step S310: and generating a noise image according to the reference noise and the original image, and performing noise extraction on the noise image to obtain predicted noise.
Step S320: a first loss function of the predicted noise and the reference noise is calculated, and a first network parameter of the image processing network is adjusted according to the first loss function.
Step S330: and splicing the predicted noise and the noise image to obtain a target noise image, performing image restoration on the target noise image, and outputting a predicted image.
Step S340: and calculating a second loss function of the predicted image and the original image, and adjusting a second network parameter of the image processing network according to the second loss function.
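The four steps above can be sketched as a single training iteration in plain Python. This is an illustrative outline only: the helper names (`extract_noise`, `restore_image`) are hypothetical placeholders for the network's two stages, pixel data is modeled as flat lists, and both losses are plain mean squared errors as described in steps S320 and S340.

```python
# Sketch of one training iteration for the two-stage image processing
# network (steps S310-S340). All callables are hypothetical placeholders.

def mse(a, b):
    # mean of squared per-element differences
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def train_step(original, reference_noise, extract_noise, restore_image):
    # S310: synthesize a noisy sample and predict its noise component
    noise_image = [p + n for p, n in zip(original, reference_noise)]
    predicted_noise = extract_noise(noise_image)
    # S320: first loss compares predicted noise with reference noise
    loss1 = mse(predicted_noise, reference_noise)
    # S330: splice predicted noise with the noise image, then restore
    target = noise_image + predicted_noise
    predicted_image = restore_image(target)
    # S340: second loss compares the restored image with the original
    loss2 = mse(predicted_image, original)
    return loss1, loss2
```

In the real network the two stages are convolutional sub-networks, and the two losses drive the adjustment of the first and second network parameters, respectively.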
It should be noted that the embodiments of the present disclosure may be applied to automatic denoising for mobile phone cameras, denoising of medical images, and removal of image watermarks in Internet-shared data sets. In the medical field, multiple magnetic resonance images are usually taken over a long period to guarantee a high-definition result. When a patient is in an emergency, a long magnetic resonance scan can worsen the patient's condition and delay the optimal treatment window. Using the image processing network trained according to the embodiments of the present disclosure for image denoising can improve the efficiency of acquiring magnetic resonance images that meet the required definition, provide doctors with a clearer basis for diagnosing injuries, and buy more treatment time for patients. On the other hand, as mobile phone photographing performance keeps improving, users' expectations for photo quality rise accordingly; the image processing network trained according to the embodiments of the present disclosure can improve the display quality of images, improving the user experience and user stickiness to a certain extent.
The above steps of the present exemplary embodiment will be described in more detail below.
In step S310, a noise image is generated from the reference noise and the original image, and noise extraction is performed on the noise image to obtain predicted noise.
In this exemplary embodiment, the format of the original image may be any one of bmp, jpg, png, tif, gif, pcx, tga, exif, fpx, svg, psd, cdr, pcd, dxf, ufo, eps, ai, raw, WMF, and webp, and the embodiment of the present disclosure is not limited thereto.
In this example embodiment, the format of the noise image may be the RGBA format, where RGBA is a color space comprising Red, Green, Blue, and an alpha (α) channel. RGBA adds an α channel on top of the RGB model; the color portion is ordinary RGB and may belong to any RGB color space. The α channel is generally used as an opacity parameter: if a pixel's α value is 0%, the pixel is completely transparent; if its α value is 100%, the pixel is completely opaque. Values between 0% and 100% allow the pixel to show through the background, as if through glass, an effect that simple binary transparency (i.e., transparent or opaque) cannot achieve. In addition, α values may be expressed as percentages, integers, or real numbers from 0 to 1.
For example, consider the hexadecimal colors black #000000 and white #FFFFFF. In RGB notation these are RGB(0, 0, 0) and RGB(255, 255, 255); the two representations are identical in nature and differ only in form, one being hexadecimal and the other decimal.
The RGB model obtains various colors by changing three color channels of red, green and blue and superimposing the three color channels, RGB represents the colors of the three channels of red, green and blue, and this standard includes almost all colors that can be perceived by human vision, and is one of the most widely used color systems at present.
In this exemplary embodiment, optionally, the original noise is watermark noise, and the manner of obtaining the reference noise by superimposing the original noise and the random noise may specifically be:
distributing watermark noise in canvas with the same size as the original image to obtain an image to be synthesized;
and superposing the image to be synthesized and the random noise to obtain the reference noise.
In this example implementation, the watermark noise may include at least one of a text and a graphic, and the embodiment of the present disclosure is not limited. In addition, the same canvas size as the original image size is understood to mean that if the original image size is 1024 × 768, the size of the canvas should also be 1024 × 768, and the color of the canvas may be a preset color (e.g., black). The watermark noise is evenly distributed in a canvas with the same size as the original image, and it can also be understood that the same watermark noise is tiled in the canvas with the same size as the original image, and the canvas comprises a plurality of watermark noises.
In addition, optionally, the manner of obtaining the image to be synthesized may also be: and randomly distributing watermark noise in the canvas with the same size as the original image to obtain the image to be synthesized. At this point, the canvas includes a watermark noise. Or, the watermark noise is set at a target position of a canvas (e.g., the lower right corner of the canvas) with the same size as the original image, so as to obtain an image to be synthesized, where the target position includes an abscissa and an ordinate. At this point, the canvas includes a watermark noise.
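As a sketch of the canvas layouts described above, the following hypothetical helper computes the top-left coordinates at which a watermark of size `ww` × `wh` would be tiled across a canvas matching the original image size (shown here with no gap between tiles; the even spacing described in the text is analogous).

```python
# Illustrative only: compute top-left positions for tiling a watermark
# of size (ww, wh) across a canvas of size (canvas_w, canvas_h).
def tile_positions(canvas_w, canvas_h, ww, wh):
    return [(x, y)
            for y in range(0, canvas_h - wh + 1, wh)
            for x in range(0, canvas_w - ww + 1, wh)]
```

For the single-watermark variants, the list would instead contain one random position or one fixed target position (e.g., near the lower right corner).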
In the present exemplary embodiment, the watermark noise may be called from a database storing various types of watermark noise, or may be customized; the method for customizing the watermark noise may specifically be: instantiating the watermark text, and setting the attribute of the watermark text to complete the customization of the watermark noise, where the attribute may include an abscissa, an ordinate, a transparency, a watermark text color, a watermark font, a font shadow, and the like, and the embodiment of the present disclosure is not limited.
For example, please refer to fig. 4 and 5. Fig. 4 schematically shows an image to be synthesized according to an embodiment of the present disclosure, and fig. 5 schematically shows reference noise according to an embodiment of the present disclosure. The image to be synthesized comprises evenly distributed watermark noise, with equal spacing between the watermark instances. The reference noise is obtained by superimposing the image to be synthesized and random noise.
The random noise comprises at least one of Poisson noise, multiplicative Bernoulli noise, random-value impulse noise, image text watermark noise, Monte Carlo noise and text noise.
Specifically, random noise may also be transient noise, that is, a random fluctuation of a signal over time. Thus, the random noise may also include thermal noise, shot noise, flicker noise, and the like. The random movement of electrons in a resistor causes fluctuations in charge potential and gate voltage, producing thermal noise (also called white noise). Shot noise is generated when current flows through a potential barrier; in an image sensor, shot noise is related to incident photons and dark current and follows a Poisson distribution. Flicker noise (i.e., 1/f noise), also called low-frequency noise or current noise, has a magnitude inversely proportional to frequency.
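A stdlib-only sketch of two of the noise families mentioned above, under simplifying assumptions: Gaussian samples stand in for thermal (white) noise, and random-value impulse noise replaces a fraction of pixels with uniformly random values. These are illustrative models, not the patent's exact generators.

```python
import random

# Hypothetical, simplified noise generators for illustration.
def white_noise(n, sigma=10.0, seed=0):
    # zero-mean Gaussian samples approximating thermal (white) noise
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def impulse_noise(pixels, prob=0.05, seed=0):
    # random-value impulse noise: each pixel is replaced with a
    # uniformly random value with probability `prob`
    rng = random.Random(seed)
    return [rng.uniform(0, 255) if rng.random() < prob else p
            for p in pixels]
```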
Therefore, by implementing the optional implementation mode, the reference noise which is used for being superimposed in the original image can be generated so as to reduce the dependence of the image processing network on a clean image sample, and further, the sample amount of the image processing network is reduced under the condition of ensuring the image denoising effect so as to improve the training efficiency of the image processing network.
In this exemplary embodiment, optionally, the manner of generating the noise image according to the reference noise and the original image may specifically be:
and adjusting the transparency parameter of the reference noise to a preset transparency range, and generating a noise image according to the adjusted reference noise and the original image.
In this exemplary embodiment, the manner of generating the noise image according to the adjusted reference noise and the original image may specifically be: calculating a first product of the color vector of the original image and the first factor value and a second product of the color vector of the reference noise and the second factor value, and calculating a sum of the first product and the second product to determine a noise image from the sum; the first factor value is used for specifying the influence degree of the transparency parameter on the color vector of the original image, the second factor value is used for specifying the influence degree of the transparency parameter on the color vector of the reference noise, the color vector of the original image is used for representing the color of the original image, and the color vector of the reference noise is used for representing the color of the reference noise.
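Assuming the two factor values are the standard alpha-compositing weights, (1 − α) for the original image and α for the reference noise, the per-pixel blend described above can be sketched as:

```python
# Illustrative per-pixel blend: noise image = (original colour x first
# factor) + (reference-noise colour x second factor), assuming standard
# alpha compositing with transparency parameter alpha.
def blend_pixel(orig, noise, alpha):
    return tuple((1.0 - alpha) * o + alpha * n
                 for o, n in zip(orig, noise))
```

With α = 0.5, for instance, an original pixel and a noise pixel blend to their average; α drawn from the preset transparency range yields noise of varying visibility.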
In this exemplary embodiment, the adjusting the transparency parameter of the reference noise to the preset transparency range may specifically be: and randomly setting any transparency value in the preset transparency range as a transparency parameter of the reference noise, so that the robustness of the image processing network can be improved.
For example, referring to fig. 6 and 7, fig. 6 schematically shows a schematic diagram of an original image according to an embodiment of the present disclosure, and fig. 7 schematically shows a schematic diagram of a noisy image according to an embodiment of the present disclosure. Specifically, the noise image shown in fig. 7 can be obtained by superimposing the reference noise with the adjusted transparency on fig. 6, and it can be seen that the noise image shown in fig. 7 includes a plurality of watermark noises with the transparency parameters distributed evenly within the preset transparency range.
Therefore, by implementing the optional implementation mode, the noise transparency can be adjusted, so that the noise image is generated according to the noise with different transparencies, and the robustness of the image processing network is improved.
In this exemplary embodiment, optionally, the training method for the image processing network may further include the following steps:
and superimposing the original noise and the random noise to obtain the reference noise.
In this exemplary embodiment, the manner of obtaining the reference noise by overlapping the original noise and the random noise may specifically be: and splicing the character strings of any noise in the original noise and the random noise to obtain the reference noise.
In this exemplary embodiment, optionally, the noise extraction may be performed on the noise image to obtain the predicted noise specifically by:
superposing the original noise and the random noise to obtain standby noise;
splicing the noise image and the standby noise to obtain a target noise image;
and carrying out noise extraction on the target noise image to obtain predicted noise.
In this example embodiment, the backup noise may be the same as or different from the reference noise, and the embodiment of the present disclosure is not limited.
In this exemplary embodiment, the manner of splicing the noise image and the standby noise to obtain the target noise image may specifically be: and merging the array corresponding to the noise image and the array corresponding to the standby noise, and returning the merged array to obtain the target noise image.
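The merge-and-return operation described above amounts to a plain concatenation of the two arrays; a minimal sketch with flat Python lists:

```python
# Illustrative splice: merge the array corresponding to the noise image
# with the array corresponding to the standby noise and return the
# merged array as the target noise image.
def splice(noise_image, standby_noise):
    return list(noise_image) + list(standby_noise)
```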
Therefore, by implementing the optional implementation mode, the target noise image can be obtained by splicing the noise image and the standby noise, and the noise extraction effect can be improved by training the noise extraction module in the image processing network through the target noise image.
In this exemplary embodiment, optionally, the noise extraction may be performed on the noise image to obtain the predicted noise specifically by:
performing image convolution and nonlinear activation on the noise image to obtain a first image characteristic;
performing residual calculation on the first image characteristic for a first preset number of times to obtain a second image characteristic; wherein the first preset times are equal to the preset number;
and performing image convolution and normalization on the second image characteristics to obtain prediction noise.
Further optionally, in the step of performing residual calculation on the first image feature for a first preset number of times to obtain the second image feature, the manner of performing residual calculation on the first image feature for one time may specifically be:
inputting the first image characteristic into a current residual error window, and performing image convolution on the first image characteristic through a first convolution layer in the current residual error window to obtain a first convolution result;
normalizing the first convolution result, and performing nonlinear activation on the normalized result;
performing image convolution on the nonlinear activation result through a second convolution layer in the current residual error window to obtain a second convolution result;
normalizing the second convolution result to obtain the output of the current residual error window;
the output of the current residual window is combined with the output of the previous residual window as the input of the next residual window.
In this exemplary embodiment, the manner of performing image convolution and nonlinear activation on the noise image may specifically be: convolving the noise image with a preset convolution kernel to obtain image features, and nonlinearly activating the image features through a PReLU function. The PReLU function is defined as:

$$\mathrm{PReLU}(x_i) = \begin{cases} x_i, & x_i > 0 \\ a_i x_i, & x_i \le 0 \end{cases}$$

where $i$ denotes the channel index, $a_i$ denotes the slope of channel $i$ for negative inputs, and $x_i$ is the feature value of the image feature input on channel $i$.
In addition to the PReLU function, the image features may also be nonlinearly activated by a ReLU, LReLU, CReLU, ELU, or SELU function, and the embodiments of the present disclosure are not limited in this respect. The ReLU, LReLU, PReLU, CReLU, ELU, and SELU functions are all activation functions.
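The PReLU activation described above can be sketched per channel as follows (assuming one slope per channel, applied only to negative feature values):

```python
# Illustrative PReLU: identity for positive inputs, channel-specific
# slope a_i for negative inputs.
def prelu(xs, slopes):
    return [x if x >= 0 else a * x for x, a in zip(xs, slopes)]
```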
In this exemplary embodiment, the manner of performing image convolution and normalization on the second image feature may specifically be: convolving the second image feature with a preset convolution kernel to obtain an image feature, normalizing that image feature through a Batch Norm layer, and convolving the normalization result with a preset convolution kernel. Batch Norm can be understood as a form of regularization, converting image features into values that can be used in the next step; normalizing image features with Batch Norm can improve the learning efficiency of the network.
Specifically, the Batch Norm layer normalizes image features through the following equations. Assume the feature values corresponding to the input image features are $\beta = \{x_{1 \dots m}\}$, $m$ values in total, and the output is $y_i = \mathrm{BN}(x_i)$, where $i$ is a positive integer. The mean $\mu_\beta$ of the data $x$ is computed by Equation 1:

$$\mu_\beta = \frac{1}{m}\sum_{i=1}^{m} x_i \tag{1}$$

Further, the variance $\sigma_\beta^2$ of the batch is computed by Equation 2:

$$\sigma_\beta^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \mu_\beta\right)^2 \tag{2}$$

Further, the data $x$ are normalized by Equation 3 to obtain $\hat{x}_i$:

$$\hat{x}_i = \frac{x_i - \mu_\beta}{\sqrt{\sigma_\beta^2 + \epsilon}} \tag{3}$$

where $\epsilon$ is a small real number. Finally, the normalized result is scaled and shifted by the scaling variable $\gamma$ and the translation variable $\eta$ in Equation 4 to obtain the output $y_i$:

$$y_i = \gamma \hat{x}_i + \eta \tag{4}$$
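A stdlib sketch of the four Batch Norm steps just described: batch mean, batch variance, ε-stabilized normalization, then scaling by γ and shifting by η.

```python
import math

# Illustrative Batch Norm over one batch of scalar feature values.
def batch_norm(xs, gamma=1.0, shift=0.0, eps=1e-5):
    m = len(xs)
    mu = sum(xs) / m                                       # eq. 1: mean
    var = sum((x - mu) ** 2 for x in xs) / m               # eq. 2: variance
    xhat = [(x - mu) / math.sqrt(var + eps) for x in xs]   # eq. 3: normalize
    return [gamma * x + shift for x in xhat]               # eq. 4: scale/shift
```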
In this exemplary embodiment, another optional way of extracting noise from a noise image to obtain predicted noise may specifically be:
modifying the number of residual windows to a preset number (e.g., 6);
and performing noise extraction on the noise image according to the modified residual error window to obtain predicted noise.
Referring to fig. 8, fig. 8 schematically illustrates an architecture diagram of noise extraction for a noise image to obtain predicted noise according to an embodiment of the present disclosure, including: a noise image 801, image convolution 802, nonlinear activation 803, residual window 1 (804), residual window 2 (805), residual window 3 (806), residual window 4 (807), residual window 5 (808), residual window 6 (809), image convolution 810, feature normalization 811, image convolution 812, and predicted noise 813.
Specifically, the input noise image 801 may be subjected to image convolution 802 and nonlinear activation 803 to obtain a first image feature. Further, residual calculation is performed on the first image feature a first preset number of times, here six, that is, residual calculation is performed on the first image feature sequentially through residual window 1 (804), residual window 2 (805), residual window 3 (806), residual window 4 (807), residual window 5 (808), and residual window 6 (809), to obtain a second image feature. Further, the second image feature may be subjected to image convolution 810, feature normalization 811, and image convolution 812 to obtain the predicted noise 813. The convolution kernels of image convolution 802, image convolution 810, and image convolution 812 may be the same or different, and their channel counts may likewise be the same or different; the embodiments of the present disclosure are not limited in this respect.
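The fig. 8 flow can be sketched compositionally, with every stage reduced to a placeholder callable and each residual window reduced to "add the window's input back to its transformed output". This is a structural illustration only, not the actual convolutional network.

```python
# Structural sketch of the noise-extraction branch (fig. 8):
# conv + activation, six residual windows in sequence, then
# conv -> normalize -> conv. conv, act, norm are placeholder callables.
def extract_noise(noise_image, conv, act, norm, n_windows=6):
    feat = act(conv(noise_image))
    for _ in range(n_windows):
        # residual window, reduced to: transform, then add the input back
        feat = [f + r for f, r in zip(feat, conv(feat))]
    return conv(norm(conv(feat)))
```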
Referring to fig. 9 in conjunction with fig. 8, fig. 9 schematically illustrates a structural diagram of a residual window according to an embodiment of the present disclosure. As shown in fig. 9, the residual window includes an image convolution layer 901, a feature normalization layer 902, a nonlinear activation layer 903, an image convolution layer 904, and a feature normalization layer 905.
Residual window 1 (804), residual window 2 (805), residual window 3 (806), residual window 4 (807), residual window 5 (808), and residual window 6 (809) in fig. 8 may all be represented by the residual window in fig. 9. Specifically, the first image feature may be input into the current residual window, and image convolution may be performed on it through the image convolution layer 901 (i.e., the first convolution layer) in the current residual window to obtain a first convolution result, where the convolution kernel of the image convolution layer 901 is 3 × 1 and its number of convolution channels is 64. Further, the first convolution result may be normalized by the feature normalization layer 902, and the normalized result may be nonlinearly activated by the nonlinear activation layer 903. Further, image convolution may be performed on the nonlinear activation result by the image convolution layer 904 (i.e., the second convolution layer) in the current residual window to obtain a second convolution result, where the convolution kernel of the image convolution layer 904 is 3 × 1 and its number of convolution channels is 64. Further, the second convolution result may be normalized by the feature normalization layer 905 to obtain the output of the current residual window; then, the output of the current residual window is combined with the output of the previous residual window to serve as the input of the next residual window, until all residual windows have been executed.
The scaling variable γ and the translation variable η in the feature normalization layer 902 and the feature normalization layer 905 may be the same or different, the convolution kernels of the image convolution layer 901 and the image convolution layer 904 may be the same or different, and the number of channels of the image convolution layer 901 and the image convolution layer 904 may be the same or different, which is not limited in the embodiment of the present disclosure.
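Inside one residual window (fig. 9), the order of operations can be sketched as below; `conv1`, `conv2`, `norm`, and `act` are placeholder callables standing in for the convolution, normalization, and activation layers.

```python
# Structural sketch of one residual window (fig. 9):
# conv -> norm -> activation -> conv -> norm, then the window's input
# is added back (the skip connection feeding the next window).
def residual_window(x, conv1, conv2, norm, act):
    out = norm(conv2(act(norm(conv1(x)))))
    return [a + b for a, b in zip(out, x)]
```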
Therefore, by implementing the optional implementation mode, noise extraction can be performed on the noise image through the residual error network, so that the probability of gradient explosion can be reduced, and the training efficiency of the image processing network is improved.
In step S320, a first loss function of the predicted noise and the reference noise is calculated, and a first network parameter of the image processing network is adjusted according to the first loss function.
In the present exemplary embodiment, the first network parameter is used to predict the reference noise in the noise image, while the second network parameter below is used to predict the original image in the target noise image. The image processing network comprises both the first network parameter and the second network parameter; in the image restoration process, the noise is first predicted according to the first network parameter, and the original image is then predicted according to the second network parameter, that is, there is an order of application between the first network parameter and the second network parameter.
The present exemplary embodiment minimizes the first network parameter of the image processing network through the expression $\min_{\theta} E\big(f_\theta(z_i);\, n_i\big)$, where $\theta$ is the first network parameter, $z_i$ is the mean corresponding to the input noise image, $n_i$ is the mean corresponding to the predicted noise, and $i$ is a positive integer.
In the present exemplary embodiment, the first loss function is a mean square error of the prediction noise and the reference noise.
In this exemplary embodiment, the Mean Square Error (MSE) is a regression loss function computed as the mean of the squared distances between the predicted values and the true values:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

In the first loss function, $n$ is a positive integer, $y_i$ is the value of the $i$-th pixel corresponding to the reference noise, and $\hat{y}_i$ is the value of the $i$-th pixel corresponding to the predicted noise.
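The first loss function reduces to a short sketch in plain Python:

```python
# Illustrative MSE: mean of squared per-pixel differences between the
# reference noise and the predicted noise.
def mse_loss(reference, predicted):
    n = len(reference)
    return sum((y - p) ** 2 for y, p in zip(reference, predicted)) / n
```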
In step S330, the prediction noise is merged with the noise image to obtain a target noise image, and the target noise image is subjected to image restoration to output a prediction image.
The present exemplary embodiment minimizes the second network parameter of the image processing network based on $\min E\big(f_\theta(z); x_0\big)$; for the specific manner, see step S330. In $\min E\big(f_\theta(z); x_0\big)$, $\theta$ is the second network parameter, $x_0$ is the predicted noise, $z$ is the target noise image, and $f_\theta(z)$ is the predicted noise image.
In this exemplary embodiment, the manner of splicing the predicted noise and the noise image to obtain the target noise image may specifically be:
and splicing the character string corresponding to the predicted noise and the character string corresponding to the noise image to obtain the target noise image.
For example, if the character string corresponding to the predicted noise is [1,2,2] and the character string corresponding to the noise image is [4,5,6], then the obtained target noise image may be [1,2,2,4,5,6 ].
Therefore, by implementing the optional implementation mode, the learning efficiency of the image processing network can be improved by splicing the noise image and the prediction noise, and the denoising effect of the image is improved.
In this exemplary embodiment, optionally, the image restoration may be performed on the target noise image, and the manner of outputting the predicted image may specifically be:
performing convolution and nonlinear activation on the target noise image to obtain a third image characteristic;
performing residual calculation on the target noise image for a second preset number of times to obtain a fourth image characteristic; wherein the second preset times is greater than the first preset times;
performing image convolution and normalization on the fourth image feature, and splicing the obtained normalization result with the third image feature;
and performing image convolution on the splicing result to output a prediction image.
In this example implementation, referring to fig. 10, fig. 10 schematically illustrates an architecture diagram of image restoration of a target noise image to output a predicted image according to an embodiment of the present disclosure, including a target noise image 1001, image convolution 1002, nonlinear activation 1003, residual window 1 (1004), residual window 2 (1005), residual window 3 (1006), residual window 4 (1007), residual window 5 (1008), residual window 6 (1009), residual window 7 (1010), residual window 8 (1011), residual window 9 (1012), residual window 10 (1013), residual window 11 (1014), residual window 12 (1015), residual window 13 (1016), residual window 14 (1017), residual window 15 (1018), residual window 16 (1019), image convolution 1020, feature normalization 1021, image convolution 1022, and a predicted image 1023.
Specifically, the input target noise image 1001 may be convolved by the image convolution 1002 and the convolution result may be nonlinearly activated by the nonlinear activation 1003, resulting in a third image feature. Furthermore, residual calculation may be performed a second preset number of times (e.g., 16 times, i.e., 16 residual windows) to obtain a fourth image feature; specifically, the residual calculation may be performed sequentially through residual window 1 (1004) to residual window 16 (1019). Further, the fourth image feature may be subjected to image convolution by the image convolution 1020 and the convolution result may be normalized by the feature normalization 1021; the obtained normalization result may be spliced with the third image feature, and the spliced result may be subjected to image convolution by the image convolution 1022 to output the predicted image 1023. In this way, an end-to-end model can be formed, and image restoration can be completed in one pass; the number of passes may also be multiple, which is not limited in the embodiments of the disclosure.
The convolution kernels of the image convolution 1002, the image convolution 1020 and the image convolution 1022 may be the same or different, the numbers of channels of the image convolution 1002, the image convolution 1020 and the image convolution 1022 may be the same or different, and the embodiment of the present disclosure is not limited thereto.
In addition, please refer to fig. 9 for an execution manner of each residual window, which is not described herein again.
Therefore, by implementing the optional implementation mode, the accuracy of image restoration can be improved by increasing the number of residual error networks, and further, the image restoration effect can be improved by splicing the nonlinear activation result and the normalization result.
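The restoration pipeline of fig. 10 (convolution and activation, a stack of residual windows, convolution and normalization, splicing with the activated feature, and an output convolution) can be sketched in numpy. This is a hedged sketch, not the disclosed implementation: 1x1 convolutions stand in for the image convolutions for brevity, random weights replace trained parameters, and the assumption (following the layout of fig. 10) is that the residual stack receives the activated third image feature.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    """1x1 convolution as a per-pixel channel mix (stand-in for image convolution)."""
    return x @ w

def relu(x):
    return np.maximum(x, 0.0)

def normalize(x, eps=1e-5):
    """Simple feature normalization over the channel axis."""
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def residual_window(x, w1, w2):
    """conv -> norm -> relu -> conv -> norm, combined with a skip connection."""
    y = relu(normalize(conv1x1(x, w1)))
    y = normalize(conv1x1(y, w2))
    return x + y

def restore(target_noise_image, n_windows=16, channels=8):
    c_in = target_noise_image.shape[-1]
    w_in = rng.normal(size=(c_in, channels))
    third = relu(conv1x1(target_noise_image, w_in))   # third image feature
    feat = third
    for _ in range(n_windows):                        # second preset number of times
        w1 = rng.normal(size=(channels, channels))
        w2 = rng.normal(size=(channels, channels))
        feat = residual_window(feat, w1, w2)          # fourth image feature
    w_mid = rng.normal(size=(channels, channels))
    normed = normalize(conv1x1(feat, w_mid))
    spliced = np.concatenate([normed, third], axis=-1)  # splice with third feature
    w_out = rng.normal(size=(2 * channels, c_in))
    return conv1x1(spliced, w_out)                    # predicted image

img = rng.normal(size=(4, 4, 3))
print(restore(img).shape)  # (4, 4, 3)
```

The predicted image keeps the spatial size and channel count of the input target noise image, which is what allows the second loss function against the original image to be computed pixel by pixel.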
In step S340, a second loss function of the predicted image and the original image is calculated, and a second network parameter of the image processing network is adjusted according to the second loss function.
In this exemplary embodiment, the second loss function is a mean square error of the predicted image and the original image.
In this example embodiment, the expression of the MSE is:

MSE = (1/n) · Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²

In the second loss function, n is a positive integer, yᵢ is the value of the i-th pixel point corresponding to the original image, and ŷᵢ is the value of the i-th pixel point corresponding to the predicted image.
In addition, optionally, the first loss function and the second loss function may also be Root Mean Square Error (RMSE), Mean Absolute Error (MAE), or Standard Deviation (SD), and the embodiments of the disclosure are not limited thereto. The RMSE is used for measuring the deviation between a predicted value and a true value and is often used as a standard for measuring a machine learning model prediction result; MAE is the average value of absolute errors and is used for reflecting the actual situation of predicted value errors; SD is the arithmetic square root of the variance, which measures the degree of dispersion of a set of values.
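The four candidate loss functions named above can be sketched directly in numpy; the function names are illustrative only.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error: (1/n) * sum((y_i - y_hat_i)^2)."""
    return float(np.mean((y_true - y_pred) ** 2))

def rmse(y_true, y_pred):
    """Root mean square error: deviation between predicted and true values."""
    return float(np.sqrt(mse(y_true, y_pred)))

def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute errors."""
    return float(np.mean(np.abs(y_true - y_pred)))

def sd(values):
    """Standard deviation: arithmetic square root of the variance."""
    return float(np.std(values))

y = np.array([1.0, 2.0, 3.0, 4.0])   # e.g. original-image pixel values
p = np.array([1.0, 2.0, 3.0, 2.0])   # e.g. predicted-image pixel values
print(mse(y, p))   # 1.0
print(rmse(y, p))  # 1.0
print(mae(y, p))   # 0.5
```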
Therefore, the training method of the image processing network shown in fig. 3 can overcome the problem of low training efficiency of the image processing network to a certain extent, and the image processing network is trained through the noise sample to reduce the training difficulty, so that the training efficiency of the image processing network is improved; the data preparation efficiency before image restoration training can be improved by generating a noise image and adjusting the first network parameter according to the noise image, and the dependence degree of the image processing network on a clean sample can be reduced by training the image processing network through the noise image; and the image restoration effect can be improved through an image processing network obtained by training a noise sample.
Referring to fig. 11, fig. 11 schematically shows an architecture diagram of an image processing network according to an embodiment of the present disclosure. As shown in fig. 11, the image processing network includes watermark noise 1101, a transparency parameter α 1102, reference noise 1103, an original image 1104, standby noise 1105, a noise image 1106, predicted noise 1107, a noise extraction module 1108, an image restoration module 1109, and a predicted image 1110.
Specifically, the image to be synthesized can be obtained by evenly distributing the watermark noise 1101 in a canvas with the same size as the original image, and the reference noise 1103 can be obtained by superposing the image to be synthesized and random noise. The transparency parameter α 1102 of the reference noise can be adjusted to be within a preset transparency range, and a noise image 1106 can be generated according to the adjusted reference noise 1103 and the original image 1104. The watermark noise 1101 and the random noise can be superposed to obtain the standby noise 1105, and the noise image 1106 and the standby noise 1105 can be spliced to obtain a target noise image; the target noise image is subjected to noise extraction through the noise extraction module 1108 to obtain the predicted noise 1107. A first loss function of the predicted noise 1107 and the reference noise 1103 can be calculated, and a first network parameter of the image processing network can be adjusted according to the first loss function. Further, the predicted noise 1107 and the noise image 1106 can be spliced to obtain a target noise image, the target noise image can be subjected to image restoration through the image restoration module 1109, and the predicted image 1110 can be output. A second loss function of the predicted image 1110 and the original image 1104 can be calculated, and a second network parameter of the image processing network can be adjusted according to the second loss function, so that image denoising can be completed through the trained image processing network.
Therefore, the architecture diagram of the image processing network shown in fig. 11 can overcome the problem of low training efficiency of the image processing network to a certain extent, and the image processing network is trained through the noise sample to reduce the training difficulty, so as to improve the training efficiency of the image processing network; the data preparation efficiency before image restoration training can be improved by generating a noise image and adjusting the first network parameter according to the noise image, and the dependence degree of the image processing network on a clean sample can be reduced by training the image processing network through the noise image; and the image restoration effect can be improved through an image processing network obtained by training a noise sample.
Referring to fig. 12, fig. 12 schematically illustrates a flowchart of an image denoising method according to an embodiment of the present disclosure. The image denoising method may be applied to the server 105, and may also be applied to one or more of the terminal devices 101, 102, and 103, which is not particularly limited in this exemplary embodiment. As shown in fig. 12, the image denoising method may include steps S1210 to S1250, in which:
Step S1210: and generating a noise image according to the reference noise and the sample image, and performing noise extraction on the noise image to obtain predicted noise.
Step S1220: a first loss function of the predicted noise and the reference noise is calculated, and a first network parameter of the image processing network is adjusted according to the first loss function.
Step S1230: and splicing the predicted noise and the noise image to obtain a target noise image, performing image restoration on the target noise image, and outputting a predicted image.
Step S1240: and calculating a second loss function of the predicted image and the sample image, and adjusting a second network parameter of the image processing network according to the second loss function.
Step S1250: and carrying out image processing according to the image processing network of the adjusted first network parameter and second network parameter.
Wherein the sample image can be understood as the original image in fig. 3. Please refer to the embodiment in fig. 3 for the detailed implementation of steps S1210 to S1250, which is not described herein again.
By the image denoising method shown in fig. 12, the problem of low training efficiency of the image processing network can be overcome to a certain extent, and the image processing network is trained through the noise sample to reduce the training difficulty and further improve the training efficiency of the image processing network; the data preparation efficiency before image restoration training can be improved by generating a noise image and adjusting the first network parameter according to the noise image, and the dependence degree of the image processing network on a clean sample can be reduced by training the image processing network through the noise image; and the image restoration effect can be improved through an image processing network obtained by training a noise sample.
Further, in the present exemplary embodiment, a training apparatus for an image processing network is also provided. The training device of the image processing network can be applied to a server or a terminal device. Referring to fig. 13, the training apparatus 1300 of the image processing network may include a noise extraction unit 1301, a parameter adjustment unit 1302, and an image restoration unit 1303, wherein:
a noise extraction unit 1301, configured to generate a noise image according to the reference noise and the original image, and perform noise extraction on the noise image to obtain predicted noise;
a parameter adjusting unit 1302, configured to calculate a first loss function of the predicted noise and the reference noise, and adjust a first network parameter of the image processing network according to the first loss function;
an image restoration unit 1303, configured to splice the predicted noise and the noise image to obtain a target noise image, perform image restoration on the target noise image, and output a predicted image;
the parameter adjusting unit 1302 is further configured to calculate a second loss function between the predicted image and the original image, and adjust a second network parameter of the image processing network according to the second loss function.
The first loss function is the mean square error of the prediction noise and the reference noise, and the second loss function is the mean square error of the prediction image and the original image.
It can be seen that the training device implementing the image processing network shown in fig. 13 can overcome the problem of low training efficiency of the image processing network to a certain extent, and train the image processing network through the noise sample to reduce the training difficulty, thereby improving the training efficiency of the image processing network; the data preparation efficiency before image restoration training can be improved by generating a noise image and adjusting the first network parameter according to the noise image, and the dependence degree of the image processing network on a clean sample can be reduced by training the image processing network through the noise image; and the image restoration effect can be improved through an image processing network obtained by training a noise sample.
In an exemplary embodiment of the present disclosure, the training apparatus of the image processing network may further include a noise superposition unit (not shown), wherein:
and the noise superposition unit is used for superposing the original noise and the random noise to obtain reference noise.
The random noise comprises at least one of Poisson noise, multiplicative Bernoulli noise, random-value impulse noise, image text watermark noise, Monte Carlo noise and text noise.
In an exemplary embodiment of the present disclosure, the original noise is watermark noise, and the manner in which the noise superposition unit superposes the original noise and the random noise to obtain the reference noise may specifically be:
the noise superposition unit evenly distributes the watermark noise in a canvas with the same size as the original image to obtain an image to be synthesized;
the noise superposition unit superposes the image to be synthesized with the random noise to obtain the reference noise.
Therefore, by implementing the exemplary embodiment, the reference noise superimposed in the original image can be generated to reduce the dependence of the image processing network on a clean image sample, and further, the sample size of the image processing network is reduced under the condition of ensuring the image denoising effect, so that the training efficiency of the image processing network is improved.
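The reference-noise construction described above (evenly tiling a watermark patch over a canvas the size of the original image, then superposing random noise) can be sketched as follows. This is a hedged sketch: Gaussian noise stands in for the random-noise term (the disclosure lists Poisson, multiplicative Bernoulli, and other types), and the function name `make_reference_noise` is assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reference_noise(watermark, canvas_shape, noise_scale=0.1):
    """Tile a watermark patch evenly over the canvas (image to be synthesized),
    then superpose random noise to obtain the reference noise."""
    h, w = canvas_shape
    ph, pw = watermark.shape
    reps = (-(-h // ph), -(-w // pw))           # ceil division for full coverage
    canvas = np.tile(watermark, reps)[:h, :w]   # image to be synthesized
    random_noise = rng.normal(scale=noise_scale, size=(h, w))  # Gaussian stand-in
    return canvas + random_noise

patch = np.array([[0.0, 1.0], [1.0, 0.0]])      # toy 2x2 watermark patch
ref = make_reference_noise(patch, canvas_shape=(6, 6))
print(ref.shape)  # (6, 6)
```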
In an exemplary embodiment of the disclosure, the noise extraction unit performs noise extraction on the noise image, and the manner of obtaining the predicted noise may specifically be:
the noise extraction unit performs image convolution and nonlinear activation on the noise image to obtain a first image characteristic;
the noise extraction unit carries out residual calculation on the first image characteristic for a first preset number of times to obtain a second image characteristic; wherein the first preset times are equal to the preset number;
and the noise extraction unit performs image convolution and normalization on the second image characteristic to obtain predicted noise.
In the process in which the noise extraction unit performs residual calculation on the first image feature a first preset number of times to obtain the second image feature, the manner of performing one residual calculation on the first image feature may specifically be:
the noise extraction unit inputs the first image characteristic into a current residual error window, and performs image convolution on the first image characteristic through a first convolution layer in the current residual error window to obtain a first convolution result;
the noise extraction unit normalizes the first convolution result and nonlinearly activates the normalized result;
the noise extraction unit performs image convolution on the nonlinear activation result through a second convolution layer in the current residual error window to obtain a second convolution result;
the noise extraction unit normalizes the second convolution result to obtain the output of the current residual window;
the noise extraction unit combines the output of the current residual window with the output of the previous residual window as the input of the next residual window.
Therefore, by implementing the exemplary embodiment, noise extraction can be performed on the noise image through the residual error network, so that the probability of gradient explosion can be reduced, and the training efficiency of the image processing network is improved.
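The single residual-window steps above (first convolution, normalization, non-linear activation, second convolution, normalization, then combination with the previous window's output) can be sketched as follows. This is a hedged sketch: identity functions stand in for the two trained convolution layers, the skip combination is assumed to be an element-wise addition, and all names are illustrative.

```python
import numpy as np

def normalize(x, eps=1e-5):
    """Simple feature normalization (stand-in for the disclosed normalization)."""
    return (x - x.mean()) / (x.std() + eps)

def residual_window(prev_output, conv1, conv2):
    """One residual window: conv -> norm -> relu -> conv -> norm,
    then combine with the previous window's output via a skip connection."""
    y = normalize(conv1(prev_output))   # first convolution result, normalized
    y = np.maximum(y, 0.0)              # non-linear activation
    y = normalize(conv2(y))             # second convolution result, normalized
    return prev_output + y              # serves as the input of the next window

# Identity "convolutions" keep the sketch self-contained (assumption).
feat = np.ones((4, 4))
out = residual_window(feat, conv1=lambda x: x, conv2=lambda x: x)
print(out.shape)  # (4, 4)
```

Because the skip connection passes the previous output through unchanged, the gradient always has a direct path back to earlier windows, which is what reduces the probability of gradient explosion mentioned above.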
In an exemplary embodiment of the present disclosure, the image restoration unit performs image restoration on the target noise image, and the manner of outputting the predicted image may specifically be:
the image restoration unit performs convolution and nonlinear activation on the target noise image to obtain a third image characteristic;
the image restoration unit carries out residual calculation on the target noise image for a second preset number of times to obtain a fourth image characteristic; wherein the second preset times is greater than the first preset times;
the image restoration unit performs image convolution and normalization on the fourth image characteristic, and splices the obtained normalization result with the third image characteristic;
and the image restoration unit performs image convolution on the splicing result to output a prediction image.
Therefore, by implementing the exemplary embodiment, the accuracy of image restoration can be improved by increasing the number of residual error networks, and further, the image restoration effect can be improved by splicing the nonlinear activation result and the normalization result.
In an exemplary embodiment of the disclosure, the noise extraction unit performs noise extraction on the noise image, and the manner of obtaining the predicted noise may specifically be:
the noise extraction unit superposes the original noise and the random noise to obtain standby noise;
the noise extraction unit splices the noise image and the standby noise to obtain a target noise image;
the noise extraction unit extracts noise from the target noise image to obtain predicted noise.
Therefore, by implementing the exemplary embodiment, the target noise image can be obtained by splicing the noise image and the standby noise, and the noise extraction effect can be improved by training the noise extraction module in the image processing network through the target noise image.
In an exemplary embodiment of the disclosure, the way in which the noise extraction unit generates the noise image according to the reference noise and the original image may specifically be:
the noise extraction unit adjusts the transparency parameter of the reference noise to be within a preset transparency range, and generates a noise image according to the adjusted reference noise and the original image.
Therefore, by implementing the exemplary embodiment, the noise transparency can be adjusted to generate the noise image according to the noise with different transparencies, so as to improve the robustness of the image processing network.
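The transparency adjustment above can be sketched as an alpha blend of the reference noise into the original image. This is an assumption about how the noise image is generated from the adjusted reference noise; the preset transparency range and the function name are illustrative.

```python
import numpy as np

def generate_noise_image(original, reference_noise, alpha, alpha_range=(0.1, 0.9)):
    """Clamp the transparency parameter into the preset range, then blend the
    reference noise into the original image (alpha blending is an assumption)."""
    lo, hi = alpha_range
    alpha = min(max(alpha, lo), hi)     # adjust transparency into the range
    return (1.0 - alpha) * original + alpha * reference_noise

original = np.full((2, 2), 100.0)       # toy "original image"
noise = np.full((2, 2), 0.0)            # toy "reference noise"
img = generate_noise_image(original, noise, alpha=0.5)
print(img[0, 0])  # 50.0
```

Varying alpha across training samples yields noise images with different noise strengths, which is the mechanism the text credits for improving the robustness of the image processing network.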
In an exemplary embodiment of the disclosure, the image restoration unit splices the predicted noise and the noise image, and the manner of obtaining the target noise image may specifically be:
and the image restoration unit splices the character string corresponding to the predicted noise and the character string corresponding to the noise image to obtain the target noise image.
Therefore, by implementing the exemplary embodiment, the learning efficiency of the image processing network can be improved by splicing the noise image and the prediction noise, and the denoising effect of the image is improved.
For details that are not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the training method of the image processing network described above for the details that are not disclosed in the embodiments of the apparatus of the present disclosure.
Furthermore, in the present exemplary embodiment, an image denoising apparatus is also provided. The image denoising device can be applied to a server or a terminal device. Referring to fig. 14, the image denoising apparatus 1400 may include a noise extraction unit 1401, a parameter adjustment unit 1402, an image inpainting unit 1403, and an image denoising unit 1404, wherein:
a noise extraction unit 1401 configured to generate a noise image according to the reference noise and the sample image, and perform noise extraction on the noise image to obtain predicted noise;
a parameter adjusting unit 1402, configured to calculate a first loss function of the predicted noise and the reference noise, and adjust a first network parameter of the image processing network according to the first loss function;
an image restoration unit 1403, configured to splice the prediction noise and the noise image to obtain a target noise image, perform image restoration on the target noise image, and output a prediction image;
the parameter adjusting unit 1402 is further configured to calculate a second loss function of the predicted image and the sample image, and adjust a second network parameter of the image processing network according to the second loss function;
and an image denoising unit 1404, configured to perform image processing according to the adjusted image processing network of the first network parameter and the second network parameter.
Therefore, the image denoising device shown in fig. 14 can overcome the problem of low training efficiency of the image processing network to a certain extent, and the image processing network is trained through the noise sample to reduce the training difficulty, so as to improve the training efficiency of the image processing network; the data preparation efficiency before image restoration training can be improved by generating a noise image and adjusting the first network parameter according to the noise image, and the dependence degree of the image processing network on a clean sample can be reduced by training the image processing network through the noise image; and the image restoration effect can be improved through an image processing network obtained by training a noise sample.
Since each functional module of the image denoising device in the exemplary embodiment of the present disclosure corresponds to the steps of the exemplary embodiment of the image denoising method, please refer to the embodiment of the image denoising method in the present disclosure for details that are not disclosed in the embodiment of the present disclosure.
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by an electronic device, cause the electronic device to implement the method described in the above embodiments.
It should be noted that the computer readable media shown in the present disclosure may be computer readable signal media or computer readable storage media or any combination of the two. The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. Wherein the names of the elements do not in some way constitute a limitation on the elements themselves.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (15)

1. A method of training an image processing network, comprising:
generating a noise image according to the reference noise and the original image, and performing noise extraction on the noise image to obtain predicted noise;
calculating a first loss function of the predicted noise and the reference noise, and adjusting a first network parameter of an image processing network according to the first loss function;
splicing the predicted noise and the noise image to obtain a target noise image, performing image restoration on the target noise image, and outputting a predicted image;
and calculating a second loss function of the predicted image and the original image, and adjusting a second network parameter of the image processing network according to the second loss function.
2. The method of claim 1, further comprising:
and overlapping the original noise and the random noise to obtain the reference noise.
3. The method of claim 2, wherein the original noise is watermark noise, and the step of superposing the original noise and random noise to obtain the reference noise comprises:
distributing the watermark noise in the canvas with the same size as the original image to obtain an image to be synthesized;
and superposing the image to be synthesized and random noise to obtain reference noise.
4. The method of claim 2, wherein the random noise comprises at least one of poisson noise, multiplicative bernoulli noise, random-valued impulse noise, image-text watermark noise, monte carlo noise, and text noise.
5. The method of claim 1, wherein performing noise extraction on the noise image to obtain predicted noise comprises:
performing image convolution and nonlinear activation on the noise image to obtain a first image characteristic;
performing residual calculation on the first image characteristic for a first preset number of times to obtain a second image characteristic; wherein the first preset number of times is equal to the preset number;
and performing image convolution and normalization on the second image characteristics to obtain prediction noise.
6. The method according to claim 5, wherein, in the performing the residual calculation for the first image feature a first preset number of times to obtain the second image feature, the performing the residual calculation for the first image feature once is performed by:
inputting the first image characteristic into a current residual error window, and performing image convolution on the first image characteristic through a first convolution layer in the current residual error window to obtain a first convolution result;
normalizing the first convolution result, and performing nonlinear activation on the normalized result;
performing image convolution on the nonlinear activation result through a second convolution layer in the current residual error window to obtain a second convolution result;
normalizing the second convolution result to obtain the output of the current residual window;
and combining the output of the current residual error window with the output of the previous residual error window to be used as the input of the next residual error window.
7. The method according to claim 5, wherein performing image restoration on the target noise image and outputting a prediction image comprises:
performing convolution and nonlinear activation on the target noise image to obtain a third image characteristic;
performing residual calculation on the target noise image for a second preset number of times to obtain a fourth image characteristic; wherein the second preset times is greater than the first preset times;
performing image convolution and normalization on the fourth image characteristic, and splicing an obtained normalization result with the third image characteristic;
and performing image convolution on the splicing result to output a prediction image.
8. The method of claim 1, wherein performing noise extraction on the noise image to obtain predicted noise comprises:
superposing the original noise and the random noise to obtain standby noise;
splicing the noise image and the standby noise to obtain a target noise image;
and carrying out noise extraction on the target noise image to obtain predicted noise.
9. The method of claim 1, wherein generating a noise image from the reference noise and the original image comprises:
and adjusting the transparency parameter of the reference noise to a preset transparency range, and generating a noise image according to the adjusted reference noise and the original image.
10. The method according to claim 1, wherein the first loss function is a mean square error between the prediction noise and the reference noise, and the second loss function is a mean square error between the prediction image and the original image.
11. The method of claim 1, wherein the predicted noise is spliced with the noise image to obtain a target noise image by:
and splicing the character string corresponding to the predicted noise with the character string corresponding to the noise image to obtain a target noise image.
12. An image denoising method, comprising:
generating a noise image from reference noise and a sample image, and performing noise extraction on the noise image to obtain predicted noise;
calculating a first loss function between the predicted noise and the reference noise, and adjusting a first network parameter of an image processing network according to the first loss function;
splicing the predicted noise with the noise image to obtain a target noise image, performing image restoration on the target noise image, and outputting a predicted image;
calculating a second loss function between the predicted image and the sample image, and adjusting a second network parameter of the image processing network according to the second loss function; and
performing image processing using the image processing network with the adjusted first and second network parameters.
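The two-stage parameter adjustment above can be illustrated with a deliberately tiny stand-in: each "network" is a single scalar gain, noise extraction is approximated by centring the noisy signal, and each loss drives one gradient step. Everything here (signal shapes, learning rate, the scalar parameterisation) is an illustrative assumption, not the patented network.

```python
import numpy as np

rng = np.random.default_rng(0)

# each "network" is one scalar gain; "adjusting the network parameter"
# becomes a single gradient step on that gain
w_extract, w_restore, lr = 0.0, 0.0, 5.0

for _ in range(200):
    original = np.full(64, 0.5)                    # flat sample image
    reference_noise = rng.normal(size=64) * 0.1    # zero-mean reference noise
    noise_image = original + reference_noise       # generate the noise image

    # noise extraction: centre the noisy signal, then scale by the gain
    centred = noise_image - noise_image.mean()
    predicted_noise = w_extract * centred

    # first loss (MSE of predicted vs reference noise) -> adjust first param
    grad_e = 2.0 * np.mean((predicted_noise - reference_noise) * centred)
    w_extract -= lr * grad_e

    # image restoration: subtract a scaled copy of the predicted noise
    predicted_image = noise_image - w_restore * predicted_noise

    # second loss (MSE of predicted vs sample image) -> adjust second param
    grad_r = 2.0 * np.mean((predicted_image - original) * (-predicted_noise))
    w_restore -= lr * grad_r

# both gains converge near 1: the noise is extracted, then subtracted out
```

The point of the two losses is visible even in this toy: the first loss alone teaches extraction, and the second loss alone teaches how strongly the extracted noise should be removed.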
13. An apparatus for training an image processing network, comprising:
a noise extraction unit configured to generate a noise image from reference noise and an original image, and perform noise extraction on the noise image to obtain predicted noise;
a parameter adjusting unit configured to calculate a first loss function between the predicted noise and the reference noise, and adjust a first network parameter of the image processing network according to the first loss function; and
an image restoration unit configured to splice the predicted noise with the noise image to obtain a target noise image, perform image restoration on the target noise image, and output a predicted image;
wherein the parameter adjusting unit is further configured to calculate a second loss function between the predicted image and the original image, and adjust a second network parameter of the image processing network according to the second loss function.
14. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any one of claims 1-12.
15. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1-12 via execution of the executable instructions.
CN201910979627.9A 2019-10-15 2019-10-15 Training method of image processing network, image denoising method and device Pending CN110807741A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910979627.9A CN110807741A (en) 2019-10-15 2019-10-15 Training method of image processing network, image denoising method and device

Publications (1)

Publication Number Publication Date
CN110807741A true CN110807741A (en) 2020-02-18

Family

ID=69488546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910979627.9A Pending CN110807741A (en) 2019-10-15 2019-10-15 Training method of image processing network, image denoising method and device

Country Status (1)

Country Link
CN (1) CN110807741A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111369456A (en) * 2020-02-28 2020-07-03 深圳市商汤科技有限公司 Image denoising method and device, electronic device and storage medium
CN111369456B (en) * 2020-02-28 2021-08-31 深圳市商汤科技有限公司 Image denoising method and device, electronic device and storage medium
CN112419135A (en) * 2020-11-19 2021-02-26 广州华多网络科技有限公司 Watermark recognition online training, sampling and removing method, device, equipment and medium
CN113012068A (en) * 2021-03-16 2021-06-22 深圳壹账通智能科技有限公司 Image denoising method and device, electronic equipment and computer readable storage medium
CN113012068B (en) * 2021-03-16 2023-07-04 深圳壹账通智能科技有限公司 Image denoising method, image denoising device, electronic equipment and computer-readable storage medium
CN116433674A (en) * 2023-06-15 2023-07-14 锋睿领创(珠海)科技有限公司 Semiconductor silicon wafer detection method, device, computer equipment and medium
CN116433674B (en) * 2023-06-15 2023-08-18 锋睿领创(珠海)科技有限公司 Semiconductor silicon wafer detection method, device, computer equipment and medium

Similar Documents

Publication Publication Date Title
Golts et al. Unsupervised single image dehazing using dark channel prior loss
CN110807741A (en) Training method of image processing network, image denoising method and device
CN111292264B (en) Image high dynamic range reconstruction method based on deep learning
CN113240580A (en) Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
Panetta et al. Tmo-net: A parameter-free tone mapping operator using generative adversarial network, and performance benchmarking on large scale hdr dataset
Wang et al. Joint iterative color correction and dehazing for underwater image enhancement
CN110189260B (en) Image noise reduction method based on multi-scale parallel gated neural network
Montulet et al. Deep learning for robust end-to-end tone mapping
CN111681177A (en) Video processing method and device, computer readable storage medium and electronic equipment
CN116468746B (en) Bidirectional copy-paste semi-supervised medical image segmentation method
CN114004766A (en) Underwater image enhancement method, system and equipment
CN115526803A (en) Non-uniform illumination image enhancement method, system, storage medium and device
Sharif et al. Deep color reconstruction for a sparse color sensor
Tao et al. Effective solution for underwater image enhancement
Yin et al. Adams-based hierarchical features fusion network for image dehazing
Kumar et al. Dynamic stochastic resonance and image fusion based model for quality enhancement of dark and hazy images
CN116137023A (en) Low-illumination image enhancement method based on background modeling and detail enhancement
Zhu et al. Learning knowledge representation with meta knowledge distillation for single image super-resolution
Li et al. Multi-scale fusion framework via retinex and transmittance optimization for underwater image enhancement
Li et al. A review of image colourisation
CN114049290A (en) Image processing method, device, equipment and storage medium
Lang et al. Effective enhancement method of low-light-level images based on the guided filter and multi-scale fusion
Sun et al. Explore unsupervised exposure correction via illumination component divided guidance
EP4248365A1 (en) Gating of contextual attention and convolutional features
Bi et al. Non-uniform illumination underwater image enhancement via events and frame fusion

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40022082

Country of ref document: HK

SE01 Entry into force of request for substantive examination