CN113822289A - Training method, device and equipment of image noise reduction model and storage medium - Google Patents
- Publication number
- CN113822289A (application CN202110662101.5A)
- Authority
- CN
- China
- Prior art keywords
- noise
- image
- sample image
- loss
- noisy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Image Processing (AREA)
Abstract
The embodiments of the present application provide a training method, apparatus, device and storage medium for an image noise reduction model, relating to the technical fields of image processing and machine learning. The method includes: acquiring at least one noisy sample image and at least one noise-free sample image; adding the noise features of the noisy sample image to the noise-free sample image to generate a noise-added image corresponding to the noise-free sample image; performing noise reduction processing on the noise-added image with the image noise reduction model to generate a noise-reduced image corresponding to the noise-added image; calculating a training loss of the image noise reduction model based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image; and training the image noise reduction model according to the training loss. This technical solution can improve the noise reduction performance of the image noise reduction model.
Description
Technical Field
The embodiment of the application relates to the technical field of image processing and machine learning, in particular to a training method, a device, equipment and a storage medium for an image noise reduction model.
Background
Medical imaging techniques are widely used in modern medicine. Sometimes, further processing is required for the acquired original medical image.
In the related art, paired sample images refer to two voxel images captured or generated from the same or a similar angle of the same body part of the same person. To reduce or remove noise in images, a CNN (Convolutional Neural Network) model is used to learn the mapping between paired noisy and noise-free images; the trained CNN model is then used to perform noise reduction on noisy images.
In the related art, since it is difficult to acquire a sufficient number of paired image samples to train the CNN model in clinical medicine, the model learning effect is poor and the noise reduction performance is poor.
Disclosure of Invention
The embodiment of the application provides a training method, a training device, equipment and a storage medium of an image noise reduction model, and the noise reduction performance of the image noise reduction model can be improved. The technical scheme is as follows:
according to an aspect of an embodiment of the present application, there is provided a method for training an image noise reduction model, the method including:
acquiring at least one noisy sample image and at least one noise-free sample image;
adding the noise characteristics of the noisy sample image into the noiseless sample image to generate a noisy image corresponding to the noiseless sample image;
carrying out noise reduction processing on the noise-added image by adopting the image noise reduction model to generate a noise-reduced image corresponding to the noise-added image;
calculating the training loss of the image noise reduction model based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image;
and training the image noise reduction model according to the training loss.
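The five steps above can be sketched end to end with deliberately crude stand-ins. In this sketch (an illustration only, not the networks disclosed in this application), the noise feature is approximated as the residual after a box blur, and the same box blur also stands in for the image noise reduction model; all names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def box_blur(img, k=3):
    # Mean filter; stands in both for crude noise estimation and for the
    # image noise reduction model in this toy sketch.
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

# Step 1: unpaired samples -- a noisy image and an unrelated noise-free image.
content_a = rng.random((16, 16))
noisy_sample = content_a + 0.2 * rng.normal(size=(16, 16))
clean_sample = rng.random((16, 16))          # noise-free sample image

# Step 2: extract a crude noise feature (residual after smoothing), transfer it.
noise_feature = noisy_sample - box_blur(noisy_sample)
noised_image = clean_sample + noise_feature  # now paired with clean_sample

# Steps 3-4: denoise the synthetic noisy image and compute the training loss.
denoised = box_blur(noised_image)
train_loss = float(np.mean((denoised - clean_sample) ** 2))
```

The point of the sketch is the pairing: `clean_sample` and `noised_image` form a synthetic pair even though the two source images were unpaired, so a supervised loss against `clean_sample` becomes possible.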
According to an aspect of an embodiment of the present application, there is provided an apparatus for training an image noise reduction model, the apparatus including:
the image acquisition module is used for acquiring at least one noisy sample image and at least one noiseless sample image;
a feature adding module, configured to add a noise feature of the noisy sample image to the noiseless sample image, and generate a noisy image corresponding to the noiseless sample image;
the image generation module is used for carrying out noise reduction processing on the noise-added image by adopting the image noise reduction model to generate a noise-reduced image corresponding to the noise-added image;
the loss calculation module is used for calculating the training loss of the image noise reduction model based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image;
and the model training module is used for training the image noise reduction model according to the training loss.
According to an aspect of embodiments of the present application, there is provided a computer device, the computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the above-mentioned training method of an image noise reduction model.
According to an aspect of embodiments of the present application, there is provided a computer-readable storage medium having at least one instruction, at least one program, a set of codes, or a set of instructions stored therein, which is loaded and executed by a processor to implement the above-mentioned training method for an image noise reduction model.
According to an aspect of embodiments herein, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the training method of the image noise reduction model.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
the noise characteristics of the sample images with noise are extracted and added into the sample images without noise, noise adding images corresponding to the sample images without noise are generated, so that the sample images without noise and the noise adding images corresponding to the sample images without noise become paired images in a model training process, then an image noise reduction model is adopted to reduce the noise of the noise adding images to generate noise reduction images corresponding to the noise adding images, and training loss is calculated based on the sample images without noise and the noise reduction images corresponding to the noise adding images to train the image noise reduction model. The training method of the image noise reduction model provided by the embodiment of the application can directly train the image noise reduction model by using unpaired sample images, does not need paired sample images, and is sufficient in quantity and easy to obtain unpaired real sample images, so that the training effect of the image noise reduction model can be improved, and the noise reduction performance of the trained image noise reduction model is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application; those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a flow diagram of a method of training an image noise reduction model provided by one embodiment of the present application;
FIG. 2 is a schematic diagram of a training system for an image denoising model according to an embodiment of the present application;
FIG. 3 is a flow chart of a method of training an image noise reduction model provided by another embodiment of the present application;
FIG. 4 is a schematic diagram of a noise extraction network provided by one embodiment of the present application;
FIG. 5 is a schematic diagram of a training system for an image denoising model according to another embodiment of the present application;
FIG. 6 is a flow diagram of a method of training an image noise reduction model provided by another embodiment of the present application;
FIG. 7 is a block diagram of a training apparatus for an image denoising model according to an embodiment of the present application;
FIG. 8 is a block diagram of an image denoising model training apparatus according to another embodiment of the present application;
FIG. 9 is a block diagram of a computer device provided by one embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of methods consistent with aspects of the present application, as detailed in the appended claims.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer Vision technology (CV) is a science that studies how to make machines "see": using cameras and computers in place of human eyes to perform machine vision tasks such as identification, tracking and measurement on a target, and performing further image processing so that the computer produces images more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image segmentation, image recognition, image semantic understanding, image retrieval, OCR (Optical Character Recognition), video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D (three-dimensional) technology, virtual reality, augmented reality, and simultaneous localization and mapping, and also include common biometric technologies such as face recognition and fingerprint recognition.
Machine Learning (ML) is a multi-domain interdisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It specializes in studying how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its performance. Machine learning is the core of artificial intelligence, the fundamental way to make computers intelligent, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Optionally, various pre-stored data referred to in this application may be saved on a blockchain, such as image data of noisy sample images, image data of noise-free sample images, parameters of the image noise reduction model, and so forth.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The solution provided by the embodiments of the present application relates to computer vision and machine learning technology: an image noise reduction model is obtained by training with computer vision and machine learning techniques, and the image noise reduction model is then used to perform noise reduction processing on noisy images, as explained in the following embodiments.
In the method provided by the embodiments of the present application, the execution subject of each step may be a computer device, which refers to an electronic device with data computation, processing and storage capabilities. The computer device may be a terminal such as a PC (Personal Computer), tablet, smartphone, wearable device, or smart robot; or it may be a server. The server may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing cloud computing services.
The technical solution of the present application will be described below by means of several embodiments.
Referring to fig. 1, a flowchart of a method for training an image noise reduction model according to an embodiment of the present application is shown. In the present embodiment, the method is exemplified as being applied to the computer apparatus described above. The method comprises the following steps (101-105):
Step 101, acquiring at least one noisy sample image and at least one noise-free sample image.
The image referred to in the embodiments of the present application may be an image of biological or non-biological internal tissue, not directly visible to the human eye, captured by non-invasive means. In the biomedical field, the image may be a biological image (e.g., a medical image): an image of the internal tissues of a living being or part of a living being (e.g., a human body or a part of a human body), obtained non-invasively for medical treatment or medical research. In one example, for the medical field, the image may be an image of organs of the human body such as the heart, lungs, liver, stomach, large intestine, small intestine, brain, bones, or blood vessels; or an image of non-organ structures such as tumors. The image may be generated by an imaging technique such as X-ray, Computed Tomography (CT), Positron Emission Tomography (PET), Nuclear Magnetic Resonance Imaging (NMRI), or medical ultrasound. In addition, the sample image in the embodiments of the present application may also be a what-you-see-is-what-you-get image generated by a visual presentation technology, such as an image captured by a camera (e.g., the camera of a standalone camera or of a terminal).
In this embodiment, the noisy sample image and the noise-free sample image used in each training round of the image noise reduction model may not be paired sample images. For example, the noisy sample image and the noise-free sample image may be images of the same part of different human bodies, images of different parts of the same human body, or images of different parts of different human bodies. In one example, as shown in FIG. 2, in a certain training round of the image noise reduction model, the noisy sample image 21 and the noise-free sample image 22 are images corresponding to different organs.
In some embodiments, due to shooting condition limitations or other reasons, a captured image may contain unneeded image content that negatively affects the analysis and understanding of the image, i.e., image noise; such an image is a noisy image. For example, artifacts caused by metal interference (a form of image noise) often occur in medical images and may affect the doctor's judgment of the patient's health condition. An image without image noise, or whose image noise does not affect or only slightly affects the expression of the image content, may be called a noise-free image. There is therefore a need to reduce or remove image noise in noisy images so as to restore as much of the true image content as possible; this process may be referred to as noise reduction (or denoising) of a noisy image.
Step 102, adding the noise features of the noisy sample image to the noise-free sample image to generate a noise-added image corresponding to the noise-free sample image.
In some embodiments, noise features of the image noise in the noisy sample image are extracted and added to the noise-free sample image, so that the generated noise-added image contains the same or similar noise features as the noisy sample image. In this way, the training system of the image noise reduction model generates an image paired with the noise-free sample image, namely the noise-added image.
Step 103, performing noise reduction processing on the noise-added image with the image noise reduction model to generate a noise-reduced image corresponding to the noise-added image.
In some embodiments, the image noise reduction model is used to perform noise reduction processing on the noise-added image, so as to reduce or remove image noise in the noise-added image, thereby generating a noise-reduced image corresponding to the noise-added image.
Step 104, calculating the training loss of the image noise reduction model based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image.
The loss calculated based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image can be used to measure the training effect of the image noise reduction model.
Step 105, training the image noise reduction model according to the training loss.
Optionally, based on the training loss of the image noise reduction model, adjusting the model parameters of the image noise reduction model until the image noise reduction model or the training system of the image noise reduction model reaches the training stop condition, so as to obtain the trained image noise reduction model. In some embodiments, the model parameters of other learning models in the training system of the image noise reduction model are also continuously adjusted according to the training loss before the training stop condition is reached.
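The adjust-parameters-until-a-stop-condition loop of step 105 can be illustrated with a deliberately tiny stand-in: a one-parameter linear "model" trained by gradient descent on an MSE training loss. This is only a sketch of the loop structure, not the model disclosed in this application:

```python
import numpy as np

rng = np.random.default_rng(6)

# Noise-free target and its synthetic noisy counterpart.
clean = rng.random(64)
noised = clean + 0.1 * rng.normal(size=64)

w = 0.0          # single model parameter: denoised = w * noised
lr = 0.1
for step in range(200):
    grad = 2.0 * np.mean((w * noised - clean) * noised)  # dMSE/dw
    w -= lr * grad                                       # adjust parameters
    if abs(grad) < 1e-6:                                 # training stop condition
        break

final_loss = float(np.mean((w * noised - clean) ** 2))
```

After convergence `w` is close to 1, so the "model" nearly reproduces the clean signal; a real image noise reduction model adjusts millions of parameters the same way, just with a richer loss.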
In summary, in the technical solution provided by the embodiments of the present application, the noise features of the noisy sample image are extracted and added to the noise-free sample image to generate a noise-added image corresponding to the noise-free sample image, so that the noise-free sample image and its corresponding noise-added image become paired images in the model training process. The image noise reduction model then performs noise reduction on the noise-added image to generate a noise-reduced image, and the training loss is calculated based on the noise-free sample image and the noise-reduced image to train the image noise reduction model. The training method provided by the embodiments of the present application can train the image noise reduction model directly with unpaired sample images, without requiring paired sample images; since unpaired real sample images are plentiful and easy to obtain, the training effect of the image noise reduction model can be improved, and the noise reduction performance of the trained model improved accordingly.
In some embodiments, as shown in FIG. 2, the training system 200 for the image noise reduction model includes: an image noise reduction model 210, an image generation model 220, and a noise extraction network 230; the image noise reduction model 210 includes a first encoding network 211 and a first decoding network 212, and the image generation model 220 includes a second encoding network 221 and a second decoding network 222.
Referring to fig. 3, a flowchart of a method for training an image noise reduction model according to another embodiment of the present application is shown. In this embodiment, the method is applied to the training system 200 of the image noise reduction model in the embodiment of fig. 2. Optionally, the training system 200 of the image noise reduction model is provided in the computer device described above. The method comprises the following steps (301-308):
Step 301, acquiring at least one noisy sample image and at least one noise-free sample image.
The content of step 301 is the same as or similar to the content of step 101 in the embodiment of fig. 1, and is not described herein again.
Step 302, extracting a second feature map of the noisy sample image through the second coding network.
In some embodiments, the noisy sample image is input to a second coding network, and a second feature map of the noisy sample image is extracted. Since the training target of the second coding network does not include the noise reduction processing on the image input thereto, the second feature map of the generated noisy sample image still retains the same or similar noise features as the noisy sample image (of course, the noise features in the second feature map of the noisy sample image may not be in the same form as the noise features in the noisy sample image).
In one example, this step 302 can be embodied as the following formula one:
the formula I is as follows: f. ofa=Ea(xa)∈RH×W×C
Wherein f isaSecond feature map representing noisy sample image, EaRepresenting a second coding network, xaRepresenting the noisy sample image, H, W, C representing the length, width, and height, respectively, of the second feature map of the noisy sample image, and the vectors in the second feature map of the noisy sample image are all real vectors.
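Formula one can be illustrated with a toy encoder. The application does not specify the architecture of $E_a$, so the sketch below simply applies $C$ random 3×3 convolution kernels (a hypothetical choice) to produce an $H \times W \times C$ feature map:

```python
import numpy as np

rng = np.random.default_rng(1)

def encode(x, kernels):
    # Toy E_a: "same"-padded 3x3 convolutions, one output channel per kernel.
    H, W = x.shape
    pad = np.pad(x, 1, mode="edge")
    f = np.zeros((H, W, len(kernels)))
    for c, k in enumerate(kernels):
        for i in range(H):
            for j in range(W):
                f[i, j, c] = np.sum(pad[i:i + 3, j:j + 3] * k)
    return f

x_a = rng.random((8, 8))                       # noisy sample image
kernels = [rng.normal(size=(3, 3)) for _ in range(4)]
f_a = encode(x_a, kernels)                     # f_a in R^{H x W x C}
```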
Step 303, decoupling the noise features from the second feature map of the noisy sample image through a noise extraction network.
Optionally, the noise features are separated and extracted from the second feature map of the noisy sample image by a noise extraction network (e.g., a Transformer network). Optionally, the noise features decoupled by the noise extraction network exist in the form of a noise feature map.
In some embodiments, this step 303 further comprises the steps of:
1. constructing an initial noise feature matrix, where the initial noise feature matrix is the unfolded matrix of the second feature map of the noisy sample image;
2. performing multiple rounds of iterative update processing on the noise feature matrix through the noise extraction network to obtain an updated noise feature matrix; where each round of iterative update processing includes at least one of the following: normalization processing, self-attention processing, and linear projection processing;
3. and obtaining the noise characteristics based on the updated noise characteristic matrix.
In this implementation, the second feature map of the noisy sample image is unfolded to obtain its unfolded matrix, which is used as the initial noise feature matrix; the noise feature matrix is then iteratively updated for a specified number of rounds by the noise extraction network to obtain an updated noise feature matrix, and the updated noise feature matrix is converted back into a noise feature map to obtain the noise features of the noisy sample image.
As shown in FIG. 4, the noise extraction network 40 includes a normalization layer 41, a multi-head attention network 42, a normalization layer 43, and a linear projection layer 44.
In one example, the multi-round iterative update processing may refer to the following formulas two to five:

Formula two: $Z_0 = [f_a^1; f_a^2; \ldots; f_a^N] \in \mathbb{R}^{N \times C}$, with $N = H \times W$

Formula three: $Z'_l = \mathrm{MSA}(\mathrm{LN}(Z_{l-1})) + Z_{l-1}, \quad l = 1, \ldots, L$

Formula four: $Z_l = \mathrm{MLP}(\mathrm{LN}(Z'_l)) + Z'_l, \quad l = 1, \ldots, L$

Formula five: $f_{\mathrm{noise}} = \mathrm{Fold}(Z_L) \in \mathbb{R}^{H \times W \times C}$

With reference to FIG. 4: $Z_0$ denotes the initial noise feature matrix; $f_a^n$ denotes the feature vector corresponding to the $n$-th pixel in the second feature map of the noisy sample image; $Z'_l$ denotes the output matrix of the multi-head attention network 42 in the $l$-th iteration; $\mathrm{LN}(Z_{l-1})$ denotes the output matrix of network 41 in the $l$-th iteration; and $Z_l$ denotes the output matrix of the linear projection layer 44 in the $l$-th iteration. Formula five converts the output matrix of the linear projection layer 44 in the $L$-th iteration into the corresponding feature map.
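Formulas two to five describe a standard Transformer-style update. The following minimal NumPy sketch (single-head attention, random weights, purely illustrative) applies $L$ rounds of the LN → attention → residual and LN → MLP → residual updates to an unfolded feature matrix, then folds the result back into a feature map:

```python
import numpy as np

rng = np.random.default_rng(2)
N, C = 16, 8                         # N = H*W tokens, C channels

def layer_norm(Z, eps=1e-5):
    mu = Z.mean(axis=-1, keepdims=True)
    var = Z.var(axis=-1, keepdims=True)
    return (Z - mu) / np.sqrt(var + eps)

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(Z, Wq, Wk, Wv):
    # Single-head stand-in for the multi-head attention network (MSA).
    Q, K, V = Z @ Wq, Z @ Wk, Z @ Wv
    A = softmax(Q @ K.T / np.sqrt(Z.shape[-1]))
    return A @ V

def mlp(Z, W1, W2):
    return np.maximum(Z @ W1, 0.0) @ W2

Z = rng.normal(size=(N, C))          # Z_0: unfolded second feature map
Wq, Wk, Wv = (rng.normal(size=(C, C)) * 0.1 for _ in range(3))
W1 = rng.normal(size=(C, 2 * C)) * 0.1
W2 = rng.normal(size=(2 * C, C)) * 0.1

L = 3
for _ in range(L):
    Z_prime = self_attention(layer_norm(Z), Wq, Wk, Wv) + Z   # formula three
    Z = mlp(layer_norm(Z_prime), W1, W2) + Z_prime            # formula four

noise_map = Z.reshape(4, 4, C)       # formula five: fold back to H x W x C
```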
Step 304, extracting a first feature map of the noise-free sample image through the first coding network.
In some embodiments, the noise-free sample image is input into a first coding network, and a first feature map of the noise-free sample image is extracted. Since the noise-free sample image has no noise feature, the first feature map of the generated noise-free sample image also has no noise feature.
Step 305, generating, by the second decoding network, a noise-added image corresponding to the noise-free sample image according to the first feature map of the noise-free sample image and the noise features.
In some embodiments, the noise-added image corresponding to the noise-free sample image is generated by the second decoding network based on the noise feature map and the first feature map of the noise-free sample image.
In some embodiments, this step 305 further comprises the steps of:
1. fusing the first feature map and the noise features of the noise-free sample image to generate a fused feature map;
2. and decoding the fusion characteristic graph through a second decoding network to generate a noise-added image corresponding to the noise-free sample image.
In this implementation, the first feature map of the noise-free sample image and the noise feature map are fused to generate a fused feature map containing the noise features. Since the training target of the second decoding network does not include denoising its input, the noise features in the noise-added image generated by decoding the fused feature map are not significantly weakened compared with the noise features extracted from the noisy sample image, thereby realizing the noise-adding process for the noise-free sample image.
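The fuse-then-decode steps can be sketched as follows. Channel-wise concatenation is one common fusion choice (the application does not fix the operation; addition is another option), and a 1×1 linear projection stands in for the second decoding network:

```python
import numpy as np

rng = np.random.default_rng(3)
H, W, C = 8, 8, 4

f_clean = rng.random((H, W, C))      # first feature map of the noise-free image
f_noise = rng.random((H, W, C))      # decoupled noise feature map

# Fusion: channel-wise concatenation into an H x W x 2C fused feature map.
fused = np.concatenate([f_clean, f_noise], axis=-1)

# "Second decoding network" stand-in: 1x1 projection back to a single plane.
w = rng.normal(size=(2 * C,)) * 0.1
noised_image = fused @ w             # H x W noise-added image
```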
Step 306, performing noise reduction processing on the noise-added image with the image noise reduction model to generate a noise-reduced image corresponding to the noise-added image.
Optionally, since the training target of the image noise reduction model is to reduce noise of the image with noise, after the image with noise is input into the image noise reduction model, the image output by the image noise reduction model is the noise reduction image corresponding to the image with noise.
In some embodiments, this step 306 further comprises the steps of:
1. extracting a first feature map of the noise-added image through a first coding network;
2. and decoding the first characteristic diagram of the noise-added image through a first decoding network to generate a noise-reduced image corresponding to the noise-added image.
Optionally, a first coding network and a first decoding network in the image noise reduction model are adopted, and in the process of coding and decoding the noise-added image, noise reduction is performed on the noise-added image, so that a noise-reduced image corresponding to the noise-added image is generated.
Step 307, calculating the training loss of the image noise reduction model based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image.
In some embodiments, after the image noise reduction model performs noise reduction processing on the noise-added image and generates a noise-reduced image corresponding to the noise-added image, a training loss may be calculated based on the noise-reduced image to measure a current noise reduction effect of the image noise reduction model (i.e., to determine a noise reduction performance of the current image noise reduction model).
In some possible implementations, the training loss includes a first contrast loss and/or a second contrast loss. The first contrast loss measures the local similarity between the noisy sample image and the noise-reduced image corresponding to the noisy sample image; the second contrast loss measures the local similarity between the noise-free sample image and the noise-added image. It can be understood that both contrast losses measure the similarity between the local features of an initial image and a processed image. That is, the first contrast loss measures how well the local features of the noisy sample image (the initial image) are preserved in its corresponding noise-reduced image: the smaller the first contrast loss, the higher the local similarity between the noisy sample image and its noise-reduced image, and the better the local features of the noisy sample image are preserved. Likewise, the second contrast loss measures how well the local features of the noise-free sample image (the initial image) are preserved in the noise-added image: the smaller the second contrast loss, the higher the local similarity between the noise-free sample image and the noise-added image, and the better the local features of the noise-free sample image are preserved.
In some embodiments, the method further comprises the steps of:
1. under the condition that the training loss comprises a first contrast loss, extracting a first feature map of the noisy sample image through a first coding network;
2. decoding the first characteristic diagram of the noisy sample image through a first decoding network to generate a noise reduction image corresponding to the noisy sample image;
3. extracting a first feature map of a noise-reduced image corresponding to the noisy sample image through a first coding network;
4. calculating a first contrast loss based on the first feature map of the noisy sample image and the first feature map of the noise-reduced image corresponding to the noisy sample image;
In this implementation, if the training loss includes the first contrast loss, the first contrast loss is obtained by extracting the local features in the first feature map of the noisy sample image and in the first feature map of the noise-reduced image corresponding to the noisy sample image, and comparing the two.
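A minimal sketch of steps 1-3 above (encode, decode, re-encode). Toy random linear maps stand in for the learned first coding and decoding networks here; all weights, shapes, and names are illustrative assumptions, not the patent's actual architecture:

```python
import numpy as np

# Hypothetical stand-ins for the first coding/decoding networks. In the
# patent these are learned networks; random linear maps are used purely to
# show the data flow of steps 1-3.
rng = np.random.default_rng(0)
W_enc = rng.standard_normal((8, 4)) * 0.1  # "first coding network" weights
W_dec = rng.standard_normal((4, 8)) * 0.1  # "first decoding network" weights

def encode(image):
    """First coding network: image -> first feature map."""
    return image @ W_enc

def decode(features):
    """First decoding network: first feature map -> image."""
    return features @ W_dec

noisy_sample = rng.standard_normal((16, 8))  # 16 "pixels", 8 channels
feat_noisy = encode(noisy_sample)            # step 1: feature map of noisy sample
denoised = decode(feat_noisy)                # step 2: noise-reduced image
feat_denoised = encode(denoised)             # step 3: re-encode for the contrast loss
```

Both feature maps then feed the first contrast loss, pixel against pixel.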
Optionally, calculating a first contrast loss based on the first feature map of the noisy sample image and the first feature map of the noise-reduced image corresponding to the noisy sample image, including the following steps:
(1) at least one first positive sample pair and at least one first negative sample pair are obtained from the first feature map of the noise-carrying sample image and the first feature map of the noise-reduced image corresponding to the noise-carrying sample image, each first positive sample pair comprises a pair of feature vectors respectively from the same position in the first feature map of the noise-carrying sample image and the first feature map of the noise-reduced image corresponding to the noise-carrying sample image, and each first negative sample pair comprises a pair of feature vectors respectively from different positions in the first feature map of the noise-carrying sample image and the first feature map of the noise-reduced image corresponding to the noise-carrying sample image.
In one example, as shown in fig. 5, in the first feature map 51 of the noisy sample image and the first feature map 52 of the noise-reduced image corresponding to the noisy sample image, a first pixel (i.e., a first feature vector) 53 and a second pixel (i.e., a second feature vector) 54 located at the same position in the two feature maps may constitute a positive sample pair; a third pixel (i.e., a third feature vector) 55 in the first feature map 51 of the noisy sample image is at a different position from the second pixel 54 in their respective feature maps, so the second pixel 54 and the third pixel 55 may constitute a negative sample pair. Optionally, the second pixel 54 may also form negative sample pairs with other pixels in the first feature map 51 of the noisy sample image.
(2) A first contrast loss is calculated based on the at least one first positive sample pair and the at least one first negative sample pair.
In some embodiments, the distance between the feature vectors in each sample pair is calculated, and the distances corresponding to the sample pairs are then combined by a cross-entropy function. To give the trained image noise reduction model good local-feature retention during image processing, one objective of the training process is to make the distance between the feature vectors in each positive sample pair smaller and smaller (meaning that pixels at corresponding positions become more and more similar, i.e., the features of local regions at the same position are kept as consistent as possible), and to make the distance between the feature vectors in each negative sample pair larger and larger (meaning that pixels at non-corresponding positions become more and more different, i.e., the features of local regions at different positions are kept as distinct as possible).
In one example, for the first feature map 52 of the noise-reduced image corresponding to the noisy sample image shown in fig. 5, the second pixel 54 can only form a positive sample pair with the first pixel 53 in the first feature map 51 of the noisy sample image, and forms negative sample pairs with the other pixels (those other than the first pixel 53) in the first feature map 51. It can be seen that the second pixel 54 corresponds to exactly one positive sample pair but may correspond to one or more negative sample pairs. Illustratively, for the second pixel 54, its corresponding second vector is denoted $v_q$ (the second pixel is the selected qth pixel in the first feature map 52 of the noise-reduced image corresponding to the noisy sample image); the first vector corresponding to the first pixel 53 that forms a positive sample pair with the second pixel 54 is denoted $v_q^{+}$; and the vectors corresponding to the m pixels that form negative sample pairs with the second pixel 54 are denoted $v_{q,1}^{-}, v_{q,2}^{-}, \ldots, v_{q,m}^{-}$. The specific value of m may be set by a person skilled in the art according to actual conditions, and is not specifically limited in this embodiment of the present application.
Then, the loss $\ell_q$ corresponding to the second pixel 54 can be calculated with reference to the following equation six:

$$\ell_q = -\log \frac{\exp\left(v_q \cdot v_q^{+} / \tau\right)}{\exp\left(v_q \cdot v_q^{+} / \tau\right) + \sum_{i=1}^{m} \exp\left(v_q \cdot v_{q,i}^{-} / \tau\right)} \qquad \text{(equation six)}$$
optionally, τ is a scaling constant, which may take an empirical value of 0.07. Of course, τ may also be other values, which is not specifically limited in this embodiment of the present application.
The feature vectors in each positive sample pair and each negative sample pair are K-dimensional real vectors, and the specific value of K is not specifically limited in the embodiment of the present application.
After the loss corresponding to each selected pixel in the first feature map 52 of the noise-reduced image corresponding to the noisy sample image has been calculated, the first contrast loss $L_{PCL}$ can be calculated with reference to the following formula seven:

$$L_{PCL} = \frac{1}{S} \sum_{q=1}^{S} \ell_q \qquad \text{(formula seven)}$$
s represents that S pixels are randomly selected from the first feature map 52 of the noise-reduced image corresponding to the noisy sample image, and are used to form a sample pair for calculating the first loss with the pixels in the first feature map 51 of the noisy sample image.
In some embodiments, the method further comprises the steps of:
1. under the condition that the training loss comprises a second contrast loss, extracting a first feature map of the noise-added image through the first coding network;
2. and calculating a second contrast loss based on the first feature map of the noise-free sample image and the first feature map of the noise-added image.
Optionally, calculating a second contrast loss based on the first feature map of the noise-free sample image and the first feature map of the noise-added image, including the following steps:
(1) obtaining at least one second positive sample pair and at least one second negative sample pair from the first feature map of the noise-free sample image and the first feature map of the noise-added image, wherein each second positive sample pair comprises a pair of feature vectors respectively from the same position in the first feature map of the noise-free sample image and the first feature map of the noise-added image, and each second negative sample pair comprises a pair of feature vectors respectively from different positions in the first feature map of the noise-free sample image and the first feature map of the noise-added image;
(2) a second contrast loss is calculated based on the at least one second positive sample pair and the at least one second negative sample pair.
For the explanation of the second contrast loss, reference may be made to the above description of the first contrast loss, and details thereof are not repeated here.
In some possible implementations, the training loss further includes a first regression loss, a second regression loss, a first adversarial loss, and a second adversarial loss. Step 307 further includes the following steps:
1. calculating a first regression loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image, wherein the first regression loss is used for measuring the overall similarity between the noise-free sample image and the noise-reduced image corresponding to the noise-added image;
2. calculating a second regression loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image, wherein the second regression loss is used for measuring the overall similarity between the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image;
3. calculating a first adversarial loss based on the noisy sample image and the noise-added image, wherein the first adversarial loss is used for measuring the realism of the noise-added image;
4. calculating a second adversarial loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image, wherein the second adversarial loss is used for measuring the realism of the noise-reduced image corresponding to the noise-free sample image;
5. calculating the training loss based on the first contrast loss, the second contrast loss, the first regression loss, the second regression loss, the first adversarial loss, and the second adversarial loss.
In some embodiments, the first contrast loss, the second contrast loss, the first regression loss, the second regression loss, the first adversarial loss, and the second adversarial loss are calculated respectively, and the training loss is calculated based on all six, so that during training the image noise reduction model tends to remove the image noise in the whole noisy image as cleanly as possible while keeping the local features of the image as unchanged as possible.
In some embodiments, before calculating the second regression loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image, the method further includes the following steps to generate a noise-reduced image corresponding to the noise-free sample image:
(1) extracting a first feature map of a noise-free sample image through a first coding network;
(2) and decoding the first feature map of the noise-free sample image through a first decoding network to generate a noise-reduced image corresponding to the noise-free sample image.
Optionally, the first regression loss and the second regression loss may be L1 losses, or may be other regression losses; this is not particularly limited in the embodiment of the present application.
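An L1 regression loss of the kind mentioned here is simply the mean absolute error between two images; a generic sketch (the mean reduction is a common choice, not mandated by the text):

```python
import numpy as np

def l1_loss(pred, target):
    """L1 (mean absolute error) regression loss between two images."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.abs(pred - target).mean()

# Identical images give zero loss; a uniform offset of 0.5 gives loss 0.5.
img = np.ones((4, 4))
zero_loss = l1_loss(img, img)
offset_loss = l1_loss(img + 0.5, img)
```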
In some embodiments, the first feature map of the noisy sample image is decoded through the second decoding network to obtain a processed noisy sample image; and a third regression loss is calculated based on the noisy sample image and the processed noisy sample image, wherein the third regression loss is used for measuring the overall similarity between the noisy sample image and the processed noisy sample image.
In some embodiments, a first difference image between the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image is obtained, and a second difference image between the noisy sample image and the noise-reduced image corresponding to the noisy sample image is obtained; a fourth regression loss is calculated based on the first difference image and the second difference image; the fourth regression loss is used to measure the overall similarity between the first difference image and the second difference image.
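One plausible reading of the fourth regression loss, sketched with an L1 distance between the two residuals. The pairing of the second difference image (noisy sample vs. its noise-reduced image) is an interpretation of a garbled sentence, and the function name is invented for this sketch:

```python
import numpy as np

def fourth_regression_loss(clean, denoised_clean, noisy, denoised_noisy):
    """Compare the residual removed from the clean image with the residual
    removed from the noisy image (assumed pairing of the two differences)."""
    diff1 = clean - denoised_clean   # first difference image
    diff2 = noisy - denoised_noisy   # second difference image (assumed)
    return np.abs(diff1 - diff2).mean()

clean = np.zeros((4, 4))
noisy = np.full((4, 4), 2.0)
# Perfect denoising of the clean image removes nothing (diff1 = 0); removing
# all noise from the noisy image leaves diff2 = 2 per pixel, so loss = 2.0.
loss = fourth_regression_loss(clean, np.zeros((4, 4)), noisy, np.zeros((4, 4)))
```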
Optionally, the first contrast loss, the second contrast loss, the first regression loss, the second regression loss, the third regression loss, the fourth regression loss, the first adversarial loss, and the second adversarial loss are weighted and summed to obtain the training loss.
The training loss can be calculated with reference to the following equation eight:

$$L = \lambda_1 L_{PCL}(x_a, \hat{x}_a) + \lambda_2 L_{PCL}(y, y_a) + \lambda_3 L_{reg}(y, \hat{y}_a) + \lambda_4 L_{reg}(y, \hat{y}) + \lambda_5 L_{reg}(x_a, \tilde{x}_a) + \lambda_6 L_{reg}(y - \hat{y},\; x_a - \hat{x}_a) + \lambda_7 L_{adv}(y_a) + \lambda_8 L_{adv}(\hat{y}) \qquad \text{(equation eight)}$$

wherein $x_a$ represents the noisy sample image, $\hat{x}_a$ represents the noise-reduced image corresponding to the noisy sample image, $\tilde{x}_a$ represents the processed noisy sample image, $y$ represents the noise-free sample image, $\hat{y}$ represents the noise-reduced image corresponding to the noise-free sample image, $y_a$ represents the noise-added image, and $\hat{y}_a$ represents the noise-reduced image corresponding to the noise-added image; $\lambda_1, \lambda_2, \ldots, \lambda_8$ are the weights corresponding to the respective losses.
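The weighted summation itself reduces to a few lines; the example weights below are purely illustrative, not values from the embodiment:

```python
def training_loss(losses, weights):
    """Weighted sum of the eight component losses described above."""
    assert len(losses) == len(weights) == 8
    return sum(w * l for w, l in zip(weights, losses))

# Order: contrast 1-2, regression 1-4, adversarial 1-2 (illustrative values).
loss = training_loss(
    losses=[0.5, 0.4, 1.0, 0.9, 0.8, 0.2, 0.3, 0.6],
    weights=[1, 1, 10, 10, 10, 1, 0.1, 0.1],
)
```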
Step 308: training the image noise reduction model according to the training loss.
The content of step 308 is the same as or similar to the content of step 105 in the embodiment of fig. 1, and is not described herein again.
In summary, according to the technical scheme provided by the embodiment of the application, by introducing PCL (Patch-wise Contrastive Learning), the image noise reduction model obtained by training removes the image noise in the whole noisy image as cleanly as possible while keeping the local features of the image as unchanged as possible.
In some possible implementations, the image noise in the noisy sample image is a metal artifact, the image noise reduction model is a de-artifact model, the noisy sample image is a sample image with an artifact, and the non-noisy sample image is a sample image without an artifact. Metal artifacts refer to artifacts that result from metal object interference. As shown in FIG. 6, the method can comprise the following steps (601-605):
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Referring to fig. 7, a block diagram of a training apparatus for an image noise reduction model according to an embodiment of the present application is shown. The device has the function of realizing the method example of the training of the image noise reduction model, and the function can be realized by hardware or by hardware executing corresponding software. The device may be the computer device described above, or may be provided on a computer device. The apparatus 700 may include: an image acquisition module 710, a feature addition module 720, an image generation module 730, a loss calculation module 740, and a model training module 750.
The image obtaining module 710 is configured to obtain at least one noisy sample image and at least one noiseless sample image.
The feature adding module 720 is configured to add the noise feature of the noisy sample image to the noiseless sample image, and generate a noisy image corresponding to the noiseless sample image.
The image generating module 730 is configured to perform noise reduction processing on the noise-added image by using the image noise reduction model, and generate a noise-reduced image corresponding to the noise-added image.
The loss calculating module 740 is configured to calculate a training loss of the image denoising model based on the noise-free sample image and the denoising image corresponding to the denoising image.
The model training module 750 is configured to train the image denoising model according to the training loss.
In summary, in the technical scheme provided by the embodiment of the application, the noise features of the noisy sample image are extracted and added to the noiseless sample image to generate a noise-added image corresponding to the noiseless sample image, so that the noisy sample image and the noise-added image become paired images during model training. The image noise reduction model then performs noise reduction on the noise-added image to generate a corresponding noise-reduced image, and the training loss is calculated based on the noiseless sample image and the noise-reduced image corresponding to the noise-added image to train the image noise reduction model. The training method of the image noise reduction model provided by the embodiment of the application can therefore train the model directly on unpaired sample images, without requiring paired sample images; since unpaired real sample images are plentiful and easy to obtain, the training effect of the image noise reduction model, and hence the noise reduction performance of the trained model, can be improved.
In an exemplary embodiment, the training system of the image noise reduction model includes: the image noise reduction model, the image generation model and the noise extraction network; wherein the image noise reduction model comprises a first encoding network and a first decoding network, and the image generation model comprises a second encoding network and a second decoding network; as shown in fig. 8, the image generating module 730 includes: a feature map extraction sub-module 731, a noise decoupling sub-module 732, and an image generation sub-module 733.
The feature map extracting sub-module 731 is configured to extract a second feature map of the noisy sample image through the second coding network, where the second feature map of the noisy sample image includes the noise feature.
The noise decoupling sub-module 732 is configured to decouple the noise feature from the second feature map of the noisy sample image through the noise extraction network.
The feature map extracting sub-module 731 is further configured to extract a first feature map of the noise-free sample image through the first coding network.
The image generation sub-module 733 is configured to generate, by the second decoding network, a noise-added image corresponding to the noise-free sample image according to the first feature map of the noise-free sample image and the noise feature.
In an exemplary embodiment, as shown in fig. 8, the noise decoupling sub-module 732 is configured to:
constructing an initial noise characteristic matrix, wherein the initial noise characteristic matrix is an expansion matrix of a second characteristic map of the noisy sample image;
performing multiple rounds of iterative update processing on the noise characteristic matrix through the noise extraction network to obtain an updated noise characteristic matrix; wherein each round of iterative update processing comprises at least one of the following: normalization processing, self-learning processing and linear projection processing;
and obtaining the noise characteristics based on the updated noise characteristic matrix.
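The three per-round operations can be sketched as follows. The attention-style "self-learning" step and the random projection weights are assumptions; the patent names only the operation types, not their exact form:

```python
import numpy as np

def extract_noise_features(second_feature_map, n_iters=3, seed=0):
    """Sketch of the noise extraction network: unfold the second feature map
    into the initial noise feature matrix, then refine it over several rounds
    of normalization, a self-attention-style update ("self-learning"), and a
    linear projection. Random weights stand in for learned parameters."""
    rng = np.random.default_rng(seed)
    M = second_feature_map.reshape(second_feature_map.shape[0], -1)  # expansion matrix
    W = rng.standard_normal((M.shape[1], M.shape[1])) * 0.1          # projection weights
    for _ in range(n_iters):
        M = (M - M.mean()) / (M.std() + 1e-6)          # normalization
        scores = M @ M.T
        scores -= scores.max(axis=1, keepdims=True)    # stabilize the softmax
        attn = np.exp(scores)
        attn /= attn.sum(axis=1, keepdims=True)
        M = attn @ M                                   # self-learning (attention) step
        M = M @ W                                      # linear projection
    return M

feat = np.random.default_rng(2).standard_normal((6, 4, 4))
noise = extract_noise_features(feat)
```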
In an exemplary embodiment, as shown in fig. 8, the image generation sub-module 733 is configured to:
performing fusion processing on the first feature map of the noise-free sample image and the noise feature to generate a fusion feature map;
and decoding the fusion characteristic graph through the second decoding network to generate a noise-added image corresponding to the noise-free sample image.
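A minimal sketch of the fusion-then-decode step. Channel concatenation is one plausible fusion operator (the text says only "fusion processing"), and the linear map is a stand-in for the second decoding network:

```python
import numpy as np

def fuse_and_decode(clean_features, noise_features, w_dec):
    """Fuse the noise-free image's first feature map with the noise feature
    (by concatenation, an assumed choice), then decode the fused feature map
    into the noise-added image."""
    fused = np.concatenate([clean_features, noise_features], axis=-1)
    return fused @ w_dec  # stand-in for the second decoding network

rng = np.random.default_rng(3)
clean_feat = rng.standard_normal((16, 4))   # first feature map of noise-free image
noise_feat = rng.standard_normal((16, 4))   # decoupled noise feature
W_dec2 = rng.standard_normal((8, 3)) * 0.1  # decoder weights (illustrative)
noisy_synth = fuse_and_decode(clean_feat, noise_feat, W_dec2)
```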
In an exemplary embodiment, as shown in fig. 8, the image generation sub-module 733 is further configured to:
extracting a first feature map of the noise-added image through the first coding network;
and decoding the first feature map of the noise-added image through the first decoding network to generate a noise-reduced image corresponding to the noise-added image.
In an exemplary embodiment, the training loss comprises a first contrast loss and/or a second contrast loss, the first contrast loss is used for measuring the local similarity between the noisy sample image and the noise-reduced image corresponding to the noisy sample image, and the second contrast loss is used for measuring the local similarity between the noise-free sample image and the noise-added image; as shown in fig. 8, the apparatus 700 further includes: a feature map extraction module 760.
The feature map extraction module 760 is configured to extract a first feature map of the noisy sample image through the first coding network if the training loss includes the first contrast loss. The image generating module 730 is further configured to perform decoding processing on the first feature map of the noisy sample image through the first decoding network, and generate a noise-reduced image corresponding to the noisy sample image. The feature map extracting module 760 is further configured to extract a first feature map of the noise-reduced image corresponding to the noisy sample image through the first coding network. The loss calculating module 740 is further configured to calculate the first contrast loss based on the first feature map of the noisy sample image and the first feature map of the noise-reduced image corresponding to the noisy sample image.
The feature map extracting module 760 is further configured to extract a first feature map of the noise-added image through the first coding network if the training loss includes the second contrast loss. The loss calculating module 740 is further configured to calculate the second contrast loss based on the first feature map of the noise-free sample image and the first feature map of the noise-added image.
In an exemplary embodiment, the loss calculation module 740 is configured to:
obtaining at least one first positive sample pair and at least one first negative sample pair from the first feature map of the noisy sample image and the first feature map of the noise-reduced image corresponding to the noisy sample image, wherein each first positive sample pair comprises a pair of feature vectors respectively from the same position in the first feature map of the noisy sample image and the first feature map of the noise-reduced image corresponding to the noisy sample image, and each first negative sample pair comprises a pair of feature vectors respectively from different positions in the first feature map of the noisy sample image and the first feature map of the noise-reduced image corresponding to the noisy sample image;
calculating the first contrast loss based on the at least one first positive sample pair and the at least one first negative sample pair.
In an exemplary embodiment, the loss calculation module 740 is configured to:
obtaining at least one second positive sample pair and at least one second negative sample pair from the first feature map of the noise-free sample image and the first feature map of the noise-added image, wherein each second positive sample pair comprises a pair of feature vectors from the same position in the first feature map of the noise-free sample image and the first feature map of the noise-added image respectively, and each second negative sample pair comprises a pair of feature vectors from different positions in the first feature map of the noise-free sample image and the first feature map of the noise-added image respectively;
calculating the second contrast loss based on the at least one second positive sample pair and the at least one second negative sample pair.
In an exemplary embodiment, the training loss further includes a first regression loss, a second regression loss, a first adversarial loss, and a second adversarial loss; the loss calculation module 740 is configured to:
calculating the first regression loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image, wherein the first regression loss is used for measuring the overall similarity between the noise-free sample image and the noise-reduced image corresponding to the noise-added image;
calculating the second regression loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image, wherein the second regression loss is used for measuring the overall similarity between the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image;
calculating the first adversarial loss based on the noisy sample image and the noise-added image, the first adversarial loss being used for measuring the realism of the noise-added image;
calculating the second adversarial loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image, wherein the second adversarial loss is used for measuring the realism of the noise-reduced image corresponding to the noise-free sample image;
calculating the training loss based on the first contrast loss, the second contrast loss, the first regression loss, the second regression loss, the first adversarial loss, and the second adversarial loss.
It should be noted that, when the apparatus provided in the foregoing embodiment implements the functions thereof, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above. In addition, the apparatus and method embodiments provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments for details, which are not described herein again.
Referring to fig. 9, a block diagram of a computer device according to an embodiment of the present application is shown. The computer device is used for implementing the training method of the image noise reduction model provided in the above embodiment. Specifically, the method comprises the following steps:
the computer apparatus 900 includes a CPU (Central Processing Unit) 901, a system Memory 904 including a RAM (Random Access Memory) 902 and a ROM (Read-Only Memory) 903, and a system bus 905 connecting the system Memory 904 and the Central Processing Unit 901. The computer device 900 also includes a basic I/O (Input/Output) system 906, which facilitates the transfer of information between devices within the computer, and a mass storage device 907 for storing an operating system 913, application programs 914, and other program modules 915.
The basic input/output system 906 includes a display 908 for displaying information and an input device 909, such as a mouse or keyboard, for a user to input information. The display 908 and the input device 909 are connected to the central processing unit 901 via an input-output controller 910 connected to the system bus 905. The basic input/output system 906 may also include the input/output controller 910 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, the input-output controller 910 also provides output to a display screen, a printer, or other type of output device.
The mass storage device 907 is connected to the central processing unit 901 through a mass storage controller (not shown) connected to the system bus 905. The mass storage device 907 and its associated computer-readable media provide non-volatile storage for the computer device 900. That is, the mass storage device 907 may include a computer-readable medium (not shown) such as a hard disk or CD-ROM (Compact disk Read-Only Memory) drive.
Without loss of generality, the computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash Memory or other solid state Memory technology, CD-ROM, DVD (Digital Video Disc) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer storage media is not limited to the foregoing. The system memory 904 and mass storage device 907 described above may be collectively referred to as memory.
According to various embodiments of the present application, the computer device 900 may also operate by being connected, through a network such as the Internet, to a remote computer on the network. That is, the computer device 900 may be connected to the network 912 through the network interface unit 911 coupled to the system bus 905, or the network interface unit 911 may be used to connect to other types of networks or remote computer systems (not shown).
In an exemplary embodiment, a computer readable storage medium is also provided, in which at least one instruction, at least one program, a set of codes, or a set of instructions is stored, which when executed by a processor, implements the above-mentioned training method of the image noise reduction model.
Optionally, the computer-readable storage medium may include: ROM (Read-Only Memory), RAM (Random-Access Memory), SSD (Solid State drive), or optical disk. The Random Access Memory may include a ReRAM (resistive Random Access Memory) and a DRAM (Dynamic Random Access Memory).
In an exemplary embodiment, a computer program product or computer program is also provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the training method of the image noise reduction model.
It should be understood that reference to "a plurality" herein means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
The above description is only exemplary of the present application and should not be taken as limiting the present application, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the present application should be included in the protection scope of the present application.
Claims (10)
1. A method for training an image noise reduction model, the method comprising:
acquiring at least one noisy sample image and at least one noise-free sample image;
adding the noise characteristics of the noisy sample image into the noiseless sample image to generate a noisy image corresponding to the noiseless sample image;
carrying out noise reduction processing on the noise-added image by adopting the image noise reduction model to generate a noise-reduced image corresponding to the noise-added image;
calculating the training loss of the image noise reduction model based on the noise-free sample image and the noise reduction image corresponding to the noise adding image;
and training the image noise reduction model according to the training loss.
2. The method of claim 1, wherein the training system of the image denoising model comprises: the image noise reduction model, the image generation model and the noise extraction network; wherein the image noise reduction model comprises a first encoding network and a first decoding network, and the image generation model comprises a second encoding network and a second decoding network;
adding the noise characteristics of the noisy sample image to the noiseless sample image to generate a noisy image corresponding to the noiseless sample image, including:
extracting a second feature map of the noisy sample image through the second encoding network, the second feature map of the noisy sample image comprising the noise feature;
decoupling the noise feature from the second feature map of the noisy sample image through the noise extraction network;
extracting a first feature map of the noise-free sample image through the first encoding network; and
generating, through the second decoding network, the noise-added image corresponding to the noise-free sample image according to the first feature map of the noise-free sample image and the noise feature.
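This four-step data flow can be sketched with hypothetical toy functions; `encode`, `extract_noise`, and `decode` below are illustrative stand-ins (not the patent's learned networks) for the second encoding network, noise extraction network, first encoding network, and second decoding network, respectively.

```python
import numpy as np

def encode(img, filters):
    # "Encoder" stand-in: correlate the image with each filter
    # (stride 1, valid padding) to produce a (C, H', W') feature map.
    h, w = img.shape
    fh, fw = filters[0].shape
    return np.stack([
        np.array([[np.sum(img[i:i + fh, j:j + fw] * f)
                   for j in range(w - fw + 1)]
                  for i in range(h - fh + 1)])
        for f in filters
    ])

def extract_noise(feature_map):
    # "Noise extraction" stand-in: keep the zero-mean residual per channel.
    return feature_map - feature_map.mean(axis=(1, 2), keepdims=True)

def decode(feature_map, noise_feature):
    # "Second decoder" stand-in: average channels of the fused features.
    return (feature_map + noise_feature).mean(axis=0)

rng = np.random.default_rng(3)
filters = [rng.standard_normal((3, 3)) for _ in range(2)]
noisy_sample = rng.random((10, 10))
clean_sample = rng.random((10, 10))

second_feat = encode(noisy_sample, filters)   # second feature map (noisy)
noise_feat = extract_noise(second_feat)       # decoupled noise feature
first_feat = encode(clean_sample, filters)    # first feature map (clean)
noise_added = decode(first_feat, noise_feat)  # noise-added image
```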
3. The method of claim 2, wherein decoupling the noise feature from the second feature map of the noisy sample image through the noise extraction network comprises:
constructing an initial noise feature matrix, the initial noise feature matrix being an expanded matrix of the second feature map of the noisy sample image;
performing multiple rounds of iterative update processing on the noise feature matrix through the noise extraction network to obtain an updated noise feature matrix, each round of iterative update processing comprising at least one of: normalization processing, self-learning processing, and linear projection processing; and
obtaining the noise feature based on the updated noise feature matrix.
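The iterative refinement in claim 3 can be sketched as follows. The concrete `normalize`, `self_learn` (an attention-style reweighting), and `project` operations are assumptions for illustration; the patent's corresponding operations are learned and not specified here.

```python
import numpy as np

rng = np.random.default_rng(1)

def normalize(m):
    # Normalization processing: L2-normalize each row of the matrix.
    return m / (np.linalg.norm(m, axis=1, keepdims=True) + 1e-8)

def self_learn(m, feature):
    # Self-learning processing stand-in: reweight the feature rows by
    # their similarity to the current noise bases (soft attention).
    attn = np.exp(m @ feature.T)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ feature

def project(m, w):
    # Linear projection processing.
    return m @ w

feature = rng.standard_normal((64, 16))      # expanded second feature map
noise_matrix = rng.standard_normal((4, 16))  # initial noise feature matrix
w = np.eye(16)                               # stand-in projection matrix

for _ in range(3):                           # multiple update rounds
    noise_matrix = normalize(noise_matrix)
    noise_matrix = self_learn(noise_matrix, feature)
    noise_matrix = project(noise_matrix, w)
```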
4. The method of claim 2, wherein generating, through the second decoding network, the noise-added image corresponding to the noise-free sample image according to the first feature map of the noise-free sample image and the noise feature comprises:
fusing the first feature map of the noise-free sample image with the noise feature to generate a fused feature map; and
decoding the fused feature map through the second decoding network to generate the noise-added image corresponding to the noise-free sample image.
5. The method of claim 2, wherein the training loss comprises a first contrast loss and/or a second contrast loss, the first contrast loss measuring a local similarity between the noisy sample image and a noise-reduced image corresponding to the noisy sample image, and the second contrast loss measuring a local similarity between the noise-free sample image and the noise-added image; and the method further comprises:
in a case that the training loss comprises the first contrast loss: extracting a first feature map of the noisy sample image through the first encoding network; decoding the first feature map of the noisy sample image through the first decoding network to generate the noise-reduced image corresponding to the noisy sample image; extracting a first feature map of the noise-reduced image corresponding to the noisy sample image through the first encoding network; and calculating the first contrast loss based on the first feature map of the noisy sample image and the first feature map of the noise-reduced image corresponding to the noisy sample image;
in a case that the training loss comprises the second contrast loss: extracting a first feature map of the noise-added image through the first encoding network; and calculating the second contrast loss based on the first feature map of the noise-free sample image and the first feature map of the noise-added image.
6. The method of claim 5, wherein calculating the second contrast loss based on the first feature map of the noise-free sample image and the first feature map of the noise-added image comprises:
obtaining at least one second positive sample pair and at least one second negative sample pair from the first feature map of the noise-free sample image and the first feature map of the noise-added image, each second positive sample pair comprising a pair of feature vectors taken from the same position in the two feature maps, and each second negative sample pair comprising a pair of feature vectors taken from different positions in the two feature maps; and
calculating the second contrast loss based on the at least one second positive sample pair and the at least one second negative sample pair.
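One common way to realize such a positive/negative-pair contrast loss is a patchwise InfoNCE objective; the cosine similarity and temperature below are illustrative assumptions, not values from the patent.

```python
import numpy as np

def patch_contrast_loss(feat_a, feat_b, tau=0.07):
    # feat_a, feat_b: (N, C), one row per spatial position.
    # Same-position rows form positive pairs; all cross-position
    # combinations serve as negative pairs.
    feat_a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    feat_b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    logits = feat_a @ feat_b.T / tau          # (N, N) pairwise similarities
    # Row-wise log-softmax; diagonal entries are the positive pairs.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(2)
clean_feat = rng.standard_normal((16, 8))   # noise-free sample feature map
noise_added_feat = clean_feat + 0.05 * rng.standard_normal((16, 8))
loss = patch_contrast_loss(clean_feat, noise_added_feat)
```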
7. The method of claim 5, wherein the training loss further comprises a first regression loss, a second regression loss, a first countermeasure loss, and a second countermeasure loss; and
calculating the training loss of the image noise reduction model based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image comprises:
calculating the first regression loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image, the first regression loss measuring an overall similarity between the noise-free sample image and the noise-reduced image corresponding to the noise-added image;
calculating the second regression loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image, the second regression loss measuring an overall similarity between the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image;
calculating the first countermeasure loss based on the noisy sample image and the noise-added image, the first countermeasure loss measuring how realistic the noise-added image is;
calculating the second countermeasure loss based on the noise-free sample image and the noise-reduced image corresponding to the noise-free sample image, the second countermeasure loss measuring how realistic the noise-reduced image corresponding to the noise-free sample image is; and
calculating the training loss based on the first contrast loss, the second contrast loss, the first regression loss, the second regression loss, the first countermeasure loss, and the second countermeasure loss.
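A combination of six component losses of this kind is typically a weighted sum; the weights and component values below are illustrative hyperparameters, not figures from the patent.

```python
# Illustrative total-loss combination for claim 7; every number here is a
# placeholder, not a value disclosed in the patent.

def total_loss(losses, weights):
    # Weighted sum of the six component losses.
    return sum(w * l for w, l in zip(weights, losses))

components = {
    "first_contrast": 0.8, "second_contrast": 0.6,
    "first_regression": 0.4, "second_regression": 0.3,
    "first_countermeasure": 0.2, "second_countermeasure": 0.1,
}
weights = [1.0, 1.0, 10.0, 10.0, 0.5, 0.5]  # hypothetical loss weights
loss = total_loss(components.values(), weights)
```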
8. An apparatus for training an image noise reduction model, the apparatus comprising:
an image acquisition module, configured to acquire at least one noisy sample image and at least one noise-free sample image;
a feature adding module, configured to add a noise feature of the noisy sample image to the noise-free sample image to generate a noise-added image corresponding to the noise-free sample image;
an image generation module, configured to perform noise reduction on the noise-added image using the image noise reduction model to generate a noise-reduced image corresponding to the noise-added image;
a loss calculation module, configured to calculate a training loss of the image noise reduction model based on the noise-free sample image and the noise-reduced image corresponding to the noise-added image; and
a model training module, configured to train the image noise reduction model according to the training loss.
9. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the method of training an image noise reduction model according to any of the claims 1 to 7.
10. A computer-readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of training an image noise reduction model according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110662101.5A CN113822289A (en) | 2021-06-15 | 2021-06-15 | Training method, device and equipment of image noise reduction model and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110662101.5A CN113822289A (en) | 2021-06-15 | 2021-06-15 | Training method, device and equipment of image noise reduction model and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113822289A true CN113822289A (en) | 2021-12-21 |
Family
ID=78923863
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110662101.5A Pending CN113822289A (en) | 2021-06-15 | 2021-06-15 | Training method, device and equipment of image noise reduction model and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113822289A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023202447A1 (en) * | 2022-04-20 | 2023-10-26 | 中兴通讯股份有限公司 | Method for training image quality improvement model, and method for improving image quality of video conference system |
CN115631103A (en) * | 2022-10-17 | 2023-01-20 | 北京百度网讯科技有限公司 | Training method and device of image generation model, and image generation method and device |
CN115631103B (en) * | 2022-10-17 | 2023-09-05 | 北京百度网讯科技有限公司 | Training method and device for image generation model, and image generation method and device |
TWI857668B (en) * | 2023-06-20 | 2024-10-01 | 艾陽科技股份有限公司 | System for ultrasonic image enhancement processing and method thereof |
CN116563556A (en) * | 2023-07-05 | 2023-08-08 | 杭州海康威视数字技术股份有限公司 | Model training method |
CN116563556B (en) * | 2023-07-05 | 2023-11-10 | 杭州海康威视数字技术股份有限公司 | Model training method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110827216B (en) | Multi-generator generation countermeasure network learning method for image denoising | |
CN113822289A (en) | Training method, device and equipment of image noise reduction model and storage medium | |
CN108537794B (en) | Medical image data processing method, apparatus and computer readable storage medium | |
CN111597946A (en) | Processing method of image generator, image generation method and device | |
CN114863225B (en) | Image processing model training method, image processing model generation device, image processing model equipment and image processing model medium | |
CN113822792A (en) | Image registration method, device, equipment and storage medium | |
EP4298590A2 (en) | Actor-critic approach for generating synthetic images | |
CN113592769A (en) | Abnormal image detection method, abnormal image model training method, abnormal image detection device, abnormal image model training device and abnormal image model training medium | |
CN115115575A (en) | Image detection method and device, computer equipment and storage medium | |
CN116997928A (en) | Method and apparatus for generating anatomical model using diagnostic image | |
CN111080676B (en) | A Method for Tracking Feature Points of Endoscopic Image Sequences by Online Classification | |
Lin et al. | A desmoking algorithm for endoscopic images based on improved U‐Net model | |
CN115115736A (en) | Image artifact removing method, device and equipment and storage medium | |
CN117974744A (en) | Coronary three-dimensional reconstruction method based on generation of antagonism network and neural network radiation field | |
CN111598904B (en) | Image segmentation method, device, equipment and storage medium | |
CN116758098A (en) | Hypothalamic nucleus segmentation method and model construction method of magnetic resonance image | |
CN115115900A (en) | Image reconstruction model training method, device, equipment, medium and program product | |
CN115909016A (en) | System, method, electronic device, and medium for analyzing fMRI image based on GCN | |
CN115965785A (en) | Image segmentation method, device, equipment, program product and medium | |
CN116205844A (en) | A Fully Automatic Cardiac MRI Segmentation Method Based on Expanded Residual Networks | |
Dagli et al. | Nerf-us: Removing ultrasound imaging artifacts from neural radiance fields in the wild | |
CN115719438B (en) | De-artifact model training method, device and equipment for medical image and storage medium | |
CN118037755B (en) | Focus segmentation domain generalization method and system based on double space constraint | |
Mamo et al. | Advancing Medical Imaging Through Generative Adversarial Networks: A Comprehensive Review and Future Prospects | |
WO2024108438A1 (en) | Motion artifact correction method for velocity encoding magnetic resonance imaging |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||