CN112541864A - Image restoration method based on a multi-scale generative adversarial network model - Google Patents

Publication number
CN112541864A
Authority
CN
China
Prior art keywords
image
discriminator
generator
network
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011021917.1A
Other languages
Chinese (zh)
Inventor
邵明文
张文龙
宋晓霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong To Letter Information Science And Technology Ltd
China University of Petroleum East China
Original Assignee
Shandong To Letter Information Science And Technology Ltd
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong To Letter Information Science And Technology Ltd, China University of Petroleum East China filed Critical Shandong To Letter Information Science And Technology Ltd
Priority to CN202011021917.1A priority Critical patent/CN112541864A/en
Publication of CN112541864A publication Critical patent/CN112541864A/en
Pending legal-status Critical Current

Classifications

    • G06T5/77
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20076 Probabilistic image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The invention belongs to the technical field of image restoration and discloses an image restoration method and system based on a multi-scale generative adversarial network model. A deep generative adversarial inpainting model consisting of a generator and adversarial discriminators is constructed, and the missing content is synthesized from random noise using a reconstruction loss and adversarial losses. The discriminator network structure is improved by proposing a multi-scale discriminator structure, which is trained adversarially to repair the image. The repaired image is post-processed with Poisson blending. The advantages of the image restoration algorithm based on the generative adversarial network model, and its restoration quality, are then verified. The method uses a multi-scale generative adversarial inpainting model and synthesizes the missing content from random noise using a reconstruction loss and multiple adversarial losses; following the idea of WGAN, it uses the EM (earth mover's) distance to model the data distribution, which improves network stability and the quality of the restored pictures.

Description

Image restoration method based on a multi-scale generative adversarial network model
Technical Field
The invention belongs to the technical field of image restoration, and particularly relates to an image restoration method based on a multi-scale generative adversarial network model.
Background
With the rapid development of deep learning in computer vision, research on image editing and image generation has achieved significant success. Image restoration is a current research hotspot in deep learning and is of great practical importance. Existing image restoration methods, however, suffer from various problems, and their results are often not visually satisfactory.
Image inpainting is a classic problem in graphics: a region of some size is missing at some position in an image, and the missing region must be recovered from other information so that a viewer cannot tell which part was repaired. As shown in fig. 8, the missing regions of the two images contain a cup and flowers, respectively, and a person can easily complete them from the surrounding image content. Because different people would repair an image differently, the repair process must follow principles such as structural similarity, texture consistency, and structure priority. For a computer, however, image restoration is extremely difficult, because the problem has no unique solution; how to use other information to assist the repair, and how to judge whether a result is realistic enough, are the questions that concern researchers.
At present, image restoration algorithms fall into three main directions; the invention is mainly concerned with image restoration algorithms based on deep learning. Early image inpainting methods, such as that of Bertalmio et al., use diffusion equations to iteratively propagate low-level features of known regions along the mask boundary into unknown regions. Although they perform well, they are limited to small, uniform regions. Introducing texture synthesis further improved the repair quality. Zoran and Weiss recover images with missing pixels by learning a patch prior. In recent years, convolutional neural networks (CNNs) have greatly improved performance on tasks such as image classification, object detection, and semantic segmentation. Ren et al. learned a convolutional network that greatly improves image restoration through an efficient patch matching algorithm. It performs well when a similar patch is found, but is likely to fail when the dataset does not contain enough data to fill the unknown region, since in object repair each part may be unique and no plausible patch for the missing region can be found. Although external databases can alleviate this problem, they raise the further problem of learning a high-level representation of a particular object class for patch matching. Wright et al. treat image inpainting as the task of recovering a sparse signal from the input: by solving a sparse linear system, an image can be repaired from a corrupted input. However, such algorithms require the image to be highly structured, whereas the goal of image inpainting is for algorithms to repair images without strict constraints. Vincent et al. introduced the denoising autoencoder, which learns to reconstruct a clean signal from a corrupted input.
Dosovitskiy et al. demonstrated that object images can be reconstructed by inverting deep convolutional network features through a decoder network. Kingma et al. proposed variational autoencoders (VAEs), which impose a prior on the latent units so that images can be generated by sampling from, or interpolating between, latent units. However, images generated by VAEs are often blurry because the training objective is based on a pixel-wise Gaussian likelihood.
Larsen et al. improved the VAE by adding an adversarially trained discriminator from a generative adversarial network and demonstrated that more realistic images can be generated. The work closest to the present one is the method of Deepak et al., which uses an autoencoder to combine visual representation learning with image restoration. The pictures restored by this method are not ideal in some cases: the restored region is inconsistent with the rest of the picture, and the result is poor at the edges of the restored region. In 2017, Yang et al. proposed a multi-scale neural patch synthesis method based on joint optimization of image content and texture constraints. It not only preserves the context structure but also generates high-frequency details by matching and adapting patches to the most similar intermediate-layer feature correlations, and achieved the best repair accuracy of its time on high-resolution images. Gao et al. studied the weakness of traditional "fixed" models and proposed an on-demand learning algorithm for training image restoration models with deep convolutional neural networks; its main idea is to use a feedback mechanism to generate the training examples the model needs most, and thereby learn a model that generalizes across difficulty levels. To address the problems of the Context Encoder model, Iizuka et al. of Waseda University extended the design to two discriminators: trained global and local context discriminators distinguish real images from repaired ones, so that the network generates images that are both locally and globally consistent.
Liu et al. argued that existing deep-learning-based image inpainting methods apply standard convolutions to the corrupted image, so that the convolution filter responses are conditioned on both the valid pixels and the substitute values (typically the mean) in the missing regions, which often leads to artifacts such as color discrepancy and blurring. They proposed partial convolutions, which can inpaint arbitrary non-central, irregular regions. In 2018, Yan et al. proposed the "Shift-Net" model for filling missing regions of any shape with sharp structures and fine textures. The encoder features of the known region are shifted to serve as an estimate of the missing part, and a guidance loss on the decoder features is introduced to minimize the distance between the decoder features after the fully connected layer and the ground-truth encoder features of the missing part. With this constraint, the decoder features of the missing region can be used to guide the shifting of the encoder features of the known region.
In summary, researchers at home and abroad have proposed many image restoration methods, but most of them yield results of low precision and leave considerable room for improvement. To address the low accuracy, visually inconsistent results, and unstable training of existing methods, the invention uses a multi-scale generative adversarial network model to obtain repaired images with high precision, high accuracy, and strong visual consistency.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) Early image inpainting methods use diffusion equations to iteratively propagate low-level features of known regions along the mask boundary into unknown regions; although they perform well, they are limited to small, uniform regions.
(2) Existing image inpainting methods based on convolutional neural networks perform well when a similar patch is found, but are likely to fail when the dataset does not contain enough data to fill the unknown region.
(3) Methods that treat image inpainting as the recovery of a sparse signal from the input can repair an image from a corrupted input by solving a sparse linear system, but they require the image to be highly structured.
(4) A variational autoencoder imposes a prior on the latent units so that images can be generated by sampling or interpolation. However, images generated by VAEs are often blurry because the training objective is based on a pixel-wise Gaussian likelihood.
The difficulty in solving the above problems and defects is:
When the damaged region of the image is large, the repair quality is poor, and the global and local consistency of the repaired image cannot be maintained, so the repaired image lacks integrity.
The significance of solving the above problems and defects is:
The stability of image restoration is improved, repaired images with high precision, high accuracy, and strong visual consistency are obtained, and the restoration quality is improved.
Disclosure of Invention
To address the problems in the prior art, the invention provides an image restoration method based on a multi-scale generative adversarial network model.
The invention is realized as an image restoration method based on a multi-scale generative adversarial network model, comprising the following steps:
Step one, constructing a deep generative adversarial inpainting model consisting of a generator and an adversarial discriminator, and synthesizing the missing content from random noise using a reconstruction loss and an adversarial loss.
Step two, improving the network structure of the discriminator by proposing a multi-scale discriminator structure on the basis of the global discriminator and the local discriminator, and adversarially training the multi-scale discriminators on images of different resolutions to repair the image.
Step three, using dilated convolution in the generator, and post-processing the repaired image with Poisson blending.
Step four, verifying the advantages of the image restoration algorithm based on the generative adversarial network model, and its restoration quality, on the CelebA, ImageNet and Places2 datasets.
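The text does not describe how the Poisson blending in step three is carried out. As a minimal one-dimensional sketch of the general technique only, the code below solves the discrete Poisson equation inside a region so that the result keeps the gradients of the pasted (source) signal while matching the target signal at the region boundary. The function name, the Gauss-Seidel solver, and the iteration count are assumptions made for illustration, not details taken from the patent.

```python
def poisson_blend_1d(target, source, start, end, iterations=2000):
    """Blend source[start:end] into target (hypothetical 1-D illustration).

    Inside [start, end) the result has the same second differences as
    `source`; outside it equals `target`, which fixes the boundary values.
    Requires 1 <= start < end <= len(target) - 1.
    """
    result = list(target)
    for _ in range(iterations):  # Gauss-Seidel sweeps
        for i in range(start, end):
            laplacian = source[i - 1] - 2 * source[i] + source[i + 1]
            result[i] = (result[i - 1] + result[i + 1] - laplacian) / 2
    return result


if __name__ == "__main__":
    target = [0.0, 9.0, 9.0, 9.0, 9.0, 5.0]
    source = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]  # constant slope of 1
    print(poisson_blend_1d(target, source, 1, 5))
```

In this example the source has a constant slope, so the blended interior becomes a straight ramp between the two boundary values of the target: the brightness offset of the source disappears while its gradient structure is kept.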
Further, in step one, the multi-scale adversarial network model comprises a generator network for image restoration and four additional discriminator networks that assist training: two multi-scale discriminator networks, a global discriminator network, and a local discriminator network.
Further, in step one, the generator uses a convolutional autoencoder as the generator model
Figure BDA0002700901950000054
i.e., a standard encoder-decoder architecture. The encoder takes an image with missing regions as input and produces a latent feature representation of the image through convolution operations. The decoder uses this latent representation to restore the original resolution through transposed convolutions, producing the image content of the missing region. Unlike the original GAN model, which starts directly from a noise vector, the hidden representation obtained from the encoder captures more of the variations and relationships between the unknown and known regions, and is then fed to the decoder to generate the content. The intermediate layers use dilated convolution, so that each output pixel is computed from a larger input area without additional parameters or computation; compared with standard convolutional layers, a dilated convolutional network computes each output pixel under the influence of a larger pixel area of the input image. Without dilated convolution, only a small pixel area would be used, and the image could not be synthesized with as much context information. The generator uses a standard autoencoder network with added dilated convolutional layers; that is, two convolutional layers in the middle of the generator network are removed. The layer type, kernel size, kernel zero padding, stride, and number of output channels are listed from left to right.
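To illustrate why dilated convolution lets each output pixel see a larger input area with the same number of weights, the sketch below computes the receptive field of a stack of convolutional layers. The layer configurations (four 3×3 layers, dilation rates 2, 4, 8 and 16) are assumptions chosen for illustration; the text does not state the rates used in the generator.

```python
def receptive_field(layers):
    """Receptive field, in input pixels, of a stack of conv layers.

    Each layer is a (kernel_size, stride, dilation) tuple.
    """
    field, jump = 1, 1  # current receptive field and input spacing of outputs
    for kernel, stride, dilation in layers:
        effective_kernel = dilation * (kernel - 1) + 1
        field += (effective_kernel - 1) * jump
        jump *= stride
    return field


if __name__ == "__main__":
    standard = [(3, 1, 1)] * 4                      # four ordinary 3x3 layers
    dilated = [(3, 1, d) for d in (2, 4, 8, 16)]    # hypothetical dilation rates
    print(receptive_field(standard))  # 9
    print(receptive_field(dilated))   # 61
```

Both stacks have the same number of weights per layer, but the dilated stack covers 61 input pixels per output pixel instead of 9.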
Further, in step one, the discriminators are based on convolutional neural networks and compress an image into a small feature vector; the prediction corresponds to the probability that the image is real.
First, a local discriminator
Figure BDA0002700901950000051
determines whether the synthesized content of the missing region is real. This helps the network generate the missing content and encourages the generated objects to be semantically valid. Because the local discriminator is local, its limitations are also apparent: the local discriminator loss can neither regularize the global structure of a face nor ensure consistency between the inside and the outside of the missing region, so the pixel values of the repaired picture are visibly inconsistent along the boundary of the repaired region. Because of this limitation, another network structure, the global discriminator, is introduced
Figure BDA0002700901950000052
to judge the realism of the image as a whole.
Finally, a multi-scale discriminator network structure is proposed. The basic idea is to down-sample the real and synthesized images by factors of 2 and 4, respectively, and to train two discriminators
Figure BDA0002700901950000053
that distinguish the real image from the restored image at the two different scales. The generator's repair process is thus tightly constrained by two discriminator networks that take images of different resolutions as input. The two multi-scale discriminators have architectures similar to that of the global discriminator but different receptive fields. Compared with using the global discriminator alone, training with the global discriminator combined with the multi-scale discriminators guides the generator to produce repaired pictures with stronger global consistency and finer details, so that the overall result is visually more plausible. Adding the two multi-scale discriminators to the network therefore yields better restored pictures.
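The down-sampling that feeds the two multi-scale discriminators can be sketched as follows. The text only gives the factors 2 and 4; the use of average pooling, the function names, and the list-of-rows image representation are assumptions for illustration.

```python
def downsample(image, factor):
    """Average-pool a 2-D grayscale image (list of rows) by an integer factor."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the factor")
    pooled = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [image[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def multiscale_inputs(image):
    """Return the images seen by the two multi-scale discriminators."""
    return downsample(image, 2), downsample(image, 4)


if __name__ == "__main__":
    image = [[float(x) for x in range(8)] for _ in range(8)]
    half, quarter = multiscale_inputs(image)
    print(len(half), len(quarter))  # 4 2
```

The same function would be applied to both the real image and the generator output, so that each multi-scale discriminator compares the two at its own resolution.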
The last two fully connected layers are removed from the global discriminator and the local discriminator in the model; the rest of their structure is unchanged. The global, local, and multi-scale discriminator architectures are shown in table 2, which lists, from left to right, the layer type, kernel size, stride, and number of output channels. a, b, c and d are, respectively,
Figure BDA0002700901950000061
Further, in step one, the loss functions are modeled as follows:
First, a reconstruction loss is introduced for the generator
Figure BDA0002700901950000062
which is responsible for capturing the structural information of the missing region and keeping it consistent with the context; it is the L2 distance between the pixels of the restored image and the original image, where z is the noise mask:
Figure BDA0002700901950000063
However, when only the loss
Figure BDA0002700901950000064
is used, the restored image content is observed to be blurry and overly smooth. The reason is that the L2 loss penalizes outliers heavily, which encourages the network to average over the various hypotheses to avoid large penalties. An adversarial loss is therefore introduced through a discriminator; it reflects how far the generator can fool the discriminator and how well the discriminator distinguishes real from fake. The adversarial loss is based on the GAN loss, which learns an adversarial discriminator model
Figure BDA0002700901950000065
that provides a loss gradient for the generator model. The adversarial discriminator
Figure BDA0002700901950000066
makes predictions on both the samples generated by the generator
Figure BDA0002700901950000067
and the real samples and tries to tell them apart, while the generator
Figure BDA0002700901950000068
produces samples that are as "real" as possible in order to confuse the discriminator
Figure BDA0002700901950000069
Figure BDA00027009019500000610
where P_data(x) and P_z(z) denote the distributions of the real data x and the noise variable z, respectively. The network is optimized by minimizing the generator loss and maximizing the discriminator loss.
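The two formulas above survive only as image placeholders in this text. As a sketch, assuming they take the standard forms that the surrounding description implies (an L2 reconstruction loss and the original GAN minimax objective), they could be written as below. The symbols G (generator), D (discriminator) and L_r (reconstruction loss) are assumed names; the exact notation of the patent figures cannot be recovered here.

```latex
% Assumed form of the reconstruction loss: L2 distance between the
% real image x and the generator output for the masked input z
\mathcal{L}_{r} = \lVert x - G(z) \rVert_{2}^{2}

% Assumed form of the adversarial objective (standard GAN minimax game)
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim P_{data}(x)} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{z \sim P_{z}(z)} \bigl[ \log \bigl( 1 - D(G(z)) \bigr) \bigr]
```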
Further, the GAN is trained using the Wasserstein distance as the optimization objective. Specifically, in
Figure BDA00027009019500000611
the sigmoid in the last layer is removed; the loss functions of
Figure BDA00027009019500000612
and
Figure BDA00027009019500000613
do not take the logarithm; and each time the parameters of
Figure BDA00027009019500000614
are updated, their absolute values are clipped so that they do not exceed a fixed constant (weight clipping):
Figure BDA0002700901950000071
where
Figure BDA00027009019500000718
is the set of 1-Lipschitz functions.
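The three WGAN changes described above (no sigmoid, no logarithm, clipping the discriminator parameters after each update) can be sketched with plain lists of scores and weights. The function names and the clipping constant 0.01 are assumptions for illustration; the text only says that a fixed constant is used.

```python
def critic_loss(real_scores, fake_scores):
    """Wasserstein discriminator loss to minimize.

    Scores are raw network outputs: no sigmoid and no logarithm is applied.
    """
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    return mean_fake - mean_real


def generator_loss(fake_scores):
    """Wasserstein generator loss: raise the scores of generated samples."""
    return -sum(fake_scores) / len(fake_scores)


def clip_weights(weights, limit=0.01):
    """Clamp every parameter into [-limit, limit] after an update.

    The value 0.01 is an assumed constant, not taken from the text.
    """
    return [max(-limit, min(limit, w)) for w in weights]


if __name__ == "__main__":
    print(critic_loss([1.0, 3.0], [0.0, 1.0]))   # -1.5
    print(clip_weights([0.5, -0.2, 0.005]))      # [0.01, -0.01, 0.005]
```

Clipping keeps the discriminator within a bounded family of functions, which is a crude way of enforcing the Lipschitz constraint that the Wasserstein formulation requires.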
The loss functions of the four discriminator networks
Figure BDA0002700901950000072
are defined in the same way. The only difference is that the local discriminator provides the training loss gradient only for the missing region, whereas the global discriminator and the multi-scale discriminators back-propagate the loss gradient over the entire image at different resolutions. The discriminator losses are defined as:
Figure BDA0002700901950000073
where the input to the local discriminator
Figure BDA0002700901950000074
is the repaired part of the image output by the generator
Figure BDA0002700901950000075
and the corresponding part of the real image.
Figure BDA0002700901950000076
where the input to the global discriminator
Figure BDA0002700901950000077
is the image output by the generator
Figure BDA0002700901950000078
and the real image.
Figure BDA0002700901950000079
where the input to the first multi-scale discriminator
Figure BDA00027009019500000710
is the image output by the generator
Figure BDA00027009019500000711
and the real image, each down-sampled by a factor of 2.
Figure BDA00027009019500000712
where the input to the second multi-scale discriminator
Figure BDA00027009019500000713
is the image output by the generator
Figure BDA00027009019500000714
and the real image, each down-sampled by a factor of 4.
The overall loss function for the entire network optimization is defined as:
Figure BDA00027009019500000715
λ1, λ2, λ3 and λ4 are the weights of the different losses, used to balance their influence on the overall loss function; their specific values must be set manually in the experiments.
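The overall loss formula is only available as an image placeholder here. A plausible reading, stated as an assumption, is a weighted sum of the reconstruction loss and the four adversarial losses (local, global, and the two multi-scale ones), sketched below. The additive form and the function name are assumptions; the default weight 0.001 is the value the text gives for training.

```python
def total_loss(reconstruction, adversarial, weights=(0.001, 0.001, 0.001, 0.001)):
    """Assumed overall objective: L_r + sum_i lambda_i * L_adv_i.

    `adversarial` holds the local, global and two multi-scale adversarial
    losses, in that order; `weights` holds lambda_1 .. lambda_4.
    """
    if len(adversarial) != len(weights):
        raise ValueError("need one weight per adversarial loss")
    return reconstruction + sum(w * a for w, a in zip(weights, adversarial))


if __name__ == "__main__":
    print(total_loss(0.5, [1.0, 2.0, 3.0, 4.0]))
```

With small weights such as 0.001, the reconstruction term dominates the objective and the adversarial terms act as a correction that sharpens the result.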
Further, in step two, the training process is divided into three stages. First, the generator network
Figure BDA00027009019500000716
is trained with the reconstruction loss alone, so that the generator learns to produce blurry repaired content; this stage involves no adversarial training and no adversarial loss. Second, using the generator network trained in the first stage, all the discriminator networks
Figure BDA00027009019500000717
are trained, each being updated with its adversarial loss. The last stage trains the generator and all the discriminators jointly and adversarially. Each stage prepares the ground for the next, which greatly improves the effectiveness and efficiency of network training. Training is carried out by back-propagation.
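The three-stage schedule can be summarized as a small function that decides which networks are updated at a given iteration. The stage lengths are hypothetical parameters; the text does not say how long each stage lasts.

```python
def networks_to_update(iteration, generator_only_steps, discriminator_only_steps):
    """Return which networks are trained at this iteration.

    Stage 1: generator alone, reconstruction loss only.
    Stage 2: all discriminators, generator frozen.
    Stage 3: joint adversarial training of generator and discriminators.
    The two step counts are hypothetical and must be chosen by the user.
    """
    if iteration < generator_only_steps:
        return ("generator",)
    if iteration < generator_only_steps + discriminator_only_steps:
        return ("discriminators",)
    return ("generator", "discriminators")


if __name__ == "__main__":
    for step in (0, 10, 15):
        print(step, networks_to_update(step, 10, 5))
```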
When training with the adversarial losses, default hyper-parameters are used: λ1, λ2, λ3 and λ4 are all set to 0.001. Images are resized and cropped to 256 × 256 for use as input. The missing region is a central square whose input values are set to 0; it covers approximately 1/4 of the image. The input to the global discriminator is the full 256 × 256 image, the input to the local discriminator is the 128 × 128 repaired region, and the inputs to the two multi-scale discriminators are the full image at 128 × 128 and 64 × 64, respectively.
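The masking and cropping described above can be sketched as follows: a central 128 × 128 square of a 256 × 256 image is zeroed to form the generator input, and the same square is cropped out as the local discriminator input. The helper names and the list-of-rows image representation are assumptions for illustration.

```python
def center_box(size=256, hole=128):
    """Start and end index of the central square along one axis."""
    start = (size - hole) // 2
    return start, start + hole


def mask_center(image, hole=128):
    """Return a copy of a square image with the central region set to 0."""
    size = len(image)
    lo, hi = center_box(size, hole)
    return [[0 if lo <= y < hi and lo <= x < hi else image[y][x]
             for x in range(size)]
            for y in range(size)]


def local_patch(image, hole=128):
    """Crop the central region, i.e. the local discriminator input."""
    lo, hi = center_box(len(image), hole)
    return [row[lo:hi] for row in image[lo:hi]]


if __name__ == "__main__":
    image = [[1] * 256 for _ in range(256)]
    masked = mask_center(image)
    patch = local_patch(image)
    missing = sum(row.count(0) for row in masked) / (256 * 256)
    print(center_box(), len(patch), missing)  # (64, 192) 128 0.25
```

A 128 × 128 hole in a 256 × 256 image removes exactly one quarter of the pixels, which matches the coverage stated in the text.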
Another object of the present invention is to provide an image restoration system that implements the image restoration method based on the multi-scale generative adversarial network model, the system comprising:
the deep generative adversarial inpainting model building module, which builds a deep generative adversarial inpainting model consisting of a generator and an adversarial discriminator and synthesizes the missing content from random noise using a reconstruction loss and an adversarial loss;
the image restoration module, which improves the network structure of the discriminator by providing a multi-scale discriminator structure on the basis of the global discriminator and the local discriminator, and repairs the image by adversarially training the multi-scale discriminators on images of different resolutions;
the image post-processing module, which uses dilated convolution in the generator and post-processes the repaired image with Poisson blending;
and the image restoration module, which verifies the advantages of the image restoration algorithm based on the generative adversarial network model, and its restoration quality, on the CelebA, ImageNet and Places2 datasets.
It is a further object of the invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:
step one, constructing a deep generative adversarial inpainting model consisting of a generator and an adversarial discriminator, and synthesizing the missing content from random noise using a reconstruction loss and an adversarial loss;
step two, improving the network structure of the discriminator by providing a multi-scale discriminator structure on the basis of the global discriminator and the local discriminator, and adversarially training the multi-scale discriminators on images of different resolutions to repair the image;
step three, using dilated convolution in the generator, and post-processing the repaired image with Poisson blending;
and step four, verifying the advantages of the image restoration algorithm based on the generative adversarial network model, and its restoration quality, on the CelebA, ImageNet and Places2 datasets.
It is another object of the present invention to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
step one, constructing a deep generative adversarial inpainting model consisting of a generator and an adversarial discriminator, and synthesizing the missing content from random noise using a reconstruction loss and an adversarial loss;
step two, improving the network structure of the discriminator by providing a multi-scale discriminator structure on the basis of the global discriminator and the local discriminator, and adversarially training the multi-scale discriminators on images of different resolutions to repair the image;
step three, using dilated convolution in the generator, and post-processing the repaired image with Poisson blending;
and step four, verifying the advantages of the image restoration algorithm based on the generative adversarial network model, and its restoration quality, on the CelebA, ImageNet and Places2 datasets.
Combining all of the above technical solutions, the advantages and positive effects of the invention are as follows. The invention provides an image restoration method based on a multi-scale generative adversarial network model: a multi-scale generative adversarial inpainting model consisting of a generator and several adversarial discriminators, which synthesizes the missing content from random noise using a reconstruction loss and multiple adversarial losses. Following the idea of WGAN, the EM distance is used to model the data distribution, which improves network stability and restoration quality. Verification on the CelebA dataset, using subjective and objective evaluation, shows that the proposed algorithm achieves higher restoration performance than current image restoration methods. Training and testing on the ImageNet and Places2 datasets show that the algorithm can be applied to many types of pictures with good results. The method is of significance in fields such as facial restoration for public security criminal investigation, image scaling, removal of unwanted objects, lossy image compression, and biomedical imaging.
Comparison of technical and experimental effects:
The first index is the peak signal-to-noise ratio (PSNR), an objective image quality measure that quantifies the pixel-level difference between the real image and the repaired image; a larger value indicates less distortion.
Quantitative experimental results on PSNR
Figure BDA0002700901950000101
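For reference, PSNR can be computed from the mean squared error between the two images. The sketch below uses the standard definition and assumes 8-bit images (peak value 255) stored as lists of rows; the function name is illustrative and is not taken from the patent.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_errors = [(a - b) ** 2
                      for row_a, row_b in zip(reference, restored)
                      for a, b in zip(row_a, row_b)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    real = [[100, 100], [100, 100]]
    repaired = [[101, 101], [101, 101]]
    print(round(psnr(real, repaired), 2))  # 48.13
```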
The second index is the structural similarity index (SSIM), which evaluates the structural similarity between two images. Its value lies between 0 and 1, and a larger value means a smaller difference between the repaired image and the real image.
Quantitative experimental results on SSIM
Figure BDA0002700901950000102
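A simplified way to see what SSIM measures is to evaluate its formula once over the whole image. The usual SSIM averages the same expression over local windows, so the single-window version below is a simplification made for illustration. The constants follow the common choice C1 = (0.01·L)² and C2 = (0.03·L)² with L = 255 assumed for 8-bit images.

```python
def global_ssim(image_a, image_b, max_value=255.0):
    """Single-window SSIM over whole images (simplified illustration)."""
    a = [v for row in image_a for v in row]
    b = [v for row in image_b for v in row]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mean_a) ** 2 for v in a) / n
    var_b = sum((v - mean_b) ** 2 for v in b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n
    c1 = (0.01 * max_value) ** 2  # stabilizes the luminance term
    c2 = (0.03 * max_value) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mean_a * mean_b + c1) * (2 * cov + c2)
    denominator = (mean_a ** 2 + mean_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator


if __name__ == "__main__":
    image = [[10, 200], [60, 120]]
    print(global_ssim(image, image))
```

Identical images give a score of 1, and the score drops as the means, contrasts, or pixel-wise correlation of the two images diverge.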
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained from the drawings without creative efforts.
Fig. 1 is a flowchart of an image inpainting method based on a multi-scale generative confrontation network model according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a generative confrontation network model according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of a multi-scale discriminator model according to an embodiment of the invention.
Fig. 4 is a schematic diagram of a network architecture according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a comparison of repair results for different models provided by an embodiment of the present invention;
in the figure: fig. (a) is an original image; graph (b) is a missing image; panel (c) is the CE result; panel (d) GLCIC results; graph (e) is the result of the algorithm provided by the present invention.
Fig. 6 is a schematic diagram of a repair result on the ImageNet dataset according to the embodiment of the present invention.
Fig. 7 is a schematic diagram of a repair result on the Places2 dataset according to an embodiment of the present invention.
Fig. 8 is a schematic diagram of the repair results for two different pictures according to an embodiment of the present invention;
in the figure: (a) is the original picture; (b) is the missing picture; (c) is the repaired picture.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Aiming at the problems in the prior art, the invention provides an image restoration method based on a multi-scale generation type confrontation network model, and the invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, an image repairing method based on a multi-scale generation type confrontation network model provided by an embodiment of the present invention includes the following steps:
S101: construct a deep generative adversarial repair model consisting of a generator and an adversarial discriminator, and synthesize the missing content from random noise using a reconstruction loss and an adversarial loss.
S102: improve the network structure of the discriminator by proposing a multi-scale discriminator structure on the basis of a global discriminator and a local discriminator, and repair images through adversarial training of the multi-scale discriminators on images of different resolutions.
S103: use dilated convolution in the generator, and post-process the repaired image with the Poisson blending method.
S104: verify the advantages of the image repair algorithm based on the generative confrontation network model and its repair effect on the CelebA, ImageNet and Places2 datasets.
The present invention will be further described with reference to the following examples.
1. Summary of the invention
First, a deep generative adversarial repair model composed of a generator and an adversarial discriminator is proposed, and missing content is synthesized from random noise using a reconstruction loss and an adversarial loss. Second, a multi-scale discriminator structure is proposed, and images are repaired through adversarial training on images of different resolutions. Third, the generator uses dilated convolution to reduce the information lost when the image is down-sampled, and the repaired image is post-processed with the widely used Poisson blending method. Finally, the advantages of the algorithm and its repair effect are demonstrated through experiments.
2. Related work
The generative adversarial network (GAN; also translated as generative confrontation network) proposed by Goodfellow et al. in 2014 is a milestone in the development of deep learning. GANs address the blurriness of pictures generated by a traditional VAE, achieve striking results, and can in theory generate a large number of sharp pictures.
The main inspiration for the GAN comes from the zero-sum game in game theory. The whole network consists of two mutually adversarial networks, the generation network G (generator) and the discrimination network D (discriminator), as shown in fig. 2. G and D play against each other continuously so that G learns the distribution of the real data; if the adversarial network is used to generate images, G can generate realistic images from noise after sufficient training. The main functions of G and D are as follows. G is a generative network that takes random noise z (random numbers) as input and generates from it a fake picture G(z) intended to deceive D. D is a discrimination network that judges whether a picture is "real". Its input is a picture, which may be a real picture from the dataset or a picture generated by G, and its output is the probability that the input picture is real: an output of 1 means that D judges the picture to be real, and an output of 0 means that D judges that the picture cannot be real (i.e. it was generated by G). During training, the goal of the generation network G is to generate fake images that are as realistic as possible in order to deceive the discrimination network D, while the goal of D is to distinguish the fake images generated by G from the real images as well as possible. The training of G and D thus forms a dynamic "game process" that finally reaches an equilibrium state, the Nash equilibrium. In the ideal outcome of the game, G generates pictures that are realistic enough that D can hardly determine whether a picture generated by G is real, i.e. its output probability is 0.5. This yields a generative network model G that can be used to generate pictures.
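The equilibrium output of 0.5 mentioned above can be illustrated with a small sketch. It relies on the well-known result for the original GAN objective that, for a fixed generator, the best discriminator outputs p_data(x) / (p_data(x) + p_g(x)); the function name and the density values below are hypothetical and chosen only for illustration.

```python
def optimal_discriminator(p_data: float, p_gen: float) -> float:
    """Best discriminator output at a point x for a fixed generator.

    p_data and p_gen are the (assumed known) densities of the real and
    generated distributions at x.
    """
    total = p_data + p_gen
    if total <= 0.0:
        raise ValueError("at least one density must be positive")
    return p_data / total


# Early in training the generated density differs from the data density,
# so the discriminator can be confident.
early = optimal_discriminator(p_data=0.9, p_gen=0.1)

# At equilibrium both densities coincide and the discriminator can only guess.
equilibrium = optimal_discriminator(p_data=0.4, p_gen=0.4)
```

When the two densities are equal the returned value is exactly one half, which matches the statement that D can no longer tell generated pictures from real ones.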
One of the main problems of the GAN is instability during learning, for example failure of the network to converge and a tendency towards vanishing and exploding gradients, and a great deal of research has addressed this problem. The Wasserstein GAN (WGAN) proposed by Arjovsky et al. improves the GAN from the perspective of the loss function; with the improved loss function, WGAN performs well even with fully connected layers, and the problem of unstable training is alleviated. Gulrajani et al. improved on the Wasserstein GAN by optimizing the continuity constraint, which addresses vanishing and exploding gradients during training and accelerates convergence. The LSGAN model proposed by Mao et al. uses a least-squares loss function in place of the GAN loss function, and also alleviates the problems of unstable GAN training, poor quality of generated images and insufficient diversity.
With the rapid development of GANs in recent years, the demands on the resolution of GAN-generated pictures have become higher and higher. Another problem with GANs is that the network down-samples images during pooling to extract low-dimensional features, so that much key information in the images is lost; the discriminator then finds it easier to tell whether an image is real or fake, and the gradients cannot indicate the correct optimization direction. How to make effective use of the features extracted by each layer of the neural network, fully extracting the low-dimensional features of the image while minimizing the loss caused by down-sampling, is a current research focus. In 2016 Yu et al. proposed dilated convolution, which enlarges the receptive field while keeping the feature size unchanged during convolution, effectively reduces the information loss caused by down-sampling in conventional convolution, and has been used for image processing. The "pix2pixHD" model proposed by Wang et al. uses conditional generative adversarial networks (conditional GANs) to synthesize high-resolution realistic images, and uses a multi-scale generator-discriminator structure to improve picture quality and resolution while keeping training stable. Fig. 3 is a schematic diagram of the multi-scale discriminator model: the discriminators have the same network structure but operate at different picture scales, and are referred to as D1, D2 and D3. Specifically, the real and synthesized high-resolution images are each down-sampled, and D1, D2 and D3 are then trained to distinguish real images from synthesized images at three different scales.
The work of the invention builds on the "Context Encoder" method proposed by Pathak et al. and the "Globally and Locally Consistent Image Completion" method proposed by Iizuka et al. The original purpose of the GAN is to train generative models using convolutional neural networks. These generators are trained with the help of a discriminator, which distinguishes whether an image was produced by the generator or is real. The generator is trained to fool the discriminator while the discriminator is updated in parallel. By combining a mean square error (MSE) loss with the GAN loss, an image repair network can be trained that avoids the blur that is common when the MSE loss is used alone. Using only this approach, however, can make network training unstable. The present invention avoids this problem by replacing the traditional GAN loss with the WGAN loss, which uses the EM distance to measure the difference between data distributions, by not training a purely generative model, and by tuning the learning process to prioritize stability. In addition, the architecture and the training process are heavily optimized specifically for the image repair problem. In particular, instead of a single discriminator, multiple discriminators are used, including a multi-scale discriminator similar to that in the "pix2pixHD" model, to improve visual quality.
3. Multi-scale countermeasure network model
In this section, the present invention introduces a multi-scale adversarial network model comprising a generation network for image repair and four additional discriminator networks that assist training, namely two multi-scale discriminator networks, a global discriminator network and a local discriminator network, so that the whole network can be trained to perform the image repair task well. During training, the discriminators are trained to determine whether or not an image has been repaired, while the generator is trained to fool all discriminators. Only by training all the networks together can the generator realistically repair a wide variety of images. The network architecture is shown in fig. 4.
3.1 generators
A convolutional autoencoder is used as the generator model
Figure BDA0002700901950000141
that is, a standard encoder-decoder architecture. The encoder takes an image with missing regions as input and generates a latent feature representation of the image through convolution operations. The decoder uses this latent feature representation to restore the original resolution through transposed convolution operations, producing the image content of the missing region. Unlike the original GAN model, which starts directly from a noise vector, the hidden representation obtained from the encoder captures more of the variations and relationships between the unknown and known regions, and is then fed to the decoder to generate the content. The intermediate layers use dilated convolution, which allows each output pixel to be computed from a larger input area without additional parameters or computation; compared with a standard convolutional layer, the dilated convolutional network computes each output pixel under the influence of a larger pixel area of the input image. Without dilated convolution, only a small pixel area is used and less context information is available for synthesizing the image. The generator uses a standard autoencoder network with dilated convolutional layers added; that is, it is the generator network introduced in the existing literature with two of the middle convolutional layers removed. The network architecture is shown in Table 1, whose columns, from left to right, give the network layer type, the convolution kernel size, the number of zero paddings of the convolution kernel, the stride and the number of output channels of the layer.
Table 1 Generator network architecture
Figure BDA0002700901950000151
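The claim in section 3.1 that dilated convolution enlarges the input area seen by each output pixel without adding parameters can be checked with a short receptive-field calculation. This is a minimal sketch: the layer lists below are hypothetical and are not taken from Table 1, whose contents are not reproduced in this text.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, along one axis) of a conv stack.

    layers is a sequence of (kernel_size, dilation, stride) tuples,
    listed from the input side to the output side.
    """
    field, jump = 1, 1
    for kernel, dilation, stride in layers:
        # A dilated kernel spans (kernel - 1) * dilation + 1 positions.
        field += (kernel - 1) * dilation * jump
        jump *= stride
    return field


# Four ordinary 3x3 convolutions (hypothetical stack).
standard_stack = [(3, 1, 1)] * 4

# Four 3x3 convolutions with growing dilation (hypothetical rates).
dilated_stack = [(3, 2, 1), (3, 4, 1), (3, 8, 1), (3, 16, 1)]

standard_field = receptive_field(standard_stack)
dilated_field = receptive_field(dilated_stack)
```

Both stacks use the same number of 3 × 3 kernels and therefore the same number of weights, yet under these assumed dilation rates the dilated stack sees a 61-pixel-wide input window instead of a 9-pixel-wide one.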
3.2 discriminator
By training the generator, the pixels of the missing region can be filled in with a small reconstruction loss. Using the generator alone, however, does not ensure that the filled area is visually consistent: the pixels generated for the missing region are very blurry, and only the rough shape of the missing content is captured. To obtain more realistic results, a global discriminator, a local discriminator and multi-scale discriminators are added as binary classifiers that distinguish whether an image is real or repaired. The discriminators help the network improve the quality of the repair result, since a trained discriminator is not fooled by unrealistic images. Based on convolutional neural networks, these discriminators compress the image into small feature vectors, and the prediction corresponds to the probability that the image is real.
First, a local discriminator
Figure BDA0002700901950000161
determines whether the synthesized content of the missing region is real, which helps the network generate the missing content and encourages the generated objects to be semantically valid. Because of its locality, however, the limitations of the local discriminator are also apparent. The local discriminator loss can neither regularize the global structure of a face nor ensure consistency across the inner and outer edges of the missing region. As a result, the pixel values of the repaired picture are noticeably inconsistent along the boundary of the repaired area.
Due to the limitation of local discriminators, another network structure named global discriminator is introduced
Figure BDA0002700901950000162
to judge the authenticity of the image as a whole. The basic idea is that the content generated for the repaired area should be not only realistic but also consistent with its context. A network with the global discriminator greatly alleviates the inconsistency problem, further improves the quality of the repaired picture and makes it look more real.
Finally, a multi-scale discriminator network structure is proposed. The basic idea is to down-sample the real and synthesized images by factors of 2 and 4, respectively, and to train two discriminators
Figure BDA0002700901950000163
to distinguish the real image from the restored image at two different scales. The process by which the generator repairs an image is strictly controlled by two discriminator networks that take images of different resolutions as input; the two multi-scale discriminators have an architecture similar to that of the global discriminator but different receptive fields. Compared with using the global discriminator alone, training with the combined multi-scale discriminators guides the generator towards repaired pictures with stronger global consistency and finer details, so that the repair of the whole picture looks more plausible. Adding the two multi-scale discriminators to the network therefore yields better repaired pictures.
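The down-sampling by factors of 2 and 4 that feeds the two multi-scale discriminators can be sketched as follows. The text does not say which down-sampling operator is used; average pooling on a single-channel image stored as nested lists is assumed here purely for illustration, and the function names are hypothetical.

```python
def downsample(image, factor):
    """Average-pool a 2-D list of pixel values by an integer factor."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the factor")
    pooled = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [image[top + dy][left + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def discriminator_inputs(image):
    """Full-resolution image plus the two down-sampled copies."""
    return {1: image, 2: downsample(image, 2), 4: downsample(image, 4)}


# Toy 8x8 image; a 256x256 input would give 128x128 and 64x64 copies.
toy = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
scales = discriminator_inputs(toy)
```

The same function would be applied to both the real image and the generator output, so that each multi-scale discriminator compares the two at its own resolution.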
The last two fully connected layers are removed from the global discriminator and the local discriminator in the model, and the other structures remain unchanged. The network architectures of the global discriminator, the local discriminator and the multi-scale discriminators are shown in Table 2, whose columns, from left to right, give the network layer type, the convolution kernel size, the stride and the number of output channels of the layer. a, b, c and d are, respectively,
Figure BDA0002700901950000164
TABLE 2 Multi-Scale discriminator architecture
Figure BDA0002700901950000171
3.3 loss function
There are generally many plausible ways to fill in a missing image region that are consistent with the context. This behavior can be modeled, for example, by a loss function. A reconstruction loss is therefore first introduced for the generator,
Figure BDA0002700901950000172
which is responsible for capturing the structural information of the missing region and keeping it consistent with the context. It is the L2 distance between the pixels of the restored image and those of the original image, where z is the noise mask:
Figure BDA0002700901950000173
However, when only the loss
Figure BDA0002700901950000174
is used, the restored image content is observed to be blurry and overly smooth. The reason is that the L2 loss penalizes outliers heavily, which encourages the network to average over the various hypotheses to avoid large penalties. By using a discriminator, an adversarial loss is introduced, which reflects how well the generator fools the discriminator and how well the discriminator distinguishes real from fake. The adversarial loss is based on the GAN loss. To learn a generative model of the data distribution, the GAN learns an adversarial discriminator model
Figure BDA0002700901950000181
which provides the loss gradient for the generator model. The adversarial discriminator
Figure BDA0002700901950000182
makes predictions on both the samples generated by the generator
Figure BDA0002700901950000183
and the real samples and tries to tell them apart, while the generator
Figure BDA0002700901950000184
generates samples that are as "real" as possible in order to confuse the discriminator
Figure BDA0002700901950000185
Figure BDA0002700901950000186
where P_data(x) and P_z(z) denote the distributions of the real data x and the noise variable z, respectively. The network is optimized by minimizing the generator loss and maximizing the discriminator loss.
The cross entropy (JS divergence) used in the traditional GAN is not suitable for measuring the distance between the generated data distribution and the real data distribution: training a GAN by optimizing the JS divergence can fail to find a correct optimization target. WGAN therefore proposes to train the GAN with the Wasserstein distance (also called the Earth-Mover distance) as the optimization objective. Specifically, in
Figure BDA0002700901950000187
the sigmoid of the last layer is removed; the loss functions of
Figure BDA0002700901950000188
and
Figure BDA0002700901950000189
do not take the logarithm; and each time the parameters of
Figure BDA00027009019500001810
are updated, their absolute values are clipped so that they do not exceed a fixed constant, i.e. weight clipping. The algorithm of the present invention does not use the traditional objective function of the GAN but uses this approach:
Figure BDA00027009019500001811
where
Figure BDA00027009019500001816
is the set of 1-Lipschitz functions.
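The three WGAN changes listed above (no sigmoid on the last layer, no logarithm in the losses, and clipping of the discriminator parameters after each update) can be sketched in plain Python. This is a simplified illustration operating on lists of raw discriminator scores and weights rather than on a real network; the function names and the clipping constant 0.01 are assumptions, not values given in this text.

```python
def critic_loss(real_scores, fake_scores):
    """Loss minimized by the discriminator: mean D(fake) - mean D(real).

    Scores are raw outputs (no sigmoid) and no logarithm is taken.
    """
    mean_real = sum(real_scores) / len(real_scores)
    mean_fake = sum(fake_scores) / len(fake_scores)
    return mean_fake - mean_real


def generator_loss(fake_scores):
    """Loss minimized by the generator: -mean D(fake)."""
    return -sum(fake_scores) / len(fake_scores)


def clip_weights(weights, limit=0.01):
    """Clip every parameter to [-limit, limit] after a discriminator update."""
    return [max(-limit, min(limit, w)) for w in weights]


# Toy scores and weights, for illustration only.
d_loss = critic_loss(real_scores=[1.0, 3.0], fake_scores=[0.0, 1.0])
g_loss = generator_loss(fake_scores=[0.0, 1.0])
clipped = clip_weights([-0.5, 0.005, 0.2])
```

Clipping keeps the discriminator within a bounded family of functions, which is the crude way the original WGAN enforces the Lipschitz condition that the objective requires.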
The four discriminator networks
Figure BDA00027009019500001812
share the same definition of the loss function. The only difference is that the local discriminator provides the training loss gradient only for the missing region, whereas the global discriminator and the multi-scale discriminators back-propagate the loss gradient over the entire image at different resolutions. The discriminator losses are defined as:
Figure BDA00027009019500001813
where the local discriminator
Figure BDA00027009019500001814
takes as input the repaired part of the image output by the generator
Figure BDA00027009019500001815
together with the corresponding part of the real image.
Figure BDA0002700901950000191
where the global discriminator
Figure BDA0002700901950000192
takes as input the image output by the generator
Figure BDA0002700901950000193
and the real image.
Figure BDA0002700901950000194
where the multi-scale discriminator
Figure BDA0002700901950000195
takes as input the image output by the generator
Figure BDA0002700901950000196
and the real image, each down-sampled by a factor of 2.
Figure BDA0002700901950000197
where the multi-scale discriminator
Figure BDA0002700901950000198
takes as input the image output by the generator
Figure BDA0002700901950000199
and the real image, each down-sampled by a factor of 4.
In summary, the total loss function for the whole network optimization is defined as:
Figure BDA00027009019500001910
λ1, λ2, λ3 and λ4 are the weights of the different losses, used to balance their influence on the overall loss function; their specific values need to be set manually during the experiments.
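How the weights balance the individual terms can be sketched as a weighted sum. The exact form of the total loss is given only in the equation image above, so the form used here (reconstruction loss plus four weighted adversarial losses) is an assumption, as are the function name and the toy values; the default weight 0.001 is the value stated in the training section.

```python
def total_loss(reconstruction, adversarial, weights=(0.001, 0.001, 0.001, 0.001)):
    """Assumed total loss: L_rec + sum_i lambda_i * L_adv_i.

    adversarial holds the four adversarial losses (local, global and
    the two multi-scale discriminators), in the same order as weights.
    """
    if len(adversarial) != len(weights):
        raise ValueError("one weight is needed per adversarial loss")
    weighted = sum(w * loss for w, loss in zip(weights, adversarial))
    return reconstruction + weighted


# Toy loss values, for illustration only.
loss = total_loss(reconstruction=0.5, adversarial=[1.0, 2.0, 3.0, 4.0])
```

With weights this small the reconstruction term dominates the sum, and the adversarial terms act as a comparatively gentle correction.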
4. Training
The invention is implemented with deep convolutional adversarial neural networks. To train the network effectively, the training process is divided into three stages. First, the generator network
Figure BDA00027009019500001911
is trained with the reconstruction loss so that the generator produces blurry repaired content; this stage involves neither adversarial training nor the adversarial loss. Second, using the generator network trained in the first stage, all the discriminator networks
Figure BDA00027009019500001912
are trained, i.e. all discriminators are updated with the adversarial loss. The last stage trains the generator and all discriminators jointly and adversarially. Each stage prepares for the next, which greatly improves the effectiveness and efficiency of network training; training is carried out by back-propagation.
Figure BDA0002700901950000201
During training with the adversarial loss, care is taken that the discriminator does not become too strong at the beginning of the training process. Default hyper-parameters (e.g., the learning rate) are used, and λ1, λ2, λ3 and λ4 are all set to 0.001. For training, the images are resized and cropped to 256 × 256 and used as input images. For the missing region, the central square region of the input image is set to 0; this is the missing part of the image and covers approximately 1/4 of the image. The input of the global discriminator is the full 256 × 256 image, the input of the local discriminator is the 128 × 128 repaired region, and the inputs of the two multi-scale discriminators are the full images at 128 × 128 and 64 × 64, respectively. The network model can fill in missing regions plausibly, but the generated region is sometimes inconsistent in color with its surroundings. To avoid this, simple post-processing blends the repaired region with the colors of the surrounding pixels. Specifically, the present invention uses Poisson image blending to post-process the images.
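The central missing region described above can be sketched as a mask. A single-channel image stored as nested lists is assumed for simplicity, and the function names are hypothetical; the sizes 256 and 128 are those given in the text.

```python
def center_mask(size=256, hole=128):
    """Square mask of the given size: 1 inside the central hole, else 0."""
    start = (size - hole) // 2
    stop = start + hole
    return [[1 if start <= r < stop and start <= c < stop else 0
             for c in range(size)]
            for r in range(size)]


def apply_mask(image, mask):
    """Set every masked pixel of a single-channel image to 0."""
    return [[0 if m else p for p, m in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]


def missing_fraction(mask):
    """Fraction of pixels that the mask marks as missing."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total


mask = center_mask()
fraction = missing_fraction(mask)
```

A 128 × 128 hole in a 256 × 256 image removes exactly one quarter of the pixels, which agrees with the stated coverage of about 1/4, and the hole matches the 128 × 128 input size of the local discriminator.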
5. Results and analysis of the experiments
The present invention trains the multi-scale generative adversarial network model on 100000 images taken from the CelebA dataset, which includes a wide variety of face images: 80000 images are used for training and 20000 for testing. The batch size is set to 32. The generator network is first trained for 20000 iterations; the discriminators are then trained for 10000 iterations; finally, the whole network is trained jointly for 70000 iterations. The hardware is: CPU Intel i7-8700; GPU RTX2080Ti-11G; memory DDR4-3000-32G. The code runs under the PyTorch deep learning framework, and training the whole network takes about 5 days.
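The three-stage schedule with the iteration counts given above can be expressed as a small lookup. This is only a sketch of the bookkeeping; the stage names and the function are hypothetical, and the actual update steps of each stage are not shown.

```python
# (stage name, number of iterations), in training order.
STAGES = (
    ("generator_only", 20000),       # reconstruction loss only
    ("discriminators_only", 10000),  # adversarial loss, generator frozen
    ("joint", 70000),                # generator and all discriminators
)


def stage_for(iteration):
    """Return the name of the training stage for a 0-based iteration."""
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    end = 0
    for name, length in STAGES:
        end += length
        if iteration < end:
            return name
    raise ValueError("iteration lies beyond the training schedule")


total_iterations = sum(length for _, length in STAGES)
```

A training loop would call `stage_for` once per iteration to decide which networks to update and which losses to apply.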
The experimental results were compared with those of the "Context Encoders" method, which uses only one discriminator acting on the repaired area, and the "Globally and Locally Consistent Image Completion" method, which uses a generator and two discriminators. For a fair comparison, the models were retrained for the same number of iterations; the results are shown in fig. 5.
In each test image, the network automatically masks the region in the middle of the image, since the important components of a face (e.g., eyes, mouth, eyebrows, hair, nose) are usually located there. The four rows show the repair results for four different test images. The first column a contains the four original, complete images. The second column b contains the missing images with the mask applied. The third column c shows the repair results of the "Context Encoders" network; because this structure lacks an understanding of global consistency, its results show obvious global inconsistency and the repaired missing region is very blurry, so it cannot meet the requirements of the image repair task.
The fourth column d shows the repair results of the "Globally and Locally Consistent Image Completion" method, which uses a global discriminator and a local discriminator. By introducing the adversarial loss, the network repairs the image more plausibly: the local discriminator acts on the missing region so that the missing part is repaired successfully, and the global discriminator acts on the whole image, penalizing global inconsistency in the repaired image and forcing the network to generate a globally consistent image, which removes the obvious edge differences and gives a better repair result. The fifth column e shows the repair results of the algorithm proposed by the present invention, which uses the WGAN loss to make the training of the whole adversarial network more stable and adds multi-scale discriminators that are trained together with the global and local discriminators. Compared with d, e shows some improvement in the details of the repair, with higher image integrity and a better repair effect.
Besides the visual effect, the invention also carries out quantitative evaluation on the CelebA test data set by using the PSNR and the SSIM, and the two indexes are calculated between the repair result obtained by different methods and the original face image.
The first index is peak signal-to-noise ratio (PSNR), an objective criterion for evaluating images, which directly measures the difference in pixel values, with larger values indicating less distortion. Assuming that the two input images are X and Y, respectively, the calculation formula is as follows:
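$$\mathrm{MSE} = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} \bigl( X(i,j) - Y(i,j) \bigr)^2$$
$$\mathrm{PSNR} = 10 \log_{10} \frac{(2^n - 1)^2}{\mathrm{MSE}}$$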
where MSE denotes the mean square error between the restored image X and the real image Y, H and W denote the height and width of the image, respectively, and n is the number of bits per pixel, generally 8, i.e. 256 gray levels per pixel. The results are shown in Table 3.
TABLE 3 results of quantitative experiments on PSNR
Figure BDA0002700901950000223
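The PSNR computation defined above can be sketched directly from its definition: the mean square error over all H × W pixels, followed by the logarithmic ratio against the peak value 2^n - 1. Single-channel images stored as nested lists are assumed, and the function names are illustrative only.

```python
import math


def mean_square_error(x, y):
    """MSE between two equally sized single-channel images."""
    height, width = len(x), len(x[0])
    squared = sum((x[i][j] - y[i][j]) ** 2
                  for i in range(height) for j in range(width))
    return squared / (height * width)


def psnr(x, y, bits=8):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mean_square_error(x, y)
    if error == 0:
        return math.inf
    peak = 2 ** bits - 1
    return 10 * math.log10(peak ** 2 / error)


# Toy 2x2 images, for illustration only.
original = [[10, 20], [30, 40]]
repaired = [[12, 20], [30, 40]]
score = psnr(repaired, original)
```

Because the error appears in the denominator, a smaller pixel difference gives a larger PSNR, which is why larger values indicate less distortion.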
The second index is a Structural Similarity Index (SSIM), which is an index for measuring the similarity between two images, and is a number between 0 and 1, and a larger value represents a smaller difference between a repaired image and a real image, i.e., the image quality is better. When the two images are identical, their value is 1. Assuming that the two input images are X and Y, respectively, the calculation formula is as follows:
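$$\mathrm{SSIM}(X, Y) = \frac{(2 \mu_X \mu_Y + c_1)(2 \sigma_{XY} + c_2)}{(\mu_X^2 + \mu_Y^2 + c_1)(\sigma_X^2 + \sigma_Y^2 + c_2)}$$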
where μ_X and μ_Y denote the means of X and Y, σ_X and σ_Y denote the standard deviations of X and Y, σ_XY denotes the covariance of X and Y, and c_1 and c_2 are constants that prevent the denominator from being 0. The calculation results are shown in Table 4.
TABLE 4 results of quantitative experiments on SSIM
Figure BDA0002700901950000231
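The SSIM quantities described above (means, standard deviations, covariance and the two stabilizing constants) can be sketched as follows. This is a simplified, global version computed once over the whole image, whereas practical implementations usually average the index over local windows; the constants c1 = (0.01 L)^2 and c2 = (0.03 L)^2 with L = 2^bits - 1 follow a common convention and are an assumption here, since the text does not give their values.

```python
def ssim_global(x, y, bits=8, k1=0.01, k2=0.03):
    """Single-window SSIM between two single-channel images."""
    xs = [p for row in x for p in row]
    ys = [p for row in y for p in row]
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    var_x = sum((p - mu_x) ** 2 for p in xs) / n
    var_y = sum((q - mu_y) ** 2 for q in ys) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(xs, ys)) / n
    peak = 2 ** bits - 1
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2  # keep denominators non-zero
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy 2x2 images, for illustration only.
image = [[0, 255], [255, 0]]
inverted = [[255, 0], [0, 255]]
same = ssim_global(image, image)
different = ssim_global(image, inverted)
```

For identical images the numerator and denominator coincide, so the index equals 1, in line with the statement that the value is 1 when the two images are identical.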
In addition, to show that the proposed algorithm is suitable for various types of image repair, the model of the invention was also trained on 50000 images from the ImageNet dataset and on 50000 images from the Places2 dataset. The training procedure is the same as for the CelebA dataset. The experimental results are shown in fig. 6 and fig. 7, respectively, and show that the model also performs well on the ImageNet and Places2 datasets.
In summary, the invention analyzes the shortcomings of existing algorithms, introduces the principle of the generative adversarial network and applies it to image repair. It provides a multi-scale generative adversarial repair model consisting of a generator and several adversarial discriminators, and synthesizes the missing content from random noise using a reconstruction loss and several adversarial losses. By adopting the idea of WGAN and using the EM distance to measure the difference between data distributions, it improves the stability of the network and the quality of the repaired pictures. Finally, the method is verified on the CelebA dataset, where subjective and objective evaluation show that the image repair algorithm based on the multi-scale generative adversarial network achieves better repair performance than the compared image repair methods. Corresponding training and testing on the ImageNet and Places2 datasets show that the algorithm can be applied to the repair of various types of pictures with good results.
The above description covers only specific embodiments of the present invention and is not to be construed as limiting its scope; all modifications, equivalents and improvements made within the spirit and principles of the invention are intended to fall within the scope of protection defined by the appended claims.

Claims (10)

1. An image restoration method based on a multi-scale generation type confrontation network model, characterized in that the method comprises the following steps:
constructing a deep generative adversarial repair model consisting of a generator and an adversarial discriminator, and synthesizing missing content from random noise using a reconstruction loss and an adversarial loss;
improving the network structure of the discriminator by providing a multi-scale discriminator structure on the basis of a global discriminator and a local discriminator, and repairing images through adversarial training of the multi-scale discriminators on images of different resolutions;
using dilated convolution in the generator, and post-processing the repaired image with the Poisson blending method;
verifying the advantages of the image restoration algorithm based on the generative confrontation network model and its restoration effect on the CelebA, ImageNet and Places2 datasets.
2. The method as claimed in claim 1, wherein the multi-scale generation type confrontation network model comprises a generation network for image restoration, and four additional discriminant networks for training assistance, namely two multi-scale discriminant networks, a global discriminant network and a local discriminant network.
3. The method as claimed in claim 1, wherein the generator uses a convolutional auto-encoder as the generator model
Figure FDA0002700901940000011
namely a standard encoder-decoder structure; the encoder takes an image with a missing region as input and generates a latent feature representation of the image through convolution operations;
the decoder uses the latent feature representation to restore the original resolution through transposed convolution operations, generating the image content of the missing region; unlike the original GAN model, which starts directly from a noise vector, the hidden representation obtained from the encoder captures more of the variations and relationships between the unknown and known regions, and is then input to the decoder to generate the content; the intermediate layers use dilated convolution, which allows each output pixel to be computed from a larger input area without additional parameters or computation, so that, compared with a standard convolutional layer, the dilated convolutional network computes each output pixel under the influence of a larger pixel area of the input image; the generator uses a standard autoencoder network with dilated convolutional layers added, that is, the generator network with two of the middle convolutional layers removed; its architecture lists, from left to right, the network layer type, the convolution kernel size, the number of zero paddings of the convolution kernel, the stride and the number of output channels of the layer.
4. The image inpainting method based on the multi-scale generative adversarial network model as claimed in claim 1, wherein each discriminator is based on a convolutional neural network that compresses the image into a small feature vector and predicts the probability that the image is real;
first, a local discriminator (Figure FDA0002700901940000022) determines whether the synthesized content of the missing region is real, which helps the network generate the details of the missing content and encourages the generated object to be semantically valid;
because the local discriminator only sees a local region, another network structure, named the global discriminator (Figure FDA0002700901940000023), is introduced to determine the fidelity of the image as a whole;
finally, a multi-scale discriminator network structure is provided; the basic idea is to down-sample the real and synthesized images by factors of 2 and 4, respectively, and to train two discriminators (Figure FDA0002700901940000024) that distinguish real images from restored images at the two different scales; these two discriminator networks, which take images of different resolutions as input, strictly control the process by which the generator repairs the image; the two multi-scale discriminators and the global discriminator have similar architectures but receptive fields of different sizes;
the last two fully connected layers are removed from the global discriminator and the local discriminator in the model, and the other structures are kept unchanged; in the layer listing, the type of network layer, the size of the convolution kernel, the stride and the number of output channels of the layer are given in order from left to right; a, b, c and d are respectively
Figure FDA0002700901940000025
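The down-sampling step that feeds the two multi-scale discriminators in claim 4 can be sketched as follows. The claim does not state which down-sampling filter is used, so average pooling is an assumption here, as are the function name `downsample` and the synthetic gradient image that stands in for a real or restored image.

```python
def downsample(image, factor):
    """Average-pool a 2-D grayscale image (a list of rows) by an integer factor."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the factor")
    area = factor * factor
    return [
        [
            sum(image[r + dr][c + dc] for dr in range(factor) for dc in range(factor)) / area
            for c in range(0, width, factor)
        ]
        for r in range(0, height, factor)
    ]


# Synthetic 256 x 256 gradient image (hypothetical test input).
image = [[(r + c) / 510.0 for c in range(256)] for r in range(256)]

half = downsample(image, 2)     # scale seen by the first multi-scale discriminator
quarter = downsample(image, 4)  # scale seen by the second multi-scale discriminator

print(len(half), len(half[0]), len(quarter), len(quarter[0]))  # 128 128 64 64
```

The same function would be applied to both the real image and the generator output, so that each discriminator compares like with like at its own resolution.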
5. The method for image inpainting based on the multi-scale generative adversarial network model as claimed in claim 1, wherein the loss functions are modelled as follows:
first, a reconstruction loss (Figure FDA0002700901940000026) is introduced for the generator; it is responsible for capturing the structural information of the missing region and keeping it consistent with the context, and is the L2 distance between the pixels of the restored image and those of the original image, where z is the noise mask:
Figure FDA0002700901940000021
however, when only the loss (Figure FDA0002700901940000027) is used, the restored image content is observed to be blurred and overly smooth; the reason is that the L2 loss penalizes outliers severely, which encourages the network to average over the various hypotheses in order to avoid large penalties; by using discriminators, an adversarial loss is introduced, which reflects how far the generator can fool the discriminator and how well the discriminator distinguishes real from fake; the adversarial loss is based on the GAN loss, in which an adversarial discriminator model (Figure FDA0002700901940000037) is learned to provide a loss gradient for the generator model; the adversarial discriminator (Figure FDA0002700901940000038) makes predictions on both the samples generated by the generator (Figure FDA0002700901940000039) and the real samples and attempts to tell them apart, while the generator (Figure FDA00027009019400000311) tries to confuse the discriminator (Figure FDA00027009019400000310) by generating samples that look real:
Figure FDA0002700901940000031
where Pdata(x) and Pz(z) denote the distributions of the real data x and the noise variable z, respectively; the network is optimized by minimizing the generator loss and maximizing the discriminator loss.
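For reference, the description of Pdata(x) and Pz(z) in claim 5 matches the standard GAN minimax objective. The LaTeX below is a hedged reconstruction: it assumes that the adversarial-loss equation image in this claim shows the usual form, and the symbols G, D and V are assumptions, since the original symbols survive only as image references.

```latex
\min_{G}\max_{D} V(D, G) =
  \mathbb{E}_{x \sim P_{data}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim P_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The first term rewards the discriminator for scoring real samples highly, and the second rewards it for scoring generated samples low; the generator works against the second term.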
6. The image inpainting method based on the multi-scale generative adversarial network model as claimed in claim 1, wherein the Wasserstein distance is used as the optimization objective to train the GAN: the sigmoid is removed from the last layer of (Figure FDA00027009019400000312); the loss functions of (Figure FDA00027009019400000314) and (Figure FDA00027009019400000315) do not take the logarithm; and each time the parameters of (Figure FDA00027009019400000313) are updated, their absolute values are truncated to no more than a fixed constant (gradient clipping):
Figure FDA0002700901940000032
wherein l is a set of 1-Lipschitz functions;
the loss functions of the four discriminator networks (Figure FDA00027009019400000322) are defined in the same way; the only difference is that the local discriminator provides a training loss gradient only for the missing region, whereas the global discriminator and the multi-scale discriminators back-propagate the loss gradient over the whole image at different resolutions; the discriminators are defined as:
Figure FDA0002700901940000033
wherein the input to the local discriminator (Figure FDA00027009019400000316) is the repaired part of the image output by the generator (Figure FDA00027009019400000317) and the corresponding part of the real image;
Figure FDA0002700901940000034
wherein the input to the global discriminator (Figure FDA00027009019400000318) is the image produced by the generator (Figure FDA00027009019400000319) and the real image;
Figure FDA0002700901940000035
wherein the input to the multi-scale discriminator (Figure FDA00027009019400000320) is the image output by the generator (Figure FDA00027009019400000321) and the real image, each down-sampled by a factor of 2;
Figure FDA0002700901940000036
wherein the input to the multi-scale discriminator (Figure FDA0002700901940000042) is the image output by the generator (Figure FDA0002700901940000043) and the real image, each down-sampled by a factor of 4;
the overall loss function for the entire network optimization is defined as:
Figure FDA0002700901940000041
where λ1, λ2, λ3 and λ4 are the weights of the different losses, used to balance their influence on the overall loss function; the specific values of λ1, λ2, λ3 and λ4 need to be set manually during the experiments.
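The three WGAN modifications and the weighted overall loss in claim 6 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the patented implementation: the function names are hypothetical, the clipping limit 0.01 is an illustrative value, and the overall loss is assumed to be the reconstruction loss plus a weighted sum of four adversarial terms, which the four weights suggest but which the missing equation image would have to confirm. The truncation of parameter values described in the claim is commonly called weight clipping.

```python
def mean(values):
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss on raw scores: no sigmoid and no logarithm."""
    return mean(fake_scores) - mean(real_scores)


def generator_adversarial_loss(fake_scores):
    """The generator tries to raise the critic's score on generated samples."""
    return -mean(fake_scores)


def clip_parameters(parameters, limit):
    """Truncate every parameter to the interval [-limit, limit] after an update."""
    return [max(-limit, min(limit, p)) for p in parameters]


def total_loss(reconstruction, adversarial_terms, weights):
    """Assumed form: reconstruction loss plus weighted adversarial losses."""
    if len(adversarial_terms) != len(weights):
        raise ValueError("one weight is needed per adversarial term")
    return reconstruction + sum(w * a for w, a in zip(weights, adversarial_terms))


# Hypothetical raw critic scores for two real and two generated samples.
real_scores, fake_scores = [0.9, 1.1], [-0.2, 0.2]
print(critic_loss(real_scores, fake_scores))

clipped = clip_parameters([-0.5, 0.003, 0.02], limit=0.01)
print(clipped)  # [-0.01, 0.003, 0.01]
```

Because the critic outputs unbounded scores instead of probabilities, the clipping step is what keeps it within a bounded family of functions, in line with the Lipschitz condition named in the claim.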
7. The image inpainting method based on the multi-scale generative adversarial network model as claimed in claim 1, wherein the training process is divided into three stages; first, the generator network (Figure FDA0002700901940000044) is trained using the reconstruction loss, so that the generator produces blurred repair content, and this stage includes neither adversarial training nor the adversarial loss; second, with the generator network trained in the first stage, all the discriminator networks (Figure FDA0002700901940000045) are trained, and all the discriminators are updated using the adversarial loss; in the last stage, the generator and all the discriminators undergo joint adversarial training, and the training process is completed through back-propagation;
when training with the adversarial loss, default hyper-parameters are used and λ1, λ2, λ3 and λ4 are all set to 0.001; for training, the images are resized and cropped to 256 × 256 for use as input images; for the missing region, the central square region of the image is set to 0, i.e., it forms the missing portion of the image and covers approximately 1/4 of the image; the input of the global discriminator is the full 256 × 256 image, the input of the local discriminator is the 128 × 128 repaired region, and the inputs of the two multi-scale discriminators are the full image at 128 × 128 and 64 × 64, respectively.
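The input preparation described in claim 7 can be sketched as follows. The sketch assumes that the zeroed central square is the same 128 × 128 region that the local discriminator receives, which is consistent with the stated coverage of about 1/4 of a 256 × 256 image; the function names and the constant test image are hypothetical.

```python
def center_square(size, hole):
    """Return (top, left, bottom, right) of a centred hole x hole square."""
    start = (size - hole) // 2
    return start, start, start + hole, start + hole


def mask_center(image, hole):
    """Copy a square 2-D image and set its central hole x hole region to 0."""
    top, left, bottom, right = center_square(len(image), hole)
    return [
        [0.0 if top <= r < bottom and left <= c < right else value
         for c, value in enumerate(row)]
        for r, row in enumerate(image)
    ]


def crop(image, box):
    """Cut the region given by (top, left, bottom, right) out of a 2-D image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


image = [[1.0] * 256 for _ in range(256)]   # stand-in for a cropped training image
masked = mask_center(image, 128)            # generator input with the missing region
local_patch = crop(image, center_square(256, 128))  # region for the local discriminator

missing = sum(value == 0.0 for row in masked for value in row)
print(missing / (256 * 256))                  # 0.25
print(len(local_patch), len(local_patch[0]))  # 128 128
```

The global discriminator would receive the full 256 × 256 image, and the two multi-scale discriminators would receive copies of it reduced to 128 × 128 and 64 × 64.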
8. A restoration system for implementing the image restoration method based on the multi-scale generative adversarial network model according to any one of claims 1 to 7, characterized in that the image restoration system based on the multi-scale generative adversarial network model comprises:
a deep generative adversarial inpainting model building module, used for building a deep generative adversarial inpainting model consisting of a generator and an adversarial discriminator, and for synthesizing the missing content from random noise using the reconstruction loss and the adversarial loss;
an image restoration module, used for improving the network structure of the discriminator by providing a multi-scale discriminator structure on the basis of the global discriminator and the local discriminator, and for restoring the image through adversarial training of the multi-scale discriminators on images of different resolutions;
an image post-processing module, which uses dilated convolution in the generator and post-processes the repaired image using Poisson blending;
and an image repairing module, which verifies the advantages of the image restoration algorithm based on the generative adversarial network model and the restoration quality of the image on the CelebA, ImageNet and Places2 data sets.
9. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of:
step one, constructing a deep generative adversarial inpainting model consisting of a generator and an adversarial discriminator, and synthesizing the missing content from random noise using the reconstruction loss and the adversarial loss;
step two, improving the network structure of the discriminator by providing a multi-scale discriminator structure on the basis of a global discriminator and a local discriminator, and performing adversarial training of the multi-scale discriminators on images of different resolutions to repair the image;
step three, using dilated convolution in the generator, and post-processing the repaired image using Poisson blending;
and step four, verifying the advantages of the image restoration algorithm based on the generative adversarial network model and the restoration quality of the image on the CelebA, ImageNet and Places2 data sets.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
step one, constructing a deep generative adversarial inpainting model consisting of a generator and an adversarial discriminator, and synthesizing the missing content from random noise using the reconstruction loss and the adversarial loss;
step two, improving the network structure of the discriminator by providing a multi-scale discriminator structure on the basis of a global discriminator and a local discriminator, and performing adversarial training of the multi-scale discriminators on images of different resolutions to repair the image;
step three, using dilated convolution in the generator, and post-processing the repaired image using Poisson blending;
and step four, verifying the advantages of the image restoration algorithm based on the generative adversarial network model and the restoration quality of the image on the CelebA, ImageNet and Places2 data sets.
CN202011021917.1A 2020-09-25 2020-09-25 Image restoration method based on multi-scale generation type confrontation network model Pending CN112541864A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011021917.1A CN112541864A (en) 2020-09-25 2020-09-25 Image restoration method based on multi-scale generation type confrontation network model

Publications (1)

Publication Number Publication Date
CN112541864A true CN112541864A (en) 2021-03-23

Family

ID=75013872

Country Status (1)

Country Link
CN (1) CN112541864A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109146784A (en) * 2018-07-27 2019-01-04 徐州工程学院 A kind of image super-resolution rebuilding method based on multiple dimensioned generation confrontation network
US20200234402A1 (en) * 2019-01-18 2020-07-23 Ramot At Tel-Aviv University Ltd. Method and system for end-to-end image processing
CN110222628A (en) * 2019-06-03 2019-09-10 电子科技大学 A kind of face restorative procedure based on production confrontation network
CN110223259A (en) * 2019-06-14 2019-09-10 华北电力大学(保定) A kind of road traffic fuzzy image enhancement method based on production confrontation network
CN110378844A (en) * 2019-06-14 2019-10-25 杭州电子科技大学 Motion blur method is gone based on the multiple dimensioned Image Blind for generating confrontation network is recycled
AU2020100274A4 (en) * 2020-02-25 2020-03-26 Huang, Shuying DR A Multi-Scale Feature Fusion Network based on GANs for Haze Removal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
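Li Kewen et al.: "Multi-scale generative adversarial network image inpainting algorithm" (多尺度生成式对抗网络图像修复算法), Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》) *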

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113240613A (en) * 2021-06-07 2021-08-10 北京航空航天大学 Image restoration method based on edge information reconstruction
CN113345077B (en) * 2021-06-10 2024-03-15 西北大学 PIFU and 3D-GAN-based improved three-dimensional model reconstruction method for Qin cavity character
CN113345077A (en) * 2021-06-10 2021-09-03 西北大学 Method for reconstructing three-dimensional model of Qin cavity figure improved based on PIFu and 3D-GAN
CN113362311A (en) * 2021-06-10 2021-09-07 山东大学 Deep generation network assisted functional full crown prosthesis form generation method
CN113837953A (en) * 2021-06-11 2021-12-24 西安工业大学 Image restoration method based on generation countermeasure network
CN113837953B (en) * 2021-06-11 2024-04-12 西安工业大学 Image restoration method based on generation countermeasure network
CN113469177A (en) * 2021-06-30 2021-10-01 河海大学 Drainage pipeline defect detection method and system based on deep learning
CN113469177B (en) * 2021-06-30 2024-04-26 河海大学 Deep learning-based drainage pipeline defect detection method and system
CN113256541A (en) * 2021-07-16 2021-08-13 四川泓宝润业工程技术有限公司 Method for removing water mist from drilling platform monitoring picture by machine learning
CN114359300A (en) * 2022-03-18 2022-04-15 成都数之联科技股份有限公司 Method, device and system for optimizing image segmentation model and storage medium
CN114359300B (en) * 2022-03-18 2022-06-28 成都数之联科技股份有限公司 Optimization method, device and system of image segmentation model and storage medium
CN115115783A (en) * 2022-07-08 2022-09-27 西南石油大学 Digital core construction method and system for simulating shale matrix nano-micron pores
CN115115783B (en) * 2022-07-08 2023-08-15 西南石油大学 Digital rock core construction method and system for simulating shale matrix nano-micro pores
WO2024022485A1 (en) * 2022-07-29 2024-02-01 中国人民解放军总医院第一医学中心 Computer angiography imaging synthesis method based on multi-scale discrimination
CN115689910A (en) * 2022-09-07 2023-02-03 江苏济远医疗科技有限公司 Image restoration method for processing multi-scale noise
CN116320459B (en) * 2023-01-08 2024-01-23 南阳理工学院 Computer network communication data processing method and system based on artificial intelligence
CN116320459A (en) * 2023-01-08 2023-06-23 南阳理工学院 Computer network communication data processing method and system based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210323