CN113808031A - Image restoration method based on LSK-FNet model - Google Patents

Image restoration method based on LSK-FNet model

Info

Publication number
CN113808031A
CN113808031A
Authority
CN
China
Prior art keywords
image
network
model
edge
fnet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110757574.3A
Other languages
Chinese (zh)
Inventor
杨有
刘思汛
李可森
余平
杨学森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Normal University
Original Assignee
Chongqing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Normal University filed Critical Chongqing Normal University
Priority to CN202110757574.3A priority Critical patent/CN113808031A/en
Publication of CN113808031A publication Critical patent/CN113808031A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of image restoration, and particularly discloses an image restoration method based on an LSK-FNet model: an edge generation network is constructed and trained; an image restoration network is constructed and trained; the edge generation network and the image restoration network are combined to form an end-to-end LSK-FNet model, and the LSK-FNet model is trained; and the face image is repaired using the LSK-FNet model. The LSK-FNet model divides the image restoration work into two steps: first, the edge generation network generates repaired edge information to serve as prior information for the subsequent image repair; then, the generated prior information and the damaged original image are fed together into the image restoration network for restoration. Gated convolution is integrated into the whole generative adversarial network for training, and region normalization is used in the image restoration network to improve the detail and accuracy of the repair, so that blurred edge structures, over-smoothing, unreasonable semantics, and visual artifacts in the repaired image are avoided and the image restoration effect is improved.

Description

Image restoration method based on LSK-FNet model
Technical Field
The invention relates to the technical field of image restoration, in particular to an image restoration method based on an LSK-FNet model.
Background
Image inpainting technology repairs the missing area of an image according to the known information in its undamaged areas, so that the repaired image looks subjectively reasonable, maintains the continuity of the overall image structure, satisfies visual continuity, and conforms to the semantics of the image scene.
Existing image restoration methods use a generative adversarial network and an attention mechanism to increase semantic understanding between facial elements, but the repaired images suffer from blurred edge structures, over-smoothing, unreasonable semantics, visual artifacts, and a poor restoration effect.
Disclosure of Invention
The invention aims to provide an image restoration method based on an LSK-FNet model, so as to solve the technical problems of blurred edge structures, over-smoothing, unreasonable semantics, visual artifacts, and poor restoration effect in images repaired by the prior art.
In order to achieve the above object, the present invention provides an image restoration method based on an LSK-FNet model, comprising the following steps:
constructing and training an edge generation network;
constructing and training an image restoration network;
combining the edge generation network and the image restoration network to form an end-to-end LSK-FNet model, and training the LSK-FNet model;
and repairing the face image by using the LSK-FNet model.
Wherein, in the step of constructing and training the edge generation network:
the edge generation network consists of a first generator and a first discriminator; the first generator comprises a first encoder that downsamples twice, eight consecutive first residual blocks, and a first decoder that upsamples twice, with a gated convolution module added to the first residual blocks.
Wherein the edge generation network is based on a deep convolutional generative adversarial network model.
Wherein, in the step of constructing and training the edge generation network:
and training the confrontation network model by using a confrontation loss function and a characteristic matching loss function to obtain the edge generation network.
Wherein, in the step of constructing and training the image inpainting network:
the image restoration network consists of a second generator and a second discriminator; the second generator comprises a second encoder that downsamples twice, eight consecutive second residual blocks, and a second decoder that upsamples twice, with a gated convolution module added to the second residual blocks.
Wherein, in the step of constructing and training the image inpainting network:
the adversarial network model is trained using a reconstruction loss function, an adversarial loss function, a style loss function, and a perceptual loss function to obtain the image restoration network; and region normalization is used in the image restoration network.
Combining the edge generation network and the image restoration network to form an end-to-end LSK-FNet model, and training the LSK-FNet model, wherein the method comprises the following steps:
firstly, a damaged edge image is repaired by the first generator in the edge generation network: the complete edge image extracted by the Canny algorithm is supplied as input, and a repaired edge image is generated by training the edge generation model;
secondly, the complete edge image extracted by the Canny algorithm is used as prior information and the damaged face image as input, so that the image restoration network adapts to the edge information when repairing the image;
and finally, the model formed by combining the edge generation network and the image restoration network is trained on damaged face images, realizing end-to-end repair of damaged faces and forming the end-to-end LSK-FNet model.
The method for repairing the face image by using the LSK-FNet model comprises the following steps:
firstly, the edge generation network is used to generate repaired edge information as prior information for the subsequent image repair;
and then, the generated prior information and the damaged original image are fed together into the image restoration network for restoration.
The invention discloses an image restoration method based on an LSK-FNet model, which divides the image restoration work into two steps through the LSK-FNet model: first, the edge generation network generates repaired edge information as prior information for the subsequent image repair; then, the generated prior information and the damaged original image are fed together into the image restoration network for restoration. Gated convolution is integrated into the whole generative adversarial network for training, and region normalization is used in the image restoration network to improve the detail and accuracy of the repair, so that blurred edge structures, over-smoothing, unreasonable semantics, and visual artifacts in the repaired image are avoided and the image restoration effect is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the LSK-FNet model of the present invention.
Fig. 2 is a block diagram of an edge-generating network of the present invention.
Fig. 3 is a schematic diagram of the region normalization of the present invention.
Fig. 4 is a block diagram of an image repair network of the present invention.
Fig. 5 is a schematic diagram of an irregular mask image of the present invention.
FIG. 6 is a display diagram of different algorithm repair results of the present invention.
Fig. 7 is a detailed comparison of various algorithms of the present invention.
FIG. 8 is a diagram showing the image inpainting result based on edge information according to the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
In the description of the present invention, it is to be understood that the terms "length", "width", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on the orientations or positional relationships illustrated in the drawings, and are used merely for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, are not to be construed as limiting the present invention. Further, in the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
Referring to fig. 1, the present invention provides an image restoration method based on an LSK-FNet model, comprising the following steps:
constructing and training an edge generation network;
constructing and training an image restoration network;
combining the edge generation network and the image restoration network to form an end-to-end LSK-FNet model, and training the LSK-FNet model;
and repairing the face image by using the LSK-FNet model.
In this embodiment, the image restoration model proposed by the present invention integrates prior information with an image restoration network to repair a picture; the overall structure is shown in fig. 1. The image restoration method is based on a Learnable Structure Knowledge Fusion Network (LSK-FNet). Each generator in the LSK-FNet model consists of an encoder that downsamples twice, eight residual blocks built from gated convolution, and a decoder that upsamples twice. In the image restoration network structure, gated convolution and region normalization are fused, which accelerates the fusion of damaged regions. Deep neural network learning improves the accuracy of the edge generation information and the image restoration information of the repaired face image. The face edge image contains the fine structure of the face, which guides the different regions on both sides of an edge to be repaired closer to the structure and color information of the original image. After the prior knowledge of the edge information is utilized, facial structural features can be repaired more reasonably and edge blurring is avoided.
In fig. 1, G1 represents a generator of the edge generation network, D1 represents a discriminator of the edge generation network, G2 represents a generator in the image inpainting network, and D2 represents a discriminator of the image inpainting network.
Mask stands for the mask, Edge for the broken edge, and Grayscale for the grayscale map. Truth edge denotes the real edge; Predict edge denotes the edge generated by the generator. Incomplete color image denotes the incomplete picture, Output the generated repaired picture, and Truth image the real picture.
The work of image restoration is divided into two steps by the LSK-FNet model: firstly, generating repaired edge information by using an edge generation network as prior information of subsequent image repair; and then, the generated prior information and the damaged original image are put into an image restoration network together for restoration. Gating convolution is integrated in the whole generated countermeasure network for training, region normalization is utilized in the image restoration network to improve the details and accuracy of restoration, the phenomena of fuzzy edge structure, excessive smoothness, unreasonable semantic understanding and visual artifacts of the restored image are avoided, and the image restoration effect is improved.
Further, in the step of constructing and training the edge generation network:
the edge generation network is composed of a first generator and a first discriminator, the first generator comprises a first encoder for sampling downwards twice, eight continuous first residual blocks and a first decoder for sampling upwards twice, and a gate control convolution module is added into the first residual blocks.
The edge-generating network generates a countermeasure network model based on deep convolution.
In the step of constructing and training the edge generation network:
and training the confrontation network model by using a confrontation loss function and a characteristic matching loss function to obtain the edge generation network.
In this embodiment, since ordinary convolution is likely to cause visual artifacts during restoration (such as color discrepancy, blurring, and sharp edge responses around the hole), partial convolution was proposed, which uses a binary mask to control the effective convolution region so that the convolution depends only on valid pixels. Although partial convolution can improve the sharpness at edge locations to some extent, some problems remain. (1) The information in an image region is classified only as valid or invalid: regardless of how many valid pixels are covered by the window under the mask of the previous layer, the mask of the next layer is set to 1; for example, 1 valid pixel and 9 valid pixels are treated the same when updating the current mask. (2) The invalid pixels of partial convolution gradually disappear in deeper layers, and the rule of disappearance is that as long as one valid pixel exists, the mask corresponding to the current position is set to 1. Such a fixed rule is not reasonable; if the deep neural network is allowed to learn the optimal mask automatically, the network can perform mask allocation at a deep level. (3) All channels of each layer share the same mask, so the mask of each channel cannot be handled flexibly. In essence, partial convolution can be viewed as non-learnable, single-channel hard gating of the features.
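To make the hard mask-update rule in point (2) concrete, the following is a minimal PyTorch sketch; the window size and tensor shapes are assumptions of this illustration, not values given by the invention:

```python
import torch
import torch.nn.functional as F

def partial_conv_mask_update(mask: torch.Tensor, kernel_size: int = 3) -> torch.Tensor:
    """Hard mask update of partial convolution: a position in the next layer's
    mask becomes 1 as long as at least one valid pixel falls under the window,
    no matter how many valid pixels there actually are. mask: (N, 1, H, W)."""
    ones = torch.ones(1, 1, kernel_size, kernel_size)
    valid_count = F.conv2d(mask, ones, padding=kernel_size // 2)  # valid pixels per window
    return (valid_count > 0).float()

# A window covering 1 valid pixel and one covering 9 are treated identically:
m = torch.zeros(1, 1, 5, 5)
m[0, 0, 2, 2] = 1.0
print(partial_conv_mask_update(m))  # the whole 3x3 neighborhood becomes 1
```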
Due to the limitations of partial convolution, the invention uses a gated convolution module to learn a mask-updating strategy automatically from the data, dynamically identify the positions of valid pixels in the image, and handle the transition between damaged and complete regions well. The formulas are as follows:
Gating_{y,x} = ΣΣ W_g · I
Feature_{y,x} = ΣΣ W_f · I
O_{y,x} = φ(Feature_{y,x}) ⊙ σ(Gating_{y,x})
where σ denotes the sigmoid activation function, which produces gating information whose output values in (0, 1) represent gating weights; φ can be any activation function (e.g., ReLU or LeakyReLU); and W_g and W_f are two different convolution filters. The valid-pixel part and the image-feature part are multiplied element-wise to extract the useful information in the image, where ⊙ denotes element-wise multiplication. In gated convolution, the image and the mask are trained synchronously, and the mask is not propagated by a fixed, unchangeable rule, so the repair result can be more accurate.
Gated convolution learns a dynamic feature-selection mechanism for each channel and each spatial position: it learns to select features according to the background, the mask, and the sketch, and even takes the semantic segmentation of certain channels into account. Even in very deep layers, gated convolution can learn the hidden regions and the information in individual channels, enabling better repair results.
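As an illustration of the gating formulas above, here is a minimal PyTorch sketch of a gated convolution layer; the layer sizes, the choice of LeakyReLU for φ, and the extra mask channel in the usage line are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    """Gated convolution: two parallel convolutions W_f and W_g over the same
    input I; the feature branch passes through an activation phi, the gating
    branch through a sigmoid, and the two are multiplied element-wise,
    matching O_{y,x} = phi(Feature_{y,x}) (*) sigma(Gating_{y,x})."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3,
                 stride: int = 1, padding: int = 1):
        super().__init__()
        self.feature = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)  # W_f
        self.gating = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)   # W_g
        self.phi = nn.LeakyReLU(0.2)  # phi can be any activation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.phi(self.feature(x)) * torch.sigmoid(self.gating(x))

# Usage: input with an extra mask channel; gating weights are learned per
# channel and per spatial position rather than propagated by a fixed rule.
x = torch.randn(1, 4, 64, 64)        # e.g. RGB image + binary mask channel
out = GatedConv2d(4, 64)(x)
```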
To enhance the accuracy of edge details in image inpainting and prevent excessive smoothing or blurring, the repair is divided into two steps: the first step extracts and repairs the edge information of the damaged image; the second step fuses the repaired edge prior information into the image restoration stage to repair the damaged image.
The structure of the edge generation network is shown in fig. 2. The edge generation network is based on a deep convolutional generative adversarial network model and consists of the first generator G1 and the first discriminator D1. The first generator G1 comprises a first encoder that downsamples twice, eight consecutive first residual blocks, and a first decoder that upsamples twice. The residual network avoids the gradient-vanishing problem caused by excessive network depth; in the first residual blocks, gated convolution is used instead of ordinary convolution, since gated convolution can learn hidden regions and the information in individual channels even in deep layers. The first discriminator D1 uses a 70×70 PatchGAN architecture; it judges whether an edge map is real or generated, updates the discriminator network parameters, and strengthens its ability to discriminate images. Spectral normalization is applied in the edge generation network.
The input to the first generator G1 consists of the mask map, the damaged edge map, and the grayscale map of the damaged image. The generator network applies multi-layer convolution and normalization to the input, learns the information in the image to obtain edge structure information, and finally generates a repaired edge map. The first discriminator D1 is a network that judges whether the edge information generated by the first generator G1 is accurate, and improves its ability to discriminate the generated edge maps through continuous learning. The edge information of the complete image is extracted by a Canny edge detector, its features are fused and compared with the repaired edge map generated by the first generator G1, and through repeated learning the discriminator's ability improves and edge maps closer to the real edge information are generated.
In fig. 2, Mask represents the mask image, Edge represents the edge image, and Grayscale represents the grayscale image. M×N denotes the length and width of the original picture; M/2×N/2 denotes the length and width each reduced by a factor of 2, with the reduction continued to M/4×N/4. The connected conv + Residual Blocks indicate a convolutional neural network with residual blocks added. Predict edge is the generated edge and Truth edge the real edge; feature matching compares the two, and Real/Fake indicates the real/fake decision.
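The following PyTorch sketch illustrates the layout just described (encoder downsampling twice, eight gated residual blocks, decoder upsampling twice); the channel widths, kernel sizes, and output activation are assumptions for illustration, not values specified by the invention:

```python
import torch
import torch.nn as nn

class GC(nn.Module):
    """Compact gated convolution (see the earlier sketch)."""
    def __init__(self, in_ch, out_ch, k=3, s=1, p=1):
        super().__init__()
        self.f = nn.Conv2d(in_ch, out_ch, k, s, p)
        self.g = nn.Conv2d(in_ch, out_ch, k, s, p)

    def forward(self, x):
        return torch.relu(self.f(x)) * torch.sigmoid(self.g(x))

class GatedResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(GC(ch, ch), GC(ch, ch))

    def forward(self, x):
        return x + self.body(x)  # skip connection counters gradient vanishing

class EdgeGenerator(nn.Module):
    """G1 layout: encoder downsampling twice -> eight gated residual blocks
    -> decoder upsampling twice (channel widths are illustrative)."""
    def __init__(self, in_ch=3):  # grayscale + damaged edge + mask channels
        super().__init__()
        self.encoder = nn.Sequential(
            GC(in_ch, 64, k=7, p=3),
            GC(64, 128, k=4, s=2, p=1),    # M x N     -> M/2 x N/2
            GC(128, 256, k=4, s=2, p=1),   # M/2 x N/2 -> M/4 x N/4
        )
        self.blocks = nn.Sequential(*[GatedResBlock(256) for _ in range(8)])
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),   # M/4 -> M/2
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),    # M/2 -> M
            nn.Conv2d(64, 1, 7, 1, 3), nn.Sigmoid(),            # predicted edge map
        )

    def forward(self, x):
        return self.decoder(self.blocks(self.encoder(x)))

edge_map = EdgeGenerator()(torch.randn(1, 3, 256, 256))  # -> (1, 1, 256, 256)
```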
Let T_gt denote the undamaged image, T_gray its grayscale map, and C_gt the edge map of the undamaged image, with the image mask M as a precondition (the missing area marked 1 and the background marked 0); ⊙ is the Hadamard product and G1 is the first generator of the edge generation network. In the edge generation network, the input damaged grayscale image is

T̃_gray = T_gray ⊙ (1 − M)    (1)

and the damaged edge image is

C̃_gt = C_gt ⊙ (1 − M).

The edge generation network generator predicts the edge generation result as shown in formula (2):

C_pred = G1(T̃_gray, C̃_gt, M)    (2)

where T̃_gray is the input damaged grayscale map, C̃_gt is the damaged edge image, and C_pred is the predicted edge map trained by the first generator G1.
C_pred, the prediction generated by the first generator, is put into the first discriminator together with C_gt for training, and the discriminator outputs the probability that the input image is real or fake. The constructed loss functions are used to train the generative adversarial network and obtain the edge generation network. The feature matching loss compares activation maps in the intermediate layers of the first discriminator, stabilizing the training process by comparing the map generated by the first generator against the real image; this resembles comparing against the activation maps of a pre-trained VGG network, but since the VGG network is not trained to produce edge information, it would fail to capture the results sought in the initial stage. Formulas (3) and (4) give the adversarial loss and the feature matching loss, respectively:

L_adv1 = E_(C_gt, T_gray)[log D1(C_gt, T_gray)] + E_(T_gray)[log(1 − D1(C_pred, T_gray))]    (3)

where E denotes the expected value over the distribution; log D1(C_gt, T_gray) represents the probability that the first discriminator judges a real picture to be real, and log(1 − D1(C_pred, T_gray)) the probability that it judges the picture generated by the first generator, the counterpart of the real data, to be fake.

L_FM = E[ Σ_{i=1}^{L} (1/N_i) ‖ D1^(i)(C_gt) − D1^(i)(C_pred) ‖_1 ]    (4)

In the feature matching formula, L is the last convolutional layer of the first discriminator, D1^(i) is the activation of the i-th layer of the first discriminator, and N_i is the number of elements in the i-th layer's activation. The loss function of the edge generation network is composed of the adversarial loss and the feature matching loss, as shown in formula (5):

min_G1 max_D1 L_G1 = λ_adv1 · L_adv1 + λ_FM · L_FM    (5)

In formula (5), D1 is the first discriminator of the edge generation network, and λ_adv1 and λ_FM are hyperparameters balancing the adversarial loss against the feature matching loss, with values λ_adv1 = 1 and λ_FM = 10.
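A minimal PyTorch sketch of these two losses follows; it assumes a discriminator that returns sigmoid probabilities together with its intermediate feature maps, which is an interface assumed here for illustration:

```python
import torch
import torch.nn.functional as F

def d1_adversarial_loss(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    """Formula (3), discriminator side: push D1(C_gt, T_gray) toward 'real'
    and D1(C_pred, T_gray) toward 'fake'. Inputs are probabilities in (0, 1)."""
    eps = 1e-8
    return -(torch.log(d_real + eps) + torch.log(1 - d_fake + eps)).mean()

def feature_matching_loss(feats_real, feats_fake) -> torch.Tensor:
    """Formula (4): L1 distance between D1's intermediate activations on the
    real and generated edge maps, averaged over each layer's N_i elements."""
    return sum(F.l1_loss(ff, fr) for fr, ff in zip(feats_real, feats_fake))

def edge_generator_loss(d_fake, feats_real, feats_fake,
                        lambda_adv1: float = 1.0, lambda_fm: float = 10.0):
    """Generator objective of formula (5) with the stated hyperparameters."""
    adv = -torch.log(d_fake + 1e-8).mean()  # generator side of the adversarial game
    return lambda_adv1 * adv + lambda_fm * feature_matching_loss(feats_real, feats_fake)
```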
Further, in the step of constructing and training the image inpainting network:
the image restoration network consists of a second generator and a second discriminator; the second generator comprises a second encoder that downsamples twice, eight consecutive second residual blocks, and a second decoder that upsamples twice, with a gated convolution module added to the second residual blocks.
In the step of constructing and training the image inpainting network:
the adversarial network model is trained using a reconstruction loss function, an adversarial loss function, a style loss function, and a perceptual loss function to obtain the image restoration network; and region normalization is used in the image restoration network.
In this embodiment, Feature Normalization (FN) is an important technique for aiding neural network training, usually normalizing features across the spatial dimensions. Most previous image restoration methods apply FN to the network without considering the influence of the damaged area of the input image on the normalization, such as shifts of the mean and variance. Because the mean and variance shifts caused by FN limit the training of the image restoration network, the invention adopts Region Normalization (RN) to overcome this limitation. RN divides the spatial pixels into different regions according to the input mask and computes the mean and variance of each region separately to achieve normalization. A schematic diagram of RN is shown in fig. 3.
In fig. 3, N, C, H, and W are the batch size, number of channels, height, and width, respectively. The first framed portion in the figure represents the damaged data and the second the undamaged data; the data of the two portions are then normalized separately. Two different RNs are used in the image restoration network. (1) Basic RN (RN-B) normalizes the damaged and undamaged areas separately according to the damaged image, solving the problem of shifted normalization mean and variance. This region normalization is used in the early layers of the repair network, because the incoming damaged image has a large damaged area, which causes severe mean and variance shifts. (2) Learnable RN (RN-L): after several convolution layers it becomes difficult to obtain an accurate region mask from the original mask. RN-L solves this problem by automatically detecting the damaged region and obtaining a region mask, and it enhances the fusion of the damaged and undamaged regions through a global affine transformation. RN-L promotes the restoration of the damaged area, solves the mean and variance shift problem, and strengthens the fusion; it is therefore suitable for the later layers of the network.
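A minimal sketch of the basic region normalization (RN-B) idea follows; the learnable per-region affine parameters of the full method are omitted for brevity, so this shows only the separate per-region statistics:

```python
import torch

def region_normalize(x: torch.Tensor, mask: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """RN-B sketch: spatial pixels are split by the input mask into a damaged
    region (mask == 1) and an undamaged region (mask == 0), and each region is
    normalized with its own mean/variance, so the damaged area no longer
    shifts the statistics of the whole feature map.
    x: (N, C, H, W); mask: (N, 1, H, W) with 1 marking the missing region."""
    out = torch.zeros_like(x)
    for region in (mask, 1 - mask):                        # damaged, then undamaged
        cnt = region.sum(dim=(2, 3), keepdim=True).clamp(min=1.0)
        mean = (x * region).sum(dim=(2, 3), keepdim=True) / cnt
        var = ((x - mean) ** 2 * region).sum(dim=(2, 3), keepdim=True) / cnt
        out = out + region * (x - mean) / torch.sqrt(var + eps)
    return out
```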
The structure of the image restoration network is shown in fig. 4, and its overall structure is similar to that of the edge generation network: it consists of the second generator G2 and the second discriminator D2. The second generator comprises a second encoder that downsamples twice, eight consecutive second residual blocks, and a second decoder that upsamples twice. Gated convolution replaces the dilated convolution in the generator's residual blocks to extract the features of the input image; it learns to distinguish the valid area from the occluded area, reduces the adverse effect of the damaged area on image restoration, makes the color and detailed texture structure of the repaired image more reasonable, and improves the repair quality. Region normalization is used in the network instead of spectral normalization; it exploits the spatial relationships of the input features to detect potentially damaged areas and generates a region mask. The fusion of damaged and undamaged regions is enhanced through a global affine transformation, which solves the problem of mean and variance drift and promotes the reconstruction of the damaged region.
The edge map generated by the edge generation network serves as prior information and, together with the damaged image, forms the input of the second generator G2. The repaired image generated by the second generator G2 and the undamaged original image are fed into the second discriminator D2, and by continuously comparing and updating the discriminator parameters, the ability of the second generator G2 to repair images improves. The second discriminator D2 takes the real image as input, and through repeated training the content and structure of the generated repaired image become more similar to the real image.
In fig. 4, the inputs are the edge map predicted in the previous stage and the incomplete color image. The first dashed box represents the insertion of RN-B and consists, from left to right, of an ordinary convolution, RN-B, and a ReLU activation function. The second module consists, from left to right, of a gated convolution, learnable region normalization, a ReLU activation function, another gated convolution, and learnable region normalization.
The input of the image restoration network consists of the edge map generated by the edge generation network as prior information and the color image to be repaired. The composite edge map is C_con = C_gt ⊙ (1 − M) + C_pred ⊙ M, and the input image to be repaired is

T̃_gt = T_gt ⊙ (1 − M).

The image restoration network produces the completed image T_pred through the second generator, as shown in formula (6):

T_pred = G2(T̃_gt, C_con)    (6)
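The composition of the edge prior and the network input can be sketched directly from these formulas; the tensor shapes are assumptions of this sketch:

```python
import torch

def compose_g2_input(t_gt: torch.Tensor, c_gt: torch.Tensor,
                     c_pred: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Build the input of the second generator G2: keep ground-truth edges
    outside the hole and the G1-predicted edges inside it
    (C_con = C_gt (*) (1-M) + C_pred (*) M), and mask out the color image
    (T_gt (*) (1-M)). Shapes: t_gt (N,3,H,W); edges and mask (N,1,H,W)."""
    c_con = c_gt * (1 - mask) + c_pred * mask   # composite edge prior
    t_in = t_gt * (1 - mask)                    # damaged color image
    return torch.cat([t_in, c_con], dim=1)      # channel-wise concatenation
```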
The loss function of the image restoration network consists of the reconstruction loss, the adversarial loss, the perceptual loss, and the style loss, and is used to train the adversarial network to obtain the image restoration network. Formula (7) gives the adversarial loss, whose structure is similar to that of the adversarial loss in the edge generator:

L_adv2 = E_(T_gt, C_con)[log D2(T_gt, C_con)] + E_(C_con)[log(1 − D2(T_pred, C_con))]    (7)

The perceptual loss maps the generated image and the original image into a high-level feature space through convolution operations and then takes the difference, bringing the two close together so that their semantic information is consistent. The perceptual loss function is shown in formula (8):

L_perc = E[ Σ_i (1/N_i) ‖ φ_i(T_gt) − φ_i(T_pred) ‖_1 ]    (8)

In the above formula, φ_i is the activation map of the i-th layer of the pre-trained network, corresponding to the activation feature maps relu i_1 of VGG-19, where i ∈ {1, 2, 3, 4, 5}. The style loss [38] is used to measure the difference in covariance between activation maps. Given an activation map of size C_j × H_j × W_j, the style loss function is shown in formula (9):

L_style = E_j[ ‖ G_j^φ(T_pred) − G_j^φ(T_gt) ‖_1 ]    (9)

In the style loss function, G_j^φ is a C_j × C_j Gram matrix constructed from the activation feature map φ_j. The second discriminator of the image restoration network discriminates the repaired image by combining the reconstruction loss, the adversarial loss, the perceptual loss, and the style loss, as shown in formula (10):

L_G2 = λ_l1 · L_l1 + λ_adv2 · L_adv2 + λ_p · L_perc + λ_s · L_style    (10)

λ_l1 = 1, λ_adv2 = λ_p = 0.1, λ_s = 250

where λ_l1, λ_adv2, λ_p, and λ_s are the hyperparameters of the reconstruction loss, adversarial loss, perceptual loss, and style loss, respectively.
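A sketch of formulas (8) and (9) in PyTorch follows; the VGG-19 layer indices for relu i_1 (torchvision layout) and the Gram-matrix normalization are assumptions of this illustration:

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Pre-trained VGG-19 feature extractor; assumes a recent torchvision.
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)
RELU_I_1 = [1, 6, 11, 20, 29]  # relu1_1, relu2_1, relu3_1, relu4_1, relu5_1

def vgg_activations(x: torch.Tensor):
    feats, h = [], x
    for i, layer in enumerate(vgg):
        h = layer(h)
        if i in RELU_I_1:
            feats.append(h)
    return feats

def gram(f: torch.Tensor) -> torch.Tensor:
    """C_j x C_j Gram matrix from an activation map of size C_j x H_j x W_j."""
    n, c, h, w = f.shape
    f = f.view(n, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)  # normalization is an assumption

def perceptual_and_style_loss(t_pred: torch.Tensor, t_gt: torch.Tensor):
    """Formulas (8) and (9): L1 distance between activation maps (perceptual)
    and between their Gram matrices (style), summed over the relu i_1 layers."""
    l_perc = l_style = 0.0
    for fp, fg in zip(vgg_activations(t_pred), vgg_activations(t_gt)):
        l_perc = l_perc + F.l1_loss(fp, fg)
        l_style = l_style + F.l1_loss(gram(fp), gram(fg))
    return l_perc, l_style
```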
Further, combining the edge generation network and the image inpainting network to form an end-to-end LSK-FNet model, and training the LSK-FNet model, including:
firstly, a damaged edge image is repaired by the first generator in the edge generation network: the complete edge image extracted by the Canny algorithm is supplied as input, and a repaired edge image is generated by training the edge generation model;
secondly, the complete edge image extracted by the Canny algorithm is used as prior information and the damaged face image as input, so that the image restoration network adapts to the edge information when repairing the image;
and finally, the model formed by combining the edge generation network and the image restoration network is trained on damaged face images, realizing end-to-end repair of damaged faces and forming the end-to-end LSK-FNet model.
Repairing the face image using the LSK-FNet model comprises the following steps:
firstly, the edge generation network is used to generate repaired edge information as prior information for the subsequent image repair;
and then, the generated prior information and the damaged original image are fed together into the image restoration network for restoration.
In this embodiment, the training of the LSK-FNet model is divided into three steps overall. Firstly, a damaged edge image is repaired by the generator of the edge generation network: the complete edge image extracted by the Canny algorithm is supplied as input, and a repaired edge image is generated by training the edge generation model. Secondly, the complete edge image extracted by the Canny algorithm is used as prior information and the damaged face image as input, so that the repair network adapts to the edge information when repairing the image. Finally, the model formed by combining the edge generation network and the image restoration network is trained on damaged face images, realizing end-to-end repair of damaged faces.
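For illustration, a minimal two-stage inference sketch with OpenCV's Canny detector follows; g1 and g2 stand for the trained generators, and the Canny thresholds, tensor shapes, and generator interfaces are assumptions of this sketch:

```python
import cv2
import numpy as np
import torch

def lsk_fnet_inpaint(image_bgr: np.ndarray, mask: np.ndarray, g1, g2):
    """Two-stage inference: (1) G1 repairs the edge map of the damaged image;
    (2) G2 repairs the color image guided by that edge prior.
    image_bgr: HxWx3 uint8; mask: HxW float32 in {0, 1}, 1 marking the hole."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edge = cv2.Canny(gray, 100, 200).astype(np.float32) / 255.0  # thresholds illustrative

    to_t = lambda a: torch.from_numpy(np.ascontiguousarray(a)).float()[None, None]
    m = to_t(mask)                                              # (1, 1, H, W)
    gray_in = to_t(gray.astype(np.float32) / 255.0) * (1 - m)   # damaged grayscale
    edge_in = to_t(edge) * (1 - m)                              # damaged edge map

    with torch.no_grad():
        c_pred = g1(torch.cat([gray_in, edge_in, m], dim=1))    # repaired edges
        c_con = edge_in + c_pred * m                            # composite edge prior
        img = torch.from_numpy(image_bgr).float().permute(2, 0, 1)[None] / 255.0
        return g2(torch.cat([img * (1 - m), c_con], dim=1))     # repaired image
```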
The face image is then repaired using the LSK-FNet model: first, the edge generation network generates repaired edge information as prior information for the subsequent image repair; then, the generated prior information and the damaged original image are fed together into the image restoration network for restoration.
Further, after the edge generation network and the image restoration network have been combined into the end-to-end LSK-FNet model and the LSK-FNet model has been trained, the CelebA-HQ dataset is used to test the performance of the LSK-FNet model.
In this embodiment, the performance test was carried out on an Intel E5 CPU (2.60 GHz) and a GTX 1080 Ti GPU. The dataset selected for the experiment is CelebA-HQ, which contains 30,000 face images of size 256×256; the experiment uses 28K images as the training set, 1K as the validation set, and 1K as the test set. The experiment uses an irregular random mask dataset with a total of 12,000 irregular mask images of size 256×256 for training and testing the model, of which 1K are used for training, 1K for validation, and 1K for testing. A generated irregular mask image is shown in fig. 5.
The experimental code is written in Python. The model adopts a convolutional neural network structure with a batch size of 4, and an Adam optimizer [40] is used to optimize the model, with β1 = 0 and β2 = 0.9. In the experiment, the weights λ1 and λ2 in the loss function of the edge repair network take the values 1 and 10, respectively, and the weights λ3, λ4, and λ5 in the loss function of the image repair network take the values 0.1, 0.1, and 250, respectively.
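The optimizer setup stated above can be written down directly; the learning rate and the stand-in module are assumptions of this sketch, since they are not specified here:

```python
import torch

# Batch size 4, Adam with beta1 = 0 and beta2 = 0.9 as stated above.
generator = torch.nn.Conv2d(3, 3, 3, padding=1)  # stand-in for G1/G2 in this sketch
optimizer = torch.optim.Adam(
    generator.parameters(),
    lr=1e-4,            # assumed value; the learning rate is not given here
    betas=(0.0, 0.9),   # beta1 = 0, beta2 = 0.9
)
batch_size = 4
```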
In order to objectively evaluate the quality of the images repaired by the proposed method, three evaluation indexes are selected to measure the quality of the repaired image: 1) ℓ1 loss; 2) Peak Signal-to-Noise Ratio (PSNR); 3) Structural Similarity (SSIM). The smaller the ℓ1 loss and the larger the PSNR and SSIM values, the better the image restoration quality, i.e., the better the repair effect. The comparison algorithms are the Contextual Attention (CA) algorithm, the generative multi-column convolutional neural network repair algorithm (GMCNN), the EdgeConnect algorithm, the PIC algorithm, and the RN algorithm. The above methods were tested on 1000 test images and the results averaged. The mask sizes in the comparison are 10%-20%, 20%-30%, 30%-40%, and 40%-50%, so that the repair behavior under different damage sizes can be observed.
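The three indexes can be computed, for example, with NumPy and scikit-image; this sketch assumes 8-bit RGB images:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(pred: np.ndarray, gt: np.ndarray):
    """The three indexes used above, for HxWx3 uint8 images:
    l1 loss (smaller is better), PSNR and SSIM (larger is better)."""
    l1 = np.abs(pred.astype(np.float64) - gt.astype(np.float64)).mean() / 255.0
    psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=255)
    return l1, psnr, ssim
```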
First, qualitative comparison: test samples from the CelebA-HQ dataset were repaired under irregular masks. Fig. 6 shows the repair results for masks of different damage sizes: from top to bottom, the damage mask sizes of rows a and b are 10%-20%, of rows c and d 20%-30%, of rows e and f 30%-40%, and of rows g and h 40%-50%. In each row of fig. 6, the images from left to right are the original image, the damaged image, the CA repair result, the GMCNN repair result, the EdgeConnect repair result, the PIC repair result, the RN repair result, and the repair result of the algorithm proposed herein.
In terms of overall structure, the algorithm proposed herein is superior to the other algorithms in detail processing, color fusion, and similar aspects. It can be seen from fig. 6 that the images repaired by the CA, GMCNN, and RN algorithms exhibit visual artifacts, unclear edge structures, and similar defects, which are particularly obvious in rows e and g. When the damaged area is small (e.g., rows a and b), the EdgeConnect algorithm, the PIC algorithm, and the method proposed herein produce repair results that, as a whole, differ little from the undamaged original image, but the PIC algorithm does not process the edge details sufficiently. As can be seen from row c, the image repaired by the EdgeConnect algorithm is imperfect in its detail processing, distorting the mouth corner of the face image, whereas the improved algorithm achieves a detail repair effect closer to the original image, with clearer edge contour information. When the damage mask is large, the PIC algorithm produces artifacts at the edges of the repaired image. The algorithm proposed by the invention replaces the dilated convolution in the EdgeConnect algorithm with gated convolution and changes spectral normalization to region normalization, better extracting the information of the damaged area in order to repair it; the damaged mask and the image are trained synchronously, making the repair result more reasonable. Moreover, the prior information of the image edges is blended in before the image is repaired, so the facial feature details of the repaired image are handled better.
Fig. 7 shows the overall subjective evaluation of the repaired images under different degrees of damage: the details of the repaired images are magnified and extracted, with the main damaged areas being, from top to bottom, the eyes, mouth, and nose. It can be seen that the method proposed herein achieves the best repair effect on the detail parts, because it builds on edge prior information and adopts region normalization, so that the repaired image is free of artifacts and similar defects.
The experiment also extracted the edge prior information during image repair; the results are shown in fig. 8, where (a)-(d) are the original image, the damaged image, the generated edge map, and the repair result, respectively. As can be seen from column (c) of fig. 8, the edge generation network repairs the edge information of the missing parts well, and the repaired edge information is fed into the image restoration network together with the damaged image for repair.
Second, quantitative comparison: in addition to the subjective visual analysis of the images, quantitative analysis of the repaired images was carried out. 1000 images were selected for repair and the results averaged, to prevent distortion by special cases. The selected evaluation indexes are the ℓ1 loss, the peak signal-to-noise ratio, and the structural similarity.
Table 1 shows the comparison results against the CA, GMCNN, EdgeConnect, PIC, and RN algorithms on the irregular mask dataset with damage sizes of 10% to 50%, where Masked region denotes the area occluded by the irregular mask. A smaller ℓ1 loss represents better repaired image quality, and the data in Table 1 show that the method proposed herein outperforms the other methods on this index. Larger peak signal-to-noise ratio and structural similarity values represent better repaired image quality; the data in Table 1 show that when the damaged area has not reached 40%-50%, and especially when it is below 20%, the ℓ1 loss performance improves by more than 4.3%, and the repaired image data are higher than those of the other repair methods. The proposed method has no significant advantage when the damage mask area is large, but as a whole it is superior to the methods it is compared against.
TABLE 1 comparison of results of irregular mask Damage repair
The references for the 5 models used in the performance comparison are as follows:
the CA method comprises the following steps: J.Yu, Z.Lin, J.Yang, et al, "genetic Image Inpainting with context Attention,"2018IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA,2018, pp.5505-5514.
The GMCNN method: Y. Wang, X. Tao, X. Qi, et al., "Image Inpainting via Generative Multi-column Convolutional Neural Networks," Neural Information Processing Systems, December 2018, pp. 331-.
The Edge method: K. Nazeri, E. Ng, T. Joseph, et al., "EdgeConnect: Structure Guided Image Inpainting using Edge Prediction," 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Korea (South), 2019, pp. 3265-3274.
The PIC method: C. Zheng, T. Cham and J. Cai, "Pluralistic Image Completion," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 1438-1447.
The RN method: T. Yu, Z. Guo, X. Jin, et al., "Region Normalization for Image Inpainting," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 07, pp. 12733-12740, 2020.
Third, ablation experiment: in order to verify the influence of adding gated convolution on image restoration, the gated convolution was removed in the second stage and a comparison experiment was run with our LSK-FNet model on CelebA-HQ, again with a 28K training set, 1K validation set, and 1K test set. For a better comparison, irregular mask training and testing were used while all other conditions were kept the same. The experimental results are shown in Table 2; it can be seen that the repair effect is not significantly improved when only region normalization is added.
TABLE 2 Gated convolution removed, region normalization retained
Similarly, in order to verify the influence of region normalization on image restoration, the region normalization was removed in the second stage and the same comparison experiment was run with our LSK-FNet model on CelebA-HQ, again with a 28K training set, 1K validation set, and 1K test set, using irregular mask training and testing with all other conditions kept the same. The experimental results are shown in Table 3; it can be seen that the repair effect is not significantly improved when only gated convolution is added.
TABLE 3 Region normalization removed, gated convolution retained
From the above tables it follows that: 1. adding region normalization or gated convolution alone can improve image quality, but fails when the damaged region is large; 2. when region normalization and gated convolution are added to the network structure together, the image quality improves more effectively; 3. when region normalization or gated convolution acts on EdgeConnect alone, the reconstruction loss performance does not exceed that of the original model, whereas when the two methods act on the structure simultaneously, the reconstruction loss performance improves significantly; 4. when the repaired area is small (damaged area below 20%), the structural similarity improves whether the two methods are used separately or simultaneously.
In summary: existing image restoration uses a generative adversarial network and an attention mechanism to increase the semantic understanding between facial elements, but the repaired images suffer from blurred edge structures, over-smoothing, unreasonable semantics, visual artifacts, and similar problems, so the LSK-FNet model is proposed. The image restoration work in the model is divided into two steps: first, the edge generation network generates repaired edge information as prior information for the subsequent image repair; then, the generated prior information and the damaged original image are fed together into the image restoration network for restoration. Gated convolution is integrated into the whole generative adversarial network for training, and region normalization is used in the image restoration network to improve the detail and accuracy of the repair. The method was trained on the CelebA-HQ dataset, and qualitative and quantitative analyses of the repaired images were carried out. The results show that when the damaged area is less than 20%, the ℓ1 loss performance improves by more than 4.3%, and a good effect is achieved in repairing the edge structure and detail parts of facial images.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. An image restoration method based on an LSK-FNet model is characterized by comprising the following steps:
constructing and training an edge generation network;
constructing and training an image restoration network;
combining the edge generation network and the image restoration network to form an end-to-end LSK-FNet model, and training the LSK-FNet model;
and repairing the face image by using the LSK-FNet model.
2. The LSK-FNet model-based image inpainting method of claim 1, wherein in the step of constructing and training the edge generating network:
the edge generation network consists of a first generator and a first discriminator; the first generator comprises a first encoder that downsamples twice, eight consecutive first residual blocks, and a first decoder that upsamples twice, with a gated convolution module added to the first residual blocks.
3. The LSK-FNet model-based image inpainting method of claim 2, wherein,
the edge generation network is based on a deep convolutional generative adversarial network model.
4. The LSK-FNet model-based image inpainting method of claim 3, wherein in the step of constructing and training the edge generating network:
and training the confrontation network model by using a confrontation loss function and a characteristic matching loss function to obtain the edge generation network.
5. The image inpainting method based on the LSK-FNet model of claim 4, wherein in the step of constructing and training the image inpainting network:
the image restoration network consists of a second generator and a second discriminator, the second generator consists of a second encoder which samples downwards twice, eight continuous second residual blocks and a second decoder which samples upwards twice, and a gating convolution module is added in the second residual blocks.
6. The image inpainting method based on the LSK-FNet model of claim 5, wherein in the step of constructing and training the image inpainting network:
the adversarial network model is trained using a reconstruction loss function, an adversarial loss function, a style loss function, and a perceptual loss function to obtain the image restoration network; and region normalization is used in the image restoration network.
7. The LSK-FNet model-based image inpainting method of claim 6, wherein combining the edge generating network and the image inpainting network to form an end-to-end LSK-FNet model and training the LSK-FNet model comprises:
firstly, a damaged edge image is repaired by the first generator in the edge generation network: the complete edge image extracted by the Canny algorithm is supplied as input, and a repaired edge image is generated by training the edge generation model;
secondly, the complete edge image extracted by the Canny algorithm is used as prior information and the damaged face image as input, so that the image restoration network adapts to the edge information when repairing the image;
and finally, the model formed by combining the edge generation network and the image restoration network is trained on damaged face images, realizing end-to-end repair of damaged faces and forming the end-to-end LSK-FNet model.
8. The LSK-FNet model-based image inpainting method of claim 7, wherein inpainting the face image using the LSK-FNet model comprises:
firstly, the edge generation network is used to generate repaired edge information as prior information for the subsequent image repair;
and then, the generated prior information and the damaged original image are fed together into the image restoration network for restoration.
CN202110757574.3A 2021-07-05 2021-07-05 Image restoration method based on LSK-FNet model Pending CN113808031A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110757574.3A CN113808031A (en) 2021-07-05 2021-07-05 Image restoration method based on LSK-FNet model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110757574.3A CN113808031A (en) 2021-07-05 2021-07-05 Image restoration method based on LSK-FNet model

Publications (1)

Publication Number Publication Date
CN113808031A true CN113808031A (en) 2021-12-17

Family

ID=78893063

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110757574.3A Pending CN113808031A (en) 2021-07-05 2021-07-05 Image restoration method based on LSK-FNet model

Country Status (1)

Country Link
CN (1) CN113808031A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114693630A (en) * 2022-03-25 2022-07-01 上海大学 Image bleeding position prediction method based on antagonistic edge learning
CN114764754A (en) * 2022-03-25 2022-07-19 燕山大学 Occlusion face repairing method based on geometric perception prior guidance
CN116029945A (en) * 2023-03-29 2023-04-28 北京建筑大学 Generating type image restoration device
CN117974832A (en) * 2024-04-01 2024-05-03 南昌航空大学 Multi-modal liver medical image expansion algorithm based on generation countermeasure network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050190279A1 (en) * 2004-02-26 2005-09-01 Jonathan Nobels Mobile device with integrated camera operations
CN111968053A (en) * 2020-08-13 2020-11-20 南京邮电大学 Image restoration method based on gate-controlled convolution generation countermeasure network
CN112967218A (en) * 2021-03-15 2021-06-15 复旦大学 Multi-scale image restoration system based on wire frame and edge structure

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050190279A1 (en) * 2004-02-26 2005-09-01 Jonathan Nobels Mobile device with integrated camera operations
CN111968053A (en) * 2020-08-13 2020-11-20 南京邮电大学 Image restoration method based on gate-controlled convolution generation countermeasure network
CN112967218A (en) * 2021-03-15 2021-06-15 复旦大学 Multi-scale image restoration system based on wire frame and edge structure

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114693630A (en) * 2022-03-25 2022-07-01 上海大学 Image bleeding position prediction method based on antagonistic edge learning
CN114764754A (en) * 2022-03-25 2022-07-19 燕山大学 Occlusion face repairing method based on geometric perception prior guidance
CN114764754B (en) * 2022-03-25 2024-04-09 燕山大学 Occlusion face restoration method based on geometric perception priori guidance
CN116029945A (en) * 2023-03-29 2023-04-28 北京建筑大学 Generating type image restoration device
CN116029945B (en) * 2023-03-29 2023-07-14 北京建筑大学 Generating type image restoration device
CN117974832A (en) * 2024-04-01 2024-05-03 南昌航空大学 Multi-modal liver medical image expansion algorithm based on generation countermeasure network
CN117974832B (en) * 2024-04-01 2024-06-07 南昌航空大学 Multi-modal liver medical image expansion algorithm based on generation countermeasure network

Similar Documents

Publication Publication Date Title
CN111723860B (en) Target detection method and device
Agrawal et al. A novel joint histogram equalization based image contrast enhancement
CN113808031A (en) Image restoration method based on LSK-FNet model
Kang et al. Convolutional neural networks for no-reference image quality assessment
CN112837234B (en) Human face image restoration method based on multi-column gating convolution network
CN113255659B (en) License plate correction detection and identification method based on MSAFF-yolk 3
CN113112416B (en) Semantic-guided face image restoration method
CN112365556B (en) Image extension method based on perception loss and style loss
Chakraborty PRNU-based image manipulation localization with discriminative random fields
CN113344000A (en) Certificate copying and recognizing method and device, computer equipment and storage medium
CN116012291A (en) Industrial part image defect detection method and system, electronic equipment and storage medium
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
CN117036281A (en) Intelligent generation method and system for defect image
Zheng et al. Overwater image dehazing via cycle-consistent generative adversarial network
CN112348762A (en) Single image rain removing method for generating confrontation network based on multi-scale fusion
CN116958880A (en) Video flame foreground segmentation preprocessing method, device, equipment and storage medium
CN116228795A (en) Ultrahigh resolution medical image segmentation method based on weak supervised learning
CN115546828A (en) Method for recognizing cow faces in complex cattle farm environment
CN115937095A (en) Printing defect detection method and system integrating image processing algorithm and deep learning
CN113192018B (en) Water-cooled wall surface defect video identification method based on fast segmentation convolutional neural network
CN115272378A (en) Character image segmentation method based on characteristic contour
Yuan et al. RM-IQA: A new no-reference image quality assessment framework based on range mapping method
Du et al. UIEDP: Underwater Image Enhancement with Diffusion Prior
Wang et al. Color image-spliced localization based on quaternion principal component analysis and quaternion skewness
CN111882495A (en) Image highlight processing method based on user-defined fuzzy logic and GAN

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20211217