CN112598604A - Blind face restoration method and system - Google Patents
Blind face restoration method and system
- Publication number: CN112598604A
- Application number: CN202110241203.XA
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T5/73
- G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/045: Neural networks; combinations of networks
- G06N3/08: Neural networks; learning methods
- G06T5/92
- G06T2207/10004: Still image; photographic image
- G06T2207/20081: Training; learning
- G06T2207/20084: Artificial neural networks [ANN]
- G06T2207/30201: Face
Abstract
The invention discloses a blind face restoration method and system, comprising the following steps: acquiring a blind face data set, evaluating the quality of the blind face data set using the Laplacian gradient, and removing blurred and non-face images; enhancing the image data of the blind face data set and randomly splitting it to obtain a training set and a test set; constructing an AFFNet network; inputting the images of the training set into the AFFNet network, training it by combining a reconstruction loss function, a perceptual loss function, a style loss function and an adversarial loss function, and optimizing it with the stochastic gradient descent (SGD) algorithm to obtain an optimal blind face restoration model; and inputting the images of the test set into the optimal blind face restoration model, then matching and selecting the image with the highest accuracy as the final retrieval result. Through this scheme, the method has the advantages of simple logic, accuracy and reliability, and has high practical and popularization value in the technical field of image processing.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a blind face restoration method and a blind face restoration system.
Background
Blind face restoration as described herein is the restoration of low-quality degraded images (noise, artifacts, blurring, and combinations thereof) into sharp, high-quality images. In recent years, the acquisition and sharing of face images have grown enormously: on the one hand, with the development of image acquisition and display technologies, more and more high-quality (HQ) visual media have come into being; on the other hand, degraded images and video remain ubiquitous owing to the variety of acquisition equipment and the influence of the environment and object motion. Therefore, how to recover a clear, high-quality image from degraded images is a valuable research topic in the field of computer vision.
High-quality face images play a very important role in entertainment, surveillance, human-computer interaction and other applications, so face restoration is an urgent need for a versatile visual system. GFRNet in the prior art is a representative method for face restoration guided by a single exemplar image, but when the pose and expression of the guide image differ from those of the degraded image, the restored sharpness drops noticeably. In addition, GFRNet uses direct concatenation to fuse the degraded and curve features; this fusion is limited to a single degradation state and generalizes poorly to low-quality (LQ) images produced by unknown degradation processes. GFRNet neither reconstructs the fine texture details of the face from the guide image well, nor completely removes the noise and artifacts of the degraded image. Therefore, the single-exemplar guided methods of the prior art perform poorly when restoring LQ face images.
Multi-sample images can greatly improve the capability of image restoration compared with single-sample restoration. For a degraded LQ face image, multiple HQ exemplar images of the same person are likely to be available. For example, face images in a smartphone album are typically grouped by appearance, so a high-quality (HQ) exemplar can readily be found to reference for a low-quality (LQ) image. Therefore, introducing multiple exemplars greatly reduces the difficulty of degradation estimation and image restoration, and provides a new perspective for improving blind face restoration methods.
To solve the above problems, a method based on multi-sample images offers unique advantages for guiding LQ image restoration. At present, the blind face restoration methods of the prior art still have the following problems:
firstly, most existing blind face restoration methods are based on single-sample HQ images, which limits their generalization to unknown degradation processes;
secondly, the prior-art GFRNet uses a curve subnetwork to spatially calibrate the guide and degraded images; however, owing to the lack of direct supervision for the guide image, the curve subnetwork is difficult to train and generalizes poorly;
thirdly, the guide image and the degraded image are usually shot under different illumination conditions, and the background difference is large;
fourthly, cascade-based fusion remains limited in exploiting the complementarity between the guide image and the degraded image.
Therefore, there is an urgent need for a blind face restoration method based on multi-sample images and adaptive spatial feature fusion, with simple logic, accuracy and reliability, to improve the accuracy and generalization capability of blind face restoration.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a blind face restoration method and system, and the technical solution adopted by the present invention is as follows:
a blind face restoration method based on multi-sample image and adaptive spatial feature fusion comprises the following steps:
acquiring a blind face data set, evaluating the quality of the blind face data set using the Laplacian gradient, and removing blurred and non-face images; enhancing the image data of the blind face data set, and randomly splitting it to obtain a training set and a test set;
constructing an AFFNet network;
inputting the images of the training set into the AFFNet network, training the AFFNet network by combining a reconstruction loss function, a perceptual loss function, a style loss function and an adversarial loss function, and training and optimizing the AFFNet network using the stochastic gradient descent (SGD) optimization algorithm to obtain an optimal blind face restoration model;
and inputting the images of the test set into the optimal blind face restoration model, then matching and selecting the image with the highest accuracy as the final retrieval result.
Further, the image data is enhanced for the blind face data set, including random cropping, horizontal flipping and chrominance transformation of the image of the blind face data set.
Further, the expression of the blind face restoration model is as follows:

$$\hat{I} = \mathcal{F}\big(I_d,\ L_d,\ \{I_g^k, L_g^k\}_{k=1}^{N};\ \Theta\big)$$

where $I_d$ represents the degraded face image; $F_d$ represents the features of the degraded image; $L_d = \{l_d^m\}_{m=1}^{M}$ are the key points of the degraded image; $L_g^k$ are the key points of the $k$-th guide image; $M$ is the number of key points ($M = 68$); $k$ indexes the guide images; and $\Theta$ represents the model parameters.
Furthermore, the method also comprises performing degradation-model processing on the blind face data, with the expression:

$$I_d = \mathrm{JPEG}_q\big((I \otimes K)\!\downarrow_s +\, n_\sigma\big)$$

where $\otimes$ represents a convolution operation, $K$ represents a blur kernel, $\downarrow_s$ represents a bicubic down-sampler with scale factor $s$, $n_\sigma$ represents Gaussian noise with noise level $\sigma$, and $\mathrm{JPEG}_q$ represents JPEG compression with quality factor $q$.
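As an illustration, the degradation pipeline can be sketched in a few lines of NumPy. This is a hedged approximation, not the patent's implementation: nearest-neighbour striding stands in for the bicubic down-sampler, and the JPEG compression step is left as an optional callable hook since it requires an image codec.

```python
import numpy as np

def degrade(img, kernel, scale=4, sigma=0.03, jpeg=None, rng=None):
    """Sketch of the degradation model I_d = JPEG_q((I (*) K) downarrow_s + n_sigma).
    `img` is a 2-D grayscale float array in [0, 1]; `kernel` is the blur kernel K."""
    rng = np.random.default_rng(0) if rng is None else rng
    # 1) blur: 'same'-size 2-D convolution of the image with kernel K
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2,), (kw // 2,)), mode="edge")
    blurred = np.zeros_like(img)
    for i in range(kh):
        for j in range(kw):
            blurred += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    # 2) downsample (nearest-neighbour stand-in; the patent specifies bicubic)
    small = blurred[::scale, ::scale]
    # 3) additive Gaussian noise with noise level sigma
    noisy = small + rng.normal(0.0, sigma, small.shape)
    # 4) JPEG_q compression (placeholder hook; needs a real codec in practice)
    return jpeg(noisy) if jpeg is not None else noisy
```

A constant image run through a normalized box kernel with zero noise should pass through unchanged apart from the downsampling.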
Further, the AFFNet network selects an optimal guide image from the blind face data set using a weighted least squares (WLS) model, performs spatial calibration and illumination translation on the guide image in feature space using the moving least squares method and adaptive instance normalization, and fuses the curve features of the guide image with the restoration features of the degraded image using adaptive spatial feature fusion.
Furthermore, the weighted least squares WLS model selects the optimal guide image from the blind face data set using the minimum weighted affine distance:

$$k^* = \arg\min_{k}\ D_a\big(L_d, L_g^k\big), \qquad D_a\big(L_d, L_g^k\big) = \min_{\alpha}\ \sum_{m=1}^{M} w_m \big\| l_d^m - \tilde{l}_g^{k,m}\, \alpha \big\|^2$$

with the closed-form solution $\alpha^* = \big(\tilde{L}_g^{k\top} W \tilde{L}_g^{k}\big)^{-1} \tilde{L}_g^{k\top} W L_d$, where $D_a(L_d, L_g^k)$ represents the affine distance; $w_m$ represents the weight of the $m$-th key point; $l_d^m$ and $l_g^{k,m}$ respectively represent the $m$-th key point of the degraded image and of the $k$-th guide image; $\tilde{l}_g^{k,m}$ is the homogeneous representation of $l_g^{k,m}$; $W = \mathrm{Diag}(w)$ is the diagonal matrix of the key-point weight vector $w$; and $\top$ represents the matrix transpose.
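The weighted affine distance has a closed-form weighted-least-squares solution, which can be sketched as follows. The function names and the use of `np.linalg.solve` are illustrative assumptions, not from the patent:

```python
import numpy as np

def affine_distance(L_d, L_g, w):
    """Weighted affine distance D_a(L_d, L_g): fit the affine map that best
    carries the guide key points L_g (M, 2) onto the degraded key points
    L_d (M, 2), weighted per key point by w (M,), and return the residual."""
    M = L_d.shape[0]
    L_g_h = np.hstack([L_g, np.ones((M, 1))])        # homogeneous coords (M, 3)
    W = np.diag(w)
    # closed form: alpha = (Lg~^T W Lg~)^-1 Lg~^T W L_d
    alpha = np.linalg.solve(L_g_h.T @ W @ L_g_h, L_g_h.T @ W @ L_d)
    residual = L_d - L_g_h @ alpha                   # (M, 2)
    return float(np.sum(w * np.sum(residual ** 2, axis=1)))

def select_guide(L_d, guides, w):
    """Return the index k* of the guide with minimum weighted affine distance."""
    return int(np.argmin([affine_distance(L_d, L_g, w) for L_g in guides]))
```

A guide whose key points are an exact affine transform of the degraded key points (e.g. a pure translation) has distance near zero and is therefore selected.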
Further, the key-point weights are initialized from the degraded face image $I_d$, the optimal guide image $I_g^{k^*}$ is found among the guide images $\{I_g^k\}$ during forward propagation, and the key-point weights are updated using the back-propagation algorithm, the key-point weight loss being:

$$\ell_w = D_a\big(L_d, L_g^{k^*}\big)$$

where $D_a(L_d, L_g^{k^*})$ represents the affine distance of the guide image $I_g^{k^*}$.
Further, the method for performing spatial calibration and illumination translation on the guide image in feature space using the moving least squares method and adaptive instance normalization comprises the following steps:
the affine matrix $M_p$ of the guide image is expressed as:

$$M_p = \big(\tilde{L}_g^{\top} W_p\, \tilde{L}_g\big)^{-1} \tilde{L}_g^{\top} W_p\, L_d$$

where $L_g$ represents the optimal guide image key points; $L_d$ represents the key points of the degraded image; $\tilde{L}_g$ is the homogeneous representation of $L_g$; $p = (x, y)$ are the coordinates in the degraded image; and $W_p$ is the position-dependent key-point weight matrix;
the curve features of the guide image $F_g^{w}$ are obtained through bilinear interpolation:

$$F_g^{w}(x, y) = \sum_{(x', y') \in N(x_g, y_g)} F_g(x', y')\,\big(1 - |x_g - x'|\big)\big(1 - |y_g - y'|\big), \qquad (x_g, y_g) = \tilde{p}\, M_p$$

where $(x, y)$ represents a coordinate of the degraded image; $(x_g, y_g)$ represents the corresponding coordinate of the guide image; $\tilde{p}$ is the homogeneous coordinate of $(x, y)$; $N(x_g, y_g)$ represents the 4 nearest neighbours of $(x_g, y_g)$; and $F_g$ represents the features of the optimal guide image;
and (3) adjusting the curve characteristics of the guide image by using self-adaptive example normalization, wherein the expression is as follows:
wherein the content of the first and second substances,F d andF g w,a curve feature representing a restoration feature of the degraded image and a curve feature of the guide image, respectively;andmean and standard deviation, respectively.
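A minimal sketch of the AIN adjustment, assuming (C, H, W) feature arrays and per-channel statistics (the array layout and epsilon are assumptions for illustration):

```python
import numpy as np

def adaptive_instance_norm(F_g_w, F_d, eps=1e-5):
    """AIN illumination translation sketch: re-normalize the warped guide
    features F_g_w so that their per-channel mean and standard deviation
    match those of the degraded features F_d.  Both arrays are (C, H, W)."""
    mu_g = F_g_w.mean(axis=(1, 2), keepdims=True)
    sd_g = F_g_w.std(axis=(1, 2), keepdims=True)
    mu_d = F_d.mean(axis=(1, 2), keepdims=True)
    sd_d = F_d.std(axis=(1, 2), keepdims=True)
    return sd_d * (F_g_w - mu_g) / (sd_g + eps) + mu_d
```

After the adjustment, the guide features carry the first- and second-order statistics (the "illumination style") of the degraded features.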
Further, the AFFNet network is trained by combining the reconstruction loss function, the perceptual loss function, the style loss function and the adversarial loss function:

$$\ell = \ell_r + \ell_{real}$$

where $\ell_r$ represents the joint loss of the reconstruction loss function and the perceptual loss function, and $\ell_{real}$ represents the photo-realistic loss composed of the style loss and the adversarial loss;
the expression of the joint loss function of the perceptual loss function and the reconstruction loss function is as follows:
wherein the content of the first and second substances, MSE a weight parameter representing a reconstruction loss function, which has a value in the range of 0 to 1, perc and the weight parameter represents a perception loss function and has a value ranging from 0 to 1.
The reconstruction loss function constrains the reconstructed image to be close to the real image, using the mean square error to measure the difference between them:

$$\ell_{MSE} = \frac{1}{CHW} \big\| \hat{I} - I \big\|^2$$

where $\hat{I}$ represents the reconstructed image, $I$ represents the real image, and $C$, $H$ and $W$ respectively represent the channel, height and width of the image;
and (3) adopting a perception loss function to constrain the reconstructed image, wherein the expression is as follows:
wherein the content of the first and second substances,second to represent a pre-trained faceNet modeluLayer characteristics; the above-mentionedu 1,2,3,4];
The photo-realistic loss function $\ell_{real}$ is expressed as:

$$\ell_{real} = \lambda_{styl}\, \ell_{styl} + \lambda_{adv}\, \ell_{adv}$$

where $\lambda_{styl}$ represents the weight of the style loss function, with a value in the range 0 to 1, and $\lambda_{adv}$ represents the weight of the adversarial loss function, with a value in the range 0 to 1.
The expression of the style loss function is:

$$\ell_{styl} = \sum_{u} \frac{1}{C_u H_u W_u} \Big\| \psi_u(\hat{I})^{\top} \psi_u(\hat{I}) - \psi_u(I)^{\top} \psi_u(I) \Big\|^2$$

where $\hat{I}$ represents the reconstructed image; $I$ represents the real image; $C_u$, $H_u$ and $W_u$ respectively represent the channel, height and width of the $u$-th layer feature map; $\psi_u$ represents the $u$-th layer features of the pre-trained FaceNet model, $u \in [1, 2, 3, 4]$; and $\top$ indicates the interchange of the rows and columns of the matrix (i.e. the Gram-matrix computation).
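A sketch of the Gram-matrix style loss follows. Plain arrays stand in for the pre-trained FaceNet features $\psi_u$, and the $1/(C_u H_u W_u)$ normalization is folded into the Gram computation; the function names are illustrative:

```python
import numpy as np

def gram(F):
    """Normalized Gram matrix of a (C, H, W) feature map."""
    C, H, W = F.shape
    flat = F.reshape(C, H * W)
    return flat @ flat.T / (C * H * W)

def style_loss(feats_hat, feats_real):
    """Style loss sketch: squared Frobenius distance between the Gram
    matrices of corresponding feature layers of the reconstructed and
    real images (here, lists of plain arrays replace FaceNet features)."""
    return float(sum(np.sum((gram(a) - gram(b)) ** 2)
                     for a, b in zip(feats_hat, feats_real)))
```

Identical feature stacks give a loss of exactly zero; any difference in channel correlations makes it positive.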
The adversarial loss function trains the discriminator $\ell_{adv,D}$ and the generator $\ell_{adv,G}$ of the AFFNet network, with the expressions:

$$\ell_{adv,D} = -\,\mathbb{E}_{I \sim P(I)}\big[\log D(I)\big] - \mathbb{E}_{\hat{I} \sim P(\hat{I})}\big[\log\big(1 - D(\hat{I})\big)\big]$$

$$\ell_{adv,G} = -\,\mathbb{E}_{\hat{I} \sim P(\hat{I})}\big[\log D(\hat{I})\big]$$

where $I$ and $\hat{I}$ respectively represent the real image and the reconstructed image; $P(I)$ and $P(\hat{I})$ respectively represent the real-image and reconstructed-image distributions; $G$ and $D$ each represent a neural network (the generator and the discriminator); $\mathbb{E}$ represents the expectation (maximum-likelihood estimate over the corresponding distribution); and $D(I)$ and $D(\hat{I})$ represent the discriminator outputs for the real and reconstructed images.
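The two adversarial terms can be sketched directly from the expressions above, assuming the discriminator emits sigmoid probabilities in (0, 1); the small epsilon guarding the logarithms is an implementation assumption:

```python
import numpy as np

def d_loss(d_real, d_fake, eps=1e-12):
    """Discriminator loss sketch: -E[log D(I)] - E[log(1 - D(I_hat))],
    where d_real / d_fake are arrays of sigmoid discriminator outputs."""
    return float(-np.mean(np.log(d_real + eps))
                 - np.mean(np.log(1.0 - d_fake + eps)))

def g_loss(d_fake, eps=1e-12):
    """Generator loss sketch (non-saturating form): -E[log D(I_hat)]."""
    return float(-np.mean(np.log(d_fake + eps)))
```

A discriminator that scores real images high and reconstructions low has a smaller loss than one that outputs 0.5 everywhere, and the generator loss shrinks as the reconstructions fool the discriminator.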
A blind face restoration system comprises: a data preprocessing module for acquiring a blind face data set, evaluating the quality of the blind face data set using the Laplacian gradient, and removing blurred and non-face images; then enhancing the image data of the blind face data set, and randomly splitting it to obtain a training set and a test set;
the feature extraction module is used for extracting high-dimensional image features based on the constructed AFFNet network;
the training module is used for initializing the parameters of the AFFNet network, inputting the images of the training set into the AFFNet network, training the AFFNet network by combining a reconstruction loss function, a perceptual loss function, a style loss function and an adversarial loss function, and training and optimizing the AFFNet network using the stochastic gradient descent (SGD) optimization algorithm to obtain an optimal blind face restoration model;
and the test module is used for inputting the images of the test set into the optimal blind face restoration model, matching and selecting the images to obtain the image with the highest accuracy as the final retrieval result.
Compared with the prior art, the invention has the following beneficial effects:
(1) The method skillfully adopts a weighted least squares (WLS) model, selecting samples with similar pose and expression from the multi-sample HQ images as the optimal guide image; WLS drives the guide selection at the key points, and learning the key-point weights lets the selected guide image reach the highest restoration precision, thereby overcoming the limitation of single-sample-HQ-based blind face restoration methods in generalizing to unknown degradation processes.
(2) The invention introduces the moving least squares method (MLS): guide selection greatly reduces the pose and expression differences, so MLS can calibrate the guide image and the degraded image in feature space, solving the problems that the guide image lacks direct supervision and that a curve subnetwork is difficult to train and generalizes poorly.
(3) The invention adopts adaptive instance normalization (AIN) and uses AIN to perform illumination translation on the guide image, reducing the illumination difference between the guide image and the degraded image.
(4) The invention provides 4 adaptive spatial feature fusion (AFF) blocks, which fuse the curve features of the guide image and the restoration features of the degraded image in an adaptive and progressive manner to build the reconstruction subnetwork of AFFNet, overcoming the limited exploitation of complementarity between the guide image and the degraded image in cascade-based fusion.
(5) The AFFNet of the invention has good generalization capability to complex and unknown degradation processes, and can effectively generate vivid results on LQ images;
(6) The invention skillfully adopts random cropping, horizontal flipping and chrominance transformation (brightness and contrast) to enhance the image data;
in conclusion, the method has the advantages of simple logic, accuracy, reliability and the like, and has high practical value and popularization value in the technical field of image processing.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention, and therefore should not be considered as limiting the scope of protection, and it is obvious for those skilled in the art that other related drawings can be obtained according to these drawings without inventive efforts.
FIG. 1 is a logic flow diagram of the present invention.
FIG. 2 is a schematic diagram of the AFFNet network structure of the present invention.
Detailed Description
To further clarify the objects, technical solutions and advantages of the present application, the present invention will be further described with reference to the accompanying drawings and examples, and embodiments of the present invention include, but are not limited to, the following examples. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Examples
As shown in fig. 1 to fig. 2, the present embodiment provides a blind face restoration method and a system, wherein the system includes a data preprocessing module, a feature extraction module, a training module, and a testing module.
Specifically, as shown in fig. 1, the data preprocessing module S101 collects the blind face data set VGGFace2, evaluates the quality of the data set using Laplacian gradients, removes blurred and non-face images, enhances the image data using random cropping, horizontal flipping and chrominance transformation (luminance and contrast), sets the image size to 256 × 256, then converts the data to corresponding TFRecord-format files, reads the data in a multi-threaded, parallelized manner, and obtains the training and test sets;
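A minimal sketch of the Laplacian-gradient blur screening used by the preprocessing module. The 5-point Laplacian stencil and the threshold value are illustrative assumptions, not parameters from the patent:

```python
import numpy as np

def laplacian_sharpness(img):
    """Blur-screening sketch: variance of the Laplacian response of a 2-D
    grayscale float array.  Low variance means few edges, i.e. a likely
    blurred image."""
    lap = (-4.0 * img[1:-1, 1:-1]
           + img[:-2, 1:-1] + img[2:, 1:-1]     # vertical neighbours
           + img[1:-1, :-2] + img[1:-1, 2:])    # horizontal neighbours
    return float(lap.var())

def keep_image(img, threshold=1e-4):
    """Keep only images whose Laplacian variance exceeds a threshold
    (the threshold here is an illustrative assumption)."""
    return laplacian_sharpness(img) > threshold
```

A textured image scores well above a featureless one, which scores exactly zero.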
the feature extraction module S102 is used for extracting high-dimensional image features through a convolution layer of the network based on the constructed AFFNet network;
the training module S103 is used for initializing the parameters of the AFFNet network structure, inputting the blind face images into the AFFNet network, introducing 4 loss functions (reconstruction, perceptual, style, and adversarial) to train the whole network structure, training and optimizing the AFFNet network with the stochastic gradient descent (SGD) optimization algorithm, and fusing the curve features of the guide image with the restoration features of the degraded image in an adaptive and progressive manner to obtain the optimal blind face restoration model;
and the test module S104 inputs the test images into the optimal blind face restoration model for matching and selects the image with the highest accuracy as the final retrieval result.
The following describes the blind face restoration method and system in detail, focusing on the guide selection, spatial calibration, illumination translation, and adaptive feature fusion modules proposed in this embodiment.
As shown in fig. 2, this embodiment proposes a weighted least squares (WLS) model that selects an optimal guide image from the multi-sample image set; then spatial calibration and illumination translation are performed on the guide image in feature space using the moving least squares (MLS) method and adaptive instance normalization (AIN), to mitigate the pose and expression differences that remain after guide selection. Finally, 4 adaptive feature fusion (AFF) blocks fuse the curve features of the guide image and the restoration features of the degraded image.
The blind face restoration method reconstructs, from a set of guide images $\{I_g^k\}_{k=1}^{N}$ and a degraded face image $I_d$, the corresponding HQ image $\hat{I}$. $I_d$, $I_g^k$ and $\hat{I}$ have the same size 256 × 256; when the image sizes differ, the images are resized to the same size (256 × 256) using bicubic sampling. Each face image yields 68 key points through a face key-point detection method, and therefore the blind face restoration model can be expressed as:

$$\hat{I} = \mathcal{F}\big(I_d,\ L_d,\ \{I_g^k, L_g^k\}_{k=1}^{N};\ \Theta\big)$$

where $I_d$ is the degraded face image; $F_d$ represents the features of the degraded image; $L_d = \{l_d^m\}$ are the key points of the degraded image, $l_d^m \in \mathbb{R}^2$ ($m = 1, \ldots, M = 68$); $L_g^k$ are the key points of the $k$-th guide image; $M$ is the number of key points ($M = 68$); and $\Theta$ represents the model parameters.
For most guided blind face restoration methods, the pose and expression differences between the guide image and the degraded image reduce the accuracy of the restoration. It is therefore preferable to select a guide image whose pose and expression are similar to the degraded image. The method builds a weighted least squares (WLS) model that measures the similarity between key points with a weighted affine distance, and determines the optimal guide image $k^*$ by minimizing it:

$$k^* = \arg\min_{k}\ D_a\big(L_d, L_g^k\big), \qquad D_a\big(L_d, L_g^k\big) = \min_{\alpha}\ \sum_{m=1}^{M} w_m \big\| l_d^m - \tilde{l}_g^{k,m}\, \alpha \big\|^2$$

where $D_a(L_d, L_g^k)$ represents the affine distance; $w_m$ represents the weight of the $m$-th key point; $l_d^m$ and $l_g^{k,m}$ respectively represent the $m$-th key point of the degraded image and of the $k$-th guide image; $\tilde{l}_g^{k,m}$ is the homogeneous representation of $l_g^{k,m}$ (a point with coordinates $[x, y]^{\top}$ accordingly has homogeneous coordinates $[x, y, 1]^{\top}$); and $W = \mathrm{Diag}(w)$ is the diagonal matrix of the key-point weight vector $w$.
In this embodiment, the degraded image $I_d$ is used to initialize the key-point weights, the guide image with the best accuracy, $I_g^{k^*}$, is found among the guide images during forward propagation, and the key-point weights are updated through the back-propagation algorithm so that the selected guide image has a relatively small affine distance. Learning the key-point weights lets the selected guide image reach the highest restoration precision; the key-point weight loss $\ell_w$ can be expressed as:

$$\ell_w = D_a\big(L_d, L_g^{k^*}\big)$$
although the optimal guide image and the degraded image have similar postures and expressions, the error is still large, and the reconstructed image is subjected to artifact. Thus, GFRNet uses a curvilinear sub-network to spatially calibrate the guide image and the degraded image. However, due to the lack of direct monitoring information to guide the image, the curved sub-network is difficult to train and has poor generalization capability. In addition, the guide image and the degraded image are generally taken under different lighting conditions. To solve these problems, the present embodiment employs the MLS method for spatial calibration and the AIN method for illumination translation.
The embodiment introduces a Moving Least Squares (MLS) method to calibrate the guide image and the degraded image in the feature space, rather than learning curve subnets, and the difference of the pose and the expression can be greatly reduced through guide selection. In addition, the MLS calibration is minute, and the feature extraction sub-network of the curve sub-network can perform end-to-end learning in the training process, so that the feature extraction and the MLS can work cooperatively to calibrate the image more accurately.
Specifically, the diagonal matrix $W_p$ has size 68 × 68, its $m$-th diagonal element weighting the $m$-th key point by its proximity to the position $p$ (in the standard moving-least-squares form, e.g. $w_p^m = 1/\|l_d^m - p\|^{2\alpha}$). Thus, the position-specific affine matrix $M_p$ can be expressed as:

$$M_p = \big(\tilde{L}_g^{\top} W_p\, \tilde{L}_g\big)^{-1} \tilde{L}_g^{\top} W_p\, L_d$$

where $L_g$ represents the optimal guide image key points; $L_d$ represents the key points of the degraded image; $\tilde{L}_g$ is the homogeneous representation of $L_g$; and $p = (x, y)$ are the coordinates in the degraded image. The curve subnetwork then obtains the curve features of the guide image $F_g^{w}$ through bilinear interpolation:

$$F_g^{w}(x, y) = \sum_{(x', y') \in N(x_g, y_g)} F_g(x', y')\,\big(1 - |x_g - x'|\big)\big(1 - |y_g - y'|\big), \qquad (x_g, y_g) = \tilde{p}\, M_p$$

where $(x, y)$ is a coordinate of the degraded image; $(x_g, y_g)$ is the corresponding coordinate of the guide image; $\tilde{p}$ is the homogeneous coordinate of $(x, y)$; $N(x_g, y_g)$ is the set of 4 nearest neighbours of $(x_g, y_g)$; and $F_g$ represents the features of the optimal guide image. The curve features are differentiable, so feature extraction can also be learned end to end during training.
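The 4-nearest-neighbour bilinear interpolation can be sketched per coordinate as follows; a single-channel (H, W) feature map is assumed for brevity, and the boundary clamping is an implementation assumption:

```python
import numpy as np

def bilinear_sample(F, xg, yg):
    """Bilinear interpolation sketch: sample feature map F (H, W) at the
    real-valued warped coordinate (xg, yg), blending its 4 nearest
    integer-grid neighbours with (1 - |dx|)(1 - |dy|) weights."""
    x0, y0 = int(np.floor(xg)), int(np.floor(yg))
    x1 = min(x0 + 1, F.shape[1] - 1)             # clamp at the border
    y1 = min(y0 + 1, F.shape[0] - 1)
    dx, dy = xg - x0, yg - y0
    return ((1 - dy) * (1 - dx) * F[y0, x0] + (1 - dy) * dx * F[y0, x1]
            + dy * (1 - dx) * F[y1, x0] + dy * dx * F[y1, x1])
```

Sampling at an integer coordinate returns the stored value exactly; sampling halfway between two grid points returns their average.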
In the present embodiment, adaptive instance normalization (AIN) transfers features into a required style. The invention treats illumination as a style and uses AIN to adjust the curve features of the guide image so that they have illumination similar to the restoration features of the degraded image; the adjusted curve features $F_g^{w,a}$ can be expressed as:

$$F_g^{w,a} = \sigma(F_d)\,\frac{F_g^{w} - \mu(F_g^{w})}{\sigma(F_g^{w})} + \mu(F_d)$$

where $F_d$ and $F_g^{w}$ respectively represent the restoration features of the degraded image and the curve features of the guide image, and $\mu(\cdot)$ and $\sigma(\cdot)$ respectively represent the mean and standard deviation.
GFRNet employs concatenation-based fusion carried out at multiple feature layers. However, concatenation-based fusion remains limited in exploiting the complementarity between the guide image and the degraded image. This embodiment therefore proposes 4 AFF blocks that adaptively and progressively fuse the warped features of the guide image with the restoration features of the degraded image, from which the AFFNet subnetwork is built. The AFFNet subnetwork consists of two shuffle (upsampling) layers, each followed by two residual blocks.
In this embodiment, on the one hand, the guide image typically contains more high-quality facial details; on the other hand, spatially transferring the complementarity between F_g^{w,a} and F_d allows the HQ image to be reconstructed better. Therefore, the facial keypoint features F_l are first obtained from the face image by a keypoint detection algorithm; then F_g^{w,a}, F_d and F_l are taken as input features, and a control module generates an attention mask F_m. Guided by F_m, F_g^{w,a} and F_d are fused, and the fused features pass through the 4 AFF blocks to obtain the combined feature F_c.
Compared with the concatenation-based fusion of GFRNet, AFF is a more flexible fusion method that can adapt to different degraded and guide images. Owing to its adaptive and progressive fusion, AFFNet generalizes well to LQ face images produced by complex and unknown degradation processes.
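The progressive, mask-guided blending through the AFF blocks can be illustrated roughly as below. The real AFF blocks are convolutional; this sketch reduces each block to a per-location soft blend under an externally supplied mask, which is an assumption made for illustration:

```python
import numpy as np

def affnet_fuse(F_gwa, F_d, masks):
    """Progressive fusion through len(masks) AFF-like steps: each step blends
    the illumination-translated guide feature F_gwa into the running estimate
    F_c, weighted per location by an attention mask in [0, 1]."""
    F_c = F_d
    for F_m in masks:
        # F_m -> 1 trusts the guide detail; F_m -> 0 keeps the degraded feature
        F_c = F_m * F_gwa + (1.0 - F_m) * F_c
    return F_c
```

With an all-ones mask the output collapses to the guide feature, and with an all-zeros mask to the degraded feature; intermediate masks realize the adaptive compromise the embodiment describes.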
In this embodiment, 4 loss functions (reconstruction, perceptual, style and adversarial) are introduced to train the whole network structure, as follows:
(1) The reconstruction loss function constrains the reconstructed image to be closer to the real image, measured by the mean squared error between \hat{I} and I; the mean squared error \ell_{MSE} can be expressed as:

\ell_{MSE} = \frac{1}{CHW} \left\| \hat{I} - I \right\|_2^2
where \hat{I} and I denote the reconstructed image and the real image, respectively; C, H and W denote the channel, height and width of the image.
(2) The perceptual loss function constrains the reconstructed image in feature space, improving its visual quality and bringing it closer to the real image:

\ell_{perc} = \sum_{u} \frac{1}{C_u H_u W_u} \left\| \psi_u(\hat{I}) - \psi_u(I) \right\|_2^2
where \psi_u denotes the u-th layer features of a pretrained face-recognition network (FaceNet), u \in [1, 2, 3, 4]. The total reconstruction loss L_{rec} can be expressed as:

L_{rec} = \lambda_{MSE}\, \ell_{MSE} + \lambda_{perc}\, \ell_{perc}
where \lambda_{MSE} and \lambda_{perc} are weight parameters; \lambda_{MSE} takes values in the range 0 to 1, and \lambda_{perc} takes values in the range 0 to 1.
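The weighted reconstruction objective above can be sketched numerically. This is an illustrative NumPy stand-in: `feats` here is a placeholder for the pretrained FaceNet feature extractor \psi, and the default weights echo the experiment section, both assumptions of the sketch:

```python
import numpy as np

def mse_loss(I_hat, I):
    """Pixel reconstruction term: mean squared error over C*H*W."""
    return float(np.mean((I_hat - I) ** 2))

def rec_loss(I_hat, I, feats, lam_mse=300.0, lam_perc=5.0):
    """L_rec = lam_mse * l_MSE + lam_perc * sum_u mean((psi_u(I_hat)-psi_u(I))^2).
    `feats(image)` must return a list of per-layer feature maps (stand-in
    for the pretrained FaceNet extractor psi_u, u = 1..4)."""
    l_perc = sum(np.mean((fu_hat - fu) ** 2)
                 for fu_hat, fu in zip(feats(I_hat), feats(I)))
    return lam_mse * mse_loss(I_hat, I) + lam_perc * float(l_perc)
```

Both terms vanish exactly when the reconstruction equals the real image, which is the constraint the embodiment imposes.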
(3) The style loss function helps generate accurate visual textures; the style loss \ell_{style} can be expressed as:

\ell_{style} = \sum_{u} \left\| G\big(\psi_u(\hat{I})\big) - G\big(\psi_u(I)\big) \right\|_2^2, \qquad G(F) = \frac{1}{C_u H_u W_u} F F^{\top}

where G(\cdot) denotes the Gram matrix of the layer features.
(4) The adversarial loss is an effective means of improving visual quality and is widely used in image generation tasks. The invention introduces spectral normalization on the weights of each convolutional layer and trains the discriminator with loss \ell_{adv,D} and the generator with loss \ell_{adv,G}, with the formulas:

\ell_{adv,D} = \mathbb{E}_{I \sim P(I)}\big[\max(0,\, 1 - D(I))\big] + \mathbb{E}_{\hat{I} \sim P(\hat{I})}\big[\max(0,\, 1 + D(\hat{I}))\big], \qquad \ell_{adv,G} = -\,\mathbb{E}_{\hat{I} \sim P(\hat{I})}\big[D(\hat{I})\big]
where \ell_{adv,D} is used to update the discriminator, while \ell_{adv,G} is used to update AFFNet; I and \hat{I} respectively represent the real image and the reconstructed image; P(I) and P(\hat{I}) respectively represent the real image distribution and the reconstructed image distribution; G and D each denote a neural network; \mathbb{E} denotes the expectation, \mathbb{E}_{I \sim P(I)} being taken over real images and \mathbb{E}_{\hat{I} \sim P(\hat{I})} over reconstructed images; D(I) and D(\hat{I}) denote the discriminator outputs for the real and reconstructed images.
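The adversarial terms can be sketched as below. The hinge formulation is an assumption here (it is the form commonly paired with spectral normalization); the patent text itself does not spell the formulas out, and `d_real`/`d_fake` stand for discriminator scores on real and reconstructed batches:

```python
import numpy as np

def d_hinge_loss(d_real, d_fake):
    """Discriminator hinge loss: push D's score above +1 on real images
    and below -1 on reconstructions."""
    return float(np.mean(np.maximum(0.0, 1.0 - d_real))
                 + np.mean(np.maximum(0.0, 1.0 + d_fake)))

def g_adv_loss(d_fake):
    """Generator (AFFNet) adversarial term: raise D's score on reconstructions."""
    return float(-np.mean(d_fake))
```

The discriminator term saturates to zero once real and fake scores are well separated, while the generator term keeps a non-vanishing gradient, one reason this pairing trains stably.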
Therefore, the overall perceptual-quality loss L_{real} can be expressed as:

L_{real} = \lambda_{style}\, \ell_{style} + \lambda_{adv}\, \ell_{adv,G}
where \lambda_{style} and \lambda_{adv} are weight parameters; \lambda_{style} takes values in the range 0 to 1, and \lambda_{adv} takes values in the range 0 to 1.
In this embodiment, the overall objective function L for blind face restoration is defined as:

L = L_{rec} + L_{real}
in addition, the degradation model of the present embodiment can be expressed as:
where \otimes represents the convolution operation; k represents a blur kernel; \downarrow_s represents a bicubic downsampler; n_\sigma represents Gaussian noise with noise level \sigma; \mathrm{JPEG}_q represents JPEG compression with quality factor q. This degradation model generates realistic LQ images, thereby supporting the highest restoration accuracy.
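The blur–downsample–noise–compress pipeline can be sketched in NumPy as follows. This is illustrative only: nearest-neighbor striding stands in for the bicubic downsampler, and JPEG compression is left as an optional callable, both assumptions of the sketch:

```python
import numpy as np

def degrade(I, k, scale, sigma, jpeg=None, rng=None):
    """Synthesize an LQ training image: blur with kernel k (odd-sized),
    downsample by `scale` (stand-in for bicubic), add Gaussian noise of
    level sigma, then optionally JPEG-compress via the `jpeg` callable.
    I: (H, W) float image; k: 2-D blur kernel summing to 1."""
    rng = rng or np.random.default_rng(0)
    kh, kw = k.shape
    pad = np.pad(I, ((kh // 2,), (kw // 2,)), mode="edge")
    blurred = np.zeros_like(I)
    for dy in range(kh):                     # direct 2-D convolution
        for dx in range(kw):
            blurred += k[dy, dx] * pad[dy:dy + I.shape[0], dx:dx + I.shape[1]]
    down = blurred[::scale, ::scale]         # stand-in for bicubic downsampling
    noisy = down + rng.normal(0.0, sigma, down.shape)
    return jpeg(noisy) if jpeg else noisy
```

Sampling k, scale, sigma and q at random per training image is what lets the model cover the complex, unknown degradations mentioned above.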
All experiments were run on an NVIDIA platform using Python 3.7, with the blind face data set VGGFace2. The VGGFace2 data set contains 160,000 sets of face images, split into 100,000 training sets and 60,000 test sets, each set containing 3-10 HQ sample images; the poses and expressions of the training and test sets do not overlap. The experiments trained AFFNet with an SGD optimizer with batch size 8, momentum parameters \beta_1 = 0.5 and \beta_2 = 0.999, an initial learning rate of 0.0002, and loss-term weight parameters \lambda_{MSE} = 300, \lambda_{perc} = 5, \lambda_{style} = 1 and \lambda_{adv} = 2. Peak signal-to-noise ratio (PSNR), structural similarity (SSIM) and LPIPS were used to quantify model accuracy.
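Of the three metrics, PSNR has the simplest closed form and can be sketched directly (illustrative; assumes images scaled to a known peak value):

```python
import numpy as np

def psnr(I_hat, I, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with values in [0, peak]."""
    mse = np.mean((I_hat - I) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))
```

Higher PSNR means lower pixel-wise error; SSIM and LPIPS complement it by measuring structural and perceptual similarity, respectively.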
TABLE 1
Experiments compared 4 variants of AFFNet to verify the effectiveness of adaptive feature fusion: 1-Concat fuses with 1 concatenation block, 4-Concat fuses with 4 concatenation blocks, and w/o 1-Atten and w/o 4-Atten remove the attention mask from 1 and 4 AFF blocks, respectively. The results are shown in Table 1: AFFNet outperforms the Concat and w/o Atten variants on the PSNR and SSIM indexes, demonstrating the effectiveness of adaptive spatial feature fusion. To verify the effectiveness of progressive fusion, AFFNet models were built with different numbers of AFF blocks (1-AFF, 2-AFF, 4-AFF and 8-AFF). Owing to progressive fusion, stacking more AFF blocks yields better accuracy, but PSNR and SSIM begin to saturate beyond 4 blocks; 4-AFF is therefore taken as the optimal AFFNet model. In addition, three further AFFNet variants were considered: w/o AIN removes the AdaIN module, w/o MLS removes the MLS module, and Untrained F_g extracts the feature F_g with a FaceNet-initialized subnetwork that is not trained further. AFFNet is the most accurate in Table 1, indicating that the differentiability of MLS makes F_g learnable and benefits the spatial alignment between the degraded image and the selected guide image. Moreover, AdaIN-based illumination translation and adaptive fusion generate realistic results on real LQ images, improving both the restoration accuracy and the generalization ability of AFFNet.
The above-mentioned embodiments are only preferred embodiments of the present invention, and do not limit the scope of the present invention, but all the modifications made by the principles of the present invention and the non-inventive efforts based on the above-mentioned embodiments shall fall within the scope of the present invention.
Claims (10)
1. A blind face restoration method, comprising the steps of:
acquiring a blind face data set, evaluating the quality of the blind face data set by using a Laplacian gradient, and removing blurred and non-human face images; enhancing image data of the blind face data set, and randomly distributing to obtain a training set and a test set;
constructing an AFFNet network;
inputting images of the training set into the AFFNet network, training the AFFNet network by combining a reconstruction loss function, a perceptual loss function, a style loss function and an adversarial loss function, and optimizing the AFFNet network with a stochastic gradient descent (SGD) optimization algorithm to obtain an optimal blind face restoration model;
and inputting the images of the test set into the optimal blind face restoration model, and selecting by matching the image with the highest accuracy as the final restoration result.
2. A blind face restoration method according to claim 1, wherein the enhancing image data of the blind face data set comprises randomly cropping, horizontally flipping and chroma transforming the image of the blind face data set.
3. The blind face restoration method according to claim 1, wherein the expression of the blind face restoration model is:
where I_d represents a degraded face image; F_d represents the features of the degraded image; L_d represents the keypoints of the degraded image; L_g^k represents the keypoints of the k-th guide image; m indexes the keypoints; k \in [0, K], with K representing the number of guide images; and \theta represents the model parameters.
4. The blind face restoration method according to claim 3, further comprising performing degradation model processing on the blind face data, with the expression:

I_d = \mathrm{JPEG}_q\Big( \big( I \otimes K \big) \downarrow_s +\, n_\sigma \Big)
where \otimes represents a convolution operation, K represents a blur kernel, \downarrow_s represents a bicubic downsampler, n_\sigma represents Gaussian noise with noise level \sigma, and \mathrm{JPEG}_q represents JPEG compression with quality factor q.
5. The blind face restoration method according to claim 4, wherein the AFFNet network adopts a weighted least squares (WLS) model to select an optimal guide image from the blind face data set, performs spatial alignment and illumination translation on the guide image in the feature space using the moving least squares method and adaptive instance normalization, and fuses the warped features of the guide image with the restoration features of the degraded image by adaptive spatial feature fusion.
6. The blind face restoration method according to claim 5, wherein the weighted least squares (WLS) model selects the optimal guide image from the blind face data set by minimizing a weighted affine distance, with the expression:

D_a\big(L_d, L_g^k\big) = \left\| W^{1/2} \left( L_d - \hat{L}_g^k M_k \right) \right\|_2^2, \qquad M_k = \left( \hat{L}_g^{k\top} W \hat{L}_g^k \right)^{-1} \hat{L}_g^{k\top} W L_d
where D_a(L_d, L_g^k) represents the affine distance; w_m represents the weight of the m-th keypoint; L_d^m and L_g^{k,m} respectively represent the m-th keypoint of the degraded image and the m-th keypoint of the k-th guide image; \hat{L}_g^k is the homogeneous representation of L_g^k; W represents the diagonal matrix of the keypoint weight vector w; and \top denotes the row-column interchange (transpose) of a matrix.
7. The blind face restoration method according to claim 6, wherein the keypoint weights are initialized from the degraded face image, the optimal guide image is searched for during forward propagation, and the keypoint weights are updated with a back-propagation algorithm, the keypoint weight expression being:
8. The blind face restoration method according to claim 7, wherein the spatial alignment and illumination translation of the guide image in the feature space using the moving least squares method and adaptive instance normalization comprise the following steps:
the affine matrix M_p of the guide image is expressed as:

M_p = \left( \hat{L}_d^{\top} W_p \hat{L}_d \right)^{-1} \hat{L}_d^{\top} W_p L_g
where L_g represents the optimal guide image keypoints; L_d represents the keypoints of the degraded image; \hat{L}_d is the homogeneous representation of L_d; p is a coordinate of the degraded image, p = (x, y);
obtaining the warped features F_g^w of the guide image by bilinear interpolation, with the expression:

(x', y') = \hat{p}\, M_p, \qquad F_g^w(x, y) = \sum_{(x_i, y_i) \in N(x', y')} F_g(x_i, y_i)\, (1 - |x' - x_i|)(1 - |y' - y_i|)
where (x, y) represents a coordinate of the degraded image; (x', y') represents the corresponding coordinate of the guide image; \hat{p} is the homogeneous coordinate of (x, y); N(x', y') represents the 4 nearest neighbors of (x', y'); F_g represents the features of the optimal guide image;
and adjusting the warped features of the guide image using adaptive instance normalization, with the expression:

F_g^{w,a} = \sigma(F_d)\, \frac{F_g^w - \mu(F_g^w)}{\sigma(F_g^w)} + \mu(F_d).
9. The blind face restoration method according to claim 1, wherein the joint reconstruction loss function, perceptual loss function, style loss function and adversarial loss function train the AFFNet network, with the expression:

L = L_{rec} + L_{real}
where L_{rec} represents the joint loss function of the perceptual loss function and the reconstruction loss function, and L_{real} represents the perceptual-quality loss function;
the expression of the joint loss function of the perceptual loss function and the reconstruction loss function is:

L_{rec} = \lambda_{MSE}\, \ell_{MSE} + \lambda_{perc}\, \ell_{perc}
where \lambda_{MSE} represents the weight parameter of the reconstruction loss function, with value in the range 0 to 1, and \lambda_{perc} represents the weight parameter of the perceptual loss function, with value in the range 0 to 1;
the reconstruction loss function constrains the reconstructed image so as to obtain a reconstructed image close to the real image, with the mean squared error measuring the difference between the reconstructed image and the real image, with the expression:

\ell_{MSE} = \frac{1}{CHW} \left\| \hat{I} - I \right\|_2^2
where \hat{I} represents the reconstructed image, I represents the real image, and C, H and W represent the channel, height and width of the image, respectively;
the perceptual loss function constrains the reconstructed image, with the expression:

\ell_{perc} = \sum_{u} \frac{1}{C_u H_u W_u} \left\| \psi_u(\hat{I}) - \psi_u(I) \right\|_2^2
where \psi_u represents the u-th layer features of a pretrained FaceNet model, u \in [1, 2, 3, 4];
the perceptual-quality loss function L_{real} is expressed as:

L_{real} = \lambda_{style}\, \ell_{style} + \lambda_{adv}\, \ell_{adv,G}
where \lambda_{style} represents the weight parameter of the style loss function, with value in the range 0 to 1, and \lambda_{adv} represents the weight parameter of the adversarial loss function, with value in the range 0 to 1;
the expression of the style loss function is:

\ell_{style} = \sum_{u} \left\| G\big(\psi_u(\hat{I})\big) - G\big(\psi_u(I)\big) \right\|_2^2, \qquad G(F) = \frac{1}{C_u H_u W_u} F F^{\top}
where \hat{I} represents the reconstructed image, I represents the real image, C, H and W represent the channel, height and width of the image respectively, \psi_u represents the u-th layer features of the pretrained FaceNet model, u \in [1, 2, 3, 4], and \top represents the row-column interchange (transpose) of a matrix;
the adversarial loss function trains the discriminator of the AFFNet network with \ell_{adv,D} and the generator with \ell_{adv,G}, with the expressions:

\ell_{adv,D} = \mathbb{E}_{I \sim P(I)}\big[\max(0,\, 1 - D(I))\big] + \mathbb{E}_{\hat{I} \sim P(\hat{I})}\big[\max(0,\, 1 + D(\hat{I}))\big], \qquad \ell_{adv,G} = -\,\mathbb{E}_{\hat{I} \sim P(\hat{I})}\big[D(\hat{I})\big]
where I and \hat{I} represent the real image and the reconstructed image respectively; P(I) and P(\hat{I}) represent the real image distribution and the reconstructed image distribution respectively; G and D each represent a neural network; \mathbb{E} represents the expectation, taken over I \sim P(I) for real images and over \hat{I} \sim P(\hat{I}) for reconstructed images; D(I) and D(\hat{I}) represent the discriminator outputs for the real and reconstructed images.
10. A system for using the blind face restoration method according to any one of claims 1 to 9, comprising:
the data preprocessing module is used for acquiring a blind face data set, evaluating the quality of the blind face data set by utilizing a Laplacian gradient and removing blurred and non-face images; enhancing image data of the blind face data set, and randomly distributing to obtain a training set and a test set;
the feature extraction module is used for extracting high-dimensional image features based on the constructed AFFNet network;
the training module is used for initializing parameters of the AFFNet network, inputting images of the training set into the AFFNet network, training the AFFNet network by combining a reconstruction loss function, a perceptual loss function, a style loss function and an adversarial loss function, and optimizing the AFFNet network with a stochastic gradient descent (SGD) optimization algorithm to obtain an optimal blind face restoration model;
and the test module is used for inputting the images of the test set into the optimal blind face restoration model, and selecting by matching the image with the highest accuracy as the final restoration result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110241203.XA CN112598604A (en) | 2021-03-04 | 2021-03-04 | Blind face restoration method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112598604A true CN112598604A (en) | 2021-04-02 |
Family
ID=75210161
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110241203.XA Pending CN112598604A (en) | 2021-03-04 | 2021-03-04 | Blind face restoration method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112598604A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111898699A (en) * | 2020-08-11 | 2020-11-06 | 海之韵(苏州)科技有限公司 | Automatic detection and identification method for hull target |
CN112131121A (en) * | 2020-09-27 | 2020-12-25 | 腾讯科技(深圳)有限公司 | Fuzzy detection method and device for user interface, electronic equipment and storage medium |
CN112149591A (en) * | 2020-09-28 | 2020-12-29 | 长沙理工大学 | SSD-AEFF automatic bridge detection method and system for SAR image |
Non-Patent Citations (1)
Title |
---|
XIAOMING LI et al.: "Enhanced Blind Face Restoration with Multi-Exemplar Images and Adaptive Spatial Feature Fusion", 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113222910A (en) * | 2021-04-25 | 2021-08-06 | 南京邮电大学 | Method and device for extracting characteristic points of X-ray head shadow measurement image based on perception loss |
CN113222910B (en) * | 2021-04-25 | 2022-11-01 | 南京邮电大学 | Method and device for extracting characteristic points of X-ray head shadow measurement image based on perception loss |
CN113239866A (en) * | 2021-05-31 | 2021-08-10 | 西安电子科技大学 | Face recognition method and system based on space-time feature fusion and sample attention enhancement |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111047516B (en) | Image processing method, image processing device, computer equipment and storage medium | |
CN110136062B (en) | Super-resolution reconstruction method combining semantic segmentation | |
Yu et al. | A unified learning framework for single image super-resolution | |
Cai et al. | FCSR-GAN: Joint face completion and super-resolution via multi-task learning | |
WO2021022929A1 (en) | Single-frame image super-resolution reconstruction method | |
Cheng et al. | Zero-shot image super-resolution with depth guided internal degradation learning | |
CN112541864A (en) | Image restoration method based on multi-scale generation type confrontation network model | |
CN111626927B (en) | Binocular image super-resolution method, system and device adopting parallax constraint | |
CN112288627A (en) | Recognition-oriented low-resolution face image super-resolution method | |
CN113538246B (en) | Remote sensing image super-resolution reconstruction method based on unsupervised multi-stage fusion network | |
Guan et al. | Srdgan: learning the noise prior for super resolution with dual generative adversarial networks | |
CN112598604A (en) | Blind face restoration method and system | |
Zheng et al. | T-net: Deep stacked scale-iteration network for image dehazing | |
Hu et al. | Meta-USR: A unified super-resolution network for multiple degradation parameters | |
Muqeet et al. | Hybrid residual attention network for single image super resolution | |
CN115526777A (en) | Blind over-separation network establishing method, blind over-separation method and storage medium | |
CN116934592A (en) | Image stitching method, system, equipment and medium based on deep learning | |
CN115526779A (en) | Infrared image super-resolution reconstruction method based on dynamic attention mechanism | |
Xia et al. | Meta-learning based degradation representation for blind super-resolution | |
CN113034388B (en) | Ancient painting virtual repair method and construction method of repair model | |
CN114663880A (en) | Three-dimensional target detection method based on multi-level cross-modal self-attention mechanism | |
CN109615576B (en) | Single-frame image super-resolution reconstruction method based on cascade regression basis learning | |
CN114359041A (en) | Light field image space super-resolution reconstruction method | |
Liang et al. | Image deblurring by exploring in-depth properties of transformer | |
CN112200752A (en) | Multi-frame image deblurring system and method based on ER network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20210402 |