CN111260577B - Face image restoration system based on multi-guide image and self-adaptive feature fusion - Google Patents


Publication number
CN111260577B
Authority
CN
China
Prior art keywords
convolution
layer
hole
output
image
Prior art date
Legal status
Active
Application number
CN202010039493.5A
Other languages
Chinese (zh)
Other versions
CN111260577A (en)
Inventor
左旺孟
李晓明
李文瑜
张宏志
Current Assignee
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN202010039493.5A
Publication of CN111260577A
Application granted
Publication of CN111260577B

Classifications

    • G06T 5/00 — Image enhancement or restoration (G: Physics; G06: Computing, calculating or counting; G06T: Image data processing or generation, in general)
    • G06V 40/161 — Human faces: detection, localisation, normalisation (G06V: Image or video recognition or understanding; G06V 40/00: Recognition of biometric, human-related or animal-related patterns; G06V 40/16: Human faces, e.g. facial parts, sketches or expressions)
    • G06V 40/168 — Human faces: feature extraction, face representation
    • G06T 2207/20081 — Training; learning (G06T 2207/00: Indexing scheme for image analysis or image enhancement; G06T 2207/20: Special algorithmic details)
    • G06T 2207/20084 — Artificial neural networks [ANN]
    • G06T 2207/20221 — Image fusion; image merging (G06T 2207/20212: Image combination)
    • G06T 2207/30201 — Face (G06T 2207/30: Subject of image; G06T 2207/30196: Human being; person)
    • Y02T 10/40 — Engine management systems (Y02T: Climate change mitigation technologies related to transportation; Y02T 10/10: Internal combustion engine [ICE] based vehicles)

Abstract

The invention discloses a face image restoration system based on multi-guide-map and self-adaptive feature fusion, relates to the technical field of image restoration processing, and aims to solve the problem that real low-quality images cannot be effectively restored by the prior art.

Description

Face image restoration system based on multi-guide image and self-adaptive feature fusion
Technical Field
The invention relates to the technical field of image restoration processing, in particular to a human face image restoration system based on multi-guide image and self-adaptive feature fusion.
Background
Face image restoration aims to recover a high-quality face image from a low-quality one (blur, low resolution, heavy compression, severe noise, and other common degradations). Low-quality face images commonly arise from aged photographs, limited capture equipment, and distortion introduced during storage. With the development of technology, multimedia content of high visual quality, such as 2K and 4K video, is increasingly sought after. How to restore a low-quality face image into a high-quality image is therefore a hot research topic.
In recent years, with the further development of deep learning, image restoration has made breakthrough progress. Convolutional neural networks have been successfully applied to individual restoration tasks such as super-resolution, denoising and deblurring. However, because the degradation type of a real low-quality image is unknown, existing methods cannot effectively restore and enhance such images, and often fail to achieve a satisfactory enhancement effect.
Existing face restoration work based on a frontal guide image takes only a single frontal image per person as guidance, which is not suitable for real face restoration. Owing to the diversity of human faces in pose, expression and illumination, a single guide image cannot cover a real face restoration scene.
Furthermore, simply concatenating the guide-map features with the degraded-map features cannot exploit the guide map effectively: the guidance features must be supplied adaptively according to the degree of degradation of the degraded image.
Disclosure of Invention
The purpose of the invention is: aiming at the problem that a real low-quality image cannot be effectively restored in the prior art, a face image restoration system based on multi-guide image and self-adaptive feature fusion is provided.
The technical scheme adopted by the invention to solve the technical problems is as follows:
the human face image restoration system based on multi-guide map and self-adaptive feature fusion comprises an optimal guide map selection module, an optimal guide map feature extraction module, a degradation map feature extraction module, a degraded map face key point feature extraction module, an optimal guide map feature posture correction module, an illumination distribution correction module, a step-by-step self-adaptive feature fusion module and a restoration result reconstruction module,
the optimal guide map selection module selects an optimal guide map with the most similar expression and posture to the degradation map from the plurality of guide maps by calculating the optimal weighted affine transformation distance of the key points of the human face between the degradation map and the guide map;
the optimal guide map feature extraction module is used for extracting features of an optimal guide map;
the degradation map feature extraction module is used for extracting features of a degradation map;
the degradation image face key point feature extraction module is used for obtaining face key point features according to key points of a degradation image, wherein the key points of the degradation image are obtained by detecting the key points of the degradation image through a face key point detection algorithm;
the optimal guide map feature posture correction module obtains a deformation vector by calculating the face key point features and the moving least squares between the optimal guide map key points, deforms the optimal guide map to the degraded map posture and expression, and obtains a deformed optimal guide map, wherein the optimal guide map key points are obtained by detecting the key points of the optimal guide map through a face key point detection algorithm;
the illumination distribution correction module is used for carrying out self-adaptive example normalization operation on the characteristics of the deformed optimal guide image and the characteristics of the degradation image to obtain the characteristics of the final guide image;
the step-by-step self-adaptive feature fusion module is used for dynamically and self-adaptively adding the features of the final guide map into the features of the degradation map step by step to obtain the enhanced features of the degradation map;
and the restoration result reconstruction module is used for outputting the characteristics of the enhanced degradation image to restore a human face image through a multilayer neural network.
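To make the data flow between these seven modules concrete, the following is a minimal sketch in PyTorch-style Python; every name here (the modules dict, landmarks_fn, and the module callables) is a hypothetical stand-in for illustration, not an implementation from the patent.

```python
# Illustrative wiring of the modules described above; all names are hypothetical.
def restore(degraded, guides, landmarks_fn, modules):
    # 1. Select the guide whose pose/expression best matches the degraded face
    best_guide = modules["selector"](degraded, guides)
    # 2-4. Extract features of the guide, the degraded image, and its key points
    f_guide = modules["guide_encoder"](best_guide)
    f_deg = modules["deg_encoder"](degraded)
    f_lm = modules["landmark_encoder"](landmarks_fn(degraded))
    # 5. Warp guide features to the degraded pose (moving least squares)
    f_guide = modules["pose_corrector"](
        f_guide, landmarks_fn(best_guide), landmarks_fn(degraded))
    # 6. Match illumination statistics (adaptive instance normalization)
    f_guide = modules["illum_corrector"](f_guide, f_deg)
    # 7. Progressive adaptive fusion, then reconstruction of the restored face
    f_fused = modules["fusion"](f_deg, f_guide, f_lm)
    return modules["reconstructor"](f_fused)
```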
Further, the optimal guide map feature extraction module includes M1 hole convolution residual units, the degradation map feature extraction module includes M2 hole convolution residual units, the degraded map face key point feature extraction module includes M3 bias-free convolution layers, the step-by-step adaptive feature fusion module includes M4 adaptive feature fusion units, and the multi-layer neural network in the restoration result reconstruction module consists of M5 hole convolution residual units, wherein M1, M2, M3, M4 and M5 are all greater than or equal to 1.
Furthermore, the guide images comprise N high-definition face images of the same identity with different postures, expressions and illumination, wherein N is greater than or equal to 1.
Further, the optimal guidance map selection module executes the following steps:
firstly, an optimized weighted least-squares module based on face key point similarity is adopted, which executes the following steps: the guide map whose pose and expression are most similar to those of the degradation map is to be selected from the N guide maps. A face key point detection algorithm first detects P key points for the degradation map and for each guide map, where L^p = (x^p, y^p) denotes the horizontal and vertical coordinates of the p-th face key point. The key points of the degradation map are denoted L_d and, for all guide maps corresponding to the degradation map, the key points of the m-th guide map are denoted L_g^m, with augmented form \tilde{L}_g^{m,p} = (x^p, y^p, 1)^\top;
the weighted least squares objective is defined as:

k^* = \arg\min_m \min_{A_m} \sum_{p=1}^{P} w^p \| L_d^p - A_m \tilde{L}_g^{m,p} \|^2

wherein k^* is the sequence number of the optimal guide map selected from the N guide maps; A_m is an affine transformation matrix; \tilde{L}_g^m is the augmented matrix of L_g^m; p indexes the p-th key point; w^p is the weight of the p-th key point; and the minimized quantity is the affine transformation distance between the two sets of key points;
given a degradation map and a guide map, the inner minimization has the closed-form solution:

A_m = L_d W \tilde{L}_g^{m\top} ( \tilde{L}_g^m W \tilde{L}_g^{m\top} )^{-1}

where W = Diag(w) is the diagonal matrix of the key point weights w;
the weights w are updated with the network gradient back-propagation algorithm:

w \leftarrow w - \eta \, \partial \ell / \partial w

and deriving the loss with respect to w and updating the whole weight vector yields the optimal guide map.
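The selection step above reduces, for each guide map, to a weighted least-squares fit of an affine transform between landmark sets. The sketch below illustrates it with NumPy, assuming fixed key point weights w (in the patent the weights are learned by back-propagation); the function names are illustrative, not from the patent.

```python
import numpy as np

def affine_distance(L_d, L_g, w):
    """Weighted affine-transformation distance between degraded-image
    landmarks L_d (P x 2) and guide landmarks L_g (P x 2)."""
    P = L_d.shape[0]
    Lg_aug = np.hstack([L_g, np.ones((P, 1))])     # augmented landmarks (P x 3)
    W = np.diag(w)
    # Closed-form weighted least squares for L_d ~ Lg_aug @ A  (A is 3 x 2)
    A = np.linalg.solve(Lg_aug.T @ W @ Lg_aug, Lg_aug.T @ W @ L_d)
    r = L_d - Lg_aug @ A
    return float(np.sum(w[:, None] * r ** 2))

def select_best_guide(L_d, guide_landmark_sets, w):
    """Return the index k* of the guide map minimizing the distance."""
    return int(np.argmin([affine_distance(L_d, L_g, w)
                          for L_g in guide_landmark_sets]))
```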
Further, the number of the hole convolution residual units in the optimal guide map feature extraction module is 3, and the module executes the following steps:
inputting the selected optimal guide map;
the optimal guide map feature extraction module comprises convolution layers C1, C2, C3 and C4 and hole convolution residual units D1, D2 and D3;
the convolution layer C1 is used for carrying out a first convolution operation and an activation operation on the optimal guide map;
the hole convolution residual unit D1 is used for performing first hole convolution operation, activation operation, residual operation, second hole convolution operation, activation operation and residual operation on the output of the convolution layer C1;
the convolution layer C2 is used for sequentially carrying out second convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D1;
the hole convolution residual unit D2 is used for performing third hole convolution operation, activation operation, residual operation, fourth hole convolution operation, activation operation and residual operation on the output of the convolution layer C2;
the convolution layer C3 is used for sequentially carrying out third convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D2;
the hole convolution residual unit D3 is used for performing fifth hole convolution operation, activation operation, residual operation, sixth hole convolution operation, activation operation and residual operation on the output of the convolution layer C3;
the convolution layer C4 is used for sequentially performing fourth convolution operation and activation operation on the output of the hole convolution residual error unit D3;
the output of the convolutional layer C4 is the optimal guide map characteristic;
wherein, the activation operation adopts LReLU function, the normalization operation adopts BatchNorm,
the first convolution operation is 64 3×3 convolutions with stride 1;
the first hole convolution operation is 64 3×3 hole (dilated) convolutions with stride 1 and hole rate 7;
the second hole convolution operation is 64 3×3 hole convolutions with stride 1 and hole rate 5;
the second convolution operation is 128 3×3 convolutions with stride 2;
the third hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 5;
the fourth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 3;
the third convolution operation is 128 3×3 convolutions with stride 2;
the fifth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 3;
the sixth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 1;
the fourth convolution operation is 128 3×3 convolutions with stride 1.
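For concreteness, a minimal PyTorch sketch of this encoder follows; only the kernel/stride/dilation settings come from the spec above, while the 3-channel input, LeakyReLU slope and padding choices are assumptions.

```python
import torch.nn as nn

class DilatedResUnit(nn.Module):
    """Hole (dilated) convolution residual unit: two dilated 3x3 convs,
    each followed by LReLU activation and a residual addition."""
    def __init__(self, ch, d1, d2):
        super().__init__()
        self.c1 = nn.Conv2d(ch, ch, 3, 1, padding=d1, dilation=d1)
        self.c2 = nn.Conv2d(ch, ch, 3, 1, padding=d2, dilation=d2)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x):
        x = x + self.act(self.c1(x))
        return x + self.act(self.c2(x))

def guide_encoder():
    """C1..C4 with D1..D3, following the kernel/stride/dilation spec above."""
    act = nn.LeakyReLU(0.2)
    return nn.Sequential(
        nn.Conv2d(3, 64, 3, 1, 1), act,                            # C1
        DilatedResUnit(64, 7, 5),                                  # D1
        nn.Conv2d(64, 128, 3, 2, 1), nn.BatchNorm2d(128), act,     # C2
        DilatedResUnit(128, 5, 3),                                 # D2
        nn.Conv2d(128, 128, 3, 2, 1), nn.BatchNorm2d(128), act,    # C3
        DilatedResUnit(128, 3, 1),                                 # D3
        nn.Conv2d(128, 128, 3, 1, 1), act,                         # C4
    )
```

The degradation map feature extraction module below (C5..C8, D4..D6) has the identical layer spec, so the same sketch applies to it.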
Further, the number of the hole convolution residual units in the degradation map feature extraction module is 3, and the module executes the following steps:
inputting the degradation map;
the degradation map feature extraction module comprises convolution layers C5, C6, C7 and C8 and hole convolution residual units D4, D5 and D6;
the convolution layer C5 is used for carrying out a first convolution operation and an activation operation on the degradation map;
the hole convolution residual unit D4 is used for performing first hole convolution operation, activation operation, residual operation, second hole convolution operation, activation operation and residual operation on the output of the convolution layer C5;
the convolution layer C6 is used for sequentially carrying out second convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D4;
the hole convolution residual error unit D5 is used for performing third hole convolution operation, activation operation, residual error operation, fourth hole convolution operation, activation operation and residual error operation on the output of the convolution layer C6;
the convolution layer C7 is used for sequentially carrying out third convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D5;
the hole convolution residual unit D6 is used for performing fifth hole convolution operation, activation operation, residual operation, sixth hole convolution operation, activation operation and residual operation on the output of the convolutional layer C7;
the convolutional layer C8 is used for sequentially performing fourth convolution operation and activation operation on the output of the hole convolution residual error unit D6;
the output of convolutional layer C8 is characteristic of the degradation map;
wherein, the activation operation adopts LReLU function, the normalization operation adopts BatchNorm,
the first convolution operation is 64 3×3 convolutions with stride 1;
the first hole convolution operation is 64 3×3 hole convolutions with stride 1 and hole rate 7;
the second hole convolution operation is 64 3×3 hole convolutions with stride 1 and hole rate 5;
the second convolution operation is 128 3×3 convolutions with stride 2;
the third hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 5;
the fourth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 3;
the third convolution operation is 128 3×3 convolutions with stride 2;
the fifth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 3;
the sixth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 1;
the fourth convolution operation is 128 3×3 convolutions with stride 1.
Furthermore, the number of bias-free convolution layers in the degraded map face key point feature extraction module is 10, and the module executes the following steps:
inputting the degraded-image face key point map;
the degraded graph face key point feature extraction module comprises a convolution layer C9, a convolution layer C10, a convolution layer C11, a convolution layer C12, a convolution layer C13, a convolution layer C14, a convolution layer C15, a convolution layer C16, a convolution layer C17 and a convolution layer C18;
the convolution layer C9 is used for carrying out first convolution operation and activation operation on the degraded image human face key point diagram;
the convolutional layer C10 is used for performing a second convolution operation and an activation operation on the output of the C9;
the convolution layer C11 is used for performing a third convolution operation and an activation operation on the output of the C10;
the convolutional layer C12 is used for performing a fourth convolution operation and an activation operation on the output of the C11;
the convolutional layer C13 is used for performing a fifth convolution operation and an activation operation on the output of the C12;
the convolutional layer C14 is used for performing a sixth convolution operation and an activation operation on the output of the C13;
the convolutional layer C15 is used for performing a seventh convolution operation and an activation operation on the output of the C14;
the convolution layer C16 is used for carrying out the eighth convolution operation and the activation operation on the output of the C15;
the convolutional layer C17 is used for performing a ninth convolution operation and an activation operation on the output of the C16;
convolutional layer C18 is used to perform the tenth convolution operation and activation operation on the output of C17;
the output of the convolutional layer C18 is the key point characteristics of the human face of the degraded image;
the activation operation employs the LReLU function,
the first convolution operation is 64 9×9 convolutions with stride 2 and bias 0;
the second convolution operation is 64 3×3 convolutions with stride 1 and bias 0;
the third convolution operation is 64 7×7 convolutions with stride 1 and bias 0;
the fourth convolution operation is 128 3×3 convolutions with stride 1 and bias 0;
the fifth convolution operation is 128 5×5 convolutions with stride 2 and bias 0;
the sixth to tenth convolution operations are 128 3×3 convolutions with stride 1 and bias 0.
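A compact PyTorch sketch of this bias-free stack follows; the single-channel key point heat map input and the padding/activation choices are assumptions.

```python
import torch.nn as nn

def landmark_encoder():
    """Bias-free convolution stack C9..C18; the (out_channels, kernel,
    stride) triples mirror the operations listed above."""
    spec = [(64, 9, 2), (64, 3, 1), (64, 7, 1), (128, 3, 1), (128, 5, 2)]
    spec += [(128, 3, 1)] * 5
    layers, in_ch = [], 1            # assumes a 1-channel key point heat map
    for out_ch, k, s in spec:
        layers += [nn.Conv2d(in_ch, out_ch, k, s, k // 2, bias=False),
                   nn.LeakyReLU(0.2)]
        in_ch = out_ch
    return nn.Sequential(*layers)
```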
Further, the optimal guide map feature posture correction module executes the following steps:
inputting the characteristics of the optimal guide image, the optimal guide image face key points and the degraded image face key points;
the optimal guide map feature is defined as F_g, the optimal guide map face key points as L_g, and the degraded-image face key points as L_d. Using the moving least squares deformation method, the affine transformation matrix at each point v is:

A_v = \arg\min_{A} \sum_p w_p \| A \tilde{L}_d^p - L_g^p \|^2

wherein the weights are w_p = 1 / | L_d^p - v |^2, p is a given key point position, and \tilde{L}_d is the augmented matrix of L_d;
according to the affine transformation matrix, the deformed optimal guide map feature is obtained by bilinear interpolation:

F_{gw}(v) = F_g( \tilde{v} A_v )

where the feature value at the transformed location is interpolated from its four bilinear neighbors;
and outputting the deformed optimal guide map features.
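The per-point affine fit plus bilinear sampling can be sketched directly (and inefficiently) as below; this is an illustrative reading, assuming coordinates normalized to [-1, 1] as required by grid_sample, with the per-pixel loop written for clarity rather than speed.

```python
import torch
import torch.nn.functional as F

def mls_affine_warp(f_guide, lm_guide, lm_deg, eps=1e-8):
    """Warp guide features toward the degraded pose: fit a per-pixel affine
    transform by moving least squares, then sample bilinearly.
    f_guide: (1, C, H, W); lm_guide, lm_deg: (P, 2) in [-1, 1] coordinates."""
    _, _, H, W = f_guide.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing="ij")
    v = torch.stack([xs, ys], dim=-1).reshape(-1, 2)        # pixel grid (HW, 2)
    # MLS weights: w_p(v) = 1 / |L_d^p - v|^2
    w = 1.0 / ((v[:, None, :] - lm_deg[None]) ** 2).sum(-1).clamp_min(eps)
    aug = torch.cat([lm_deg, torch.ones(len(lm_deg), 1)], dim=1)  # (P, 3)
    grid = []
    for i in range(v.shape[0]):      # per-pixel affine fit (written for clarity)
        Wi = torch.diag(w[i])
        A = torch.linalg.solve(aug.T @ Wi @ aug, aug.T @ Wi @ lm_guide)  # (3, 2)
        grid.append(torch.cat([v[i], torch.ones(1)]) @ A)   # location in guide
    grid = torch.stack(grid).reshape(1, H, W, 2)
    # Bilinear sampling over the 4 neighbors keeps the warp differentiable
    return F.grid_sample(f_guide, grid, align_corners=True)
```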
Further, the illumination distribution correction module executes the following steps:
inputting the deformed optimal guide map features and the degradation map features;
the deformed guide map features are normalized by adaptive instance normalization so that their illumination is consistent with the distribution of the degradation map:

F'_{gw} = \sigma(F_d) \cdot \frac{F_{gw} - \mu(F_{gw})}{\sigma(F_{gw})} + \mu(F_d)

wherein:
F_d is the degradation map feature;
F_{gw} is the deformed optimal guide map feature;
μ(·) denotes the mean over the spatial positions of each feature map channel;
σ(·) denotes the standard deviation over the spatial positions of each feature map channel;
the output is the final guide map features after illumination correction.
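This is the standard adaptive instance normalization (AdaIN) operation; a minimal sketch:

```python
import torch

def adain(f_gw, f_d, eps=1e-5):
    """Adaptive instance normalization: re-normalize the deformed guide
    features f_gw to the per-channel mean/std of the degraded features f_d.
    Both inputs are (N, C, H, W) tensors."""
    mu_g = f_gw.mean(dim=(2, 3), keepdim=True)
    std_g = f_gw.std(dim=(2, 3), keepdim=True) + eps
    mu_d = f_d.mean(dim=(2, 3), keepdim=True)
    std_d = f_d.std(dim=(2, 3), keepdim=True)
    return std_d * (f_gw - mu_g) / std_g + mu_d
```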
Further, the number of the adaptive feature fusion units is 4, including adaptive feature fusion units P1, P2, P3, and P4, which perform the following steps:
inputting final guide image characteristics, degraded image characteristics and degraded image human face key point characteristics,
when the P1 characteristics are fused:
the convolution layer C19 is used for carrying out convolution operation on the characteristics of the degradation graph for the first time and further extracting the characteristics;
the convolution layer C20 is used for carrying out second convolution operation on the characteristics of the final guide image and further extracting the characteristics;
the convolutional layer C21 is used for carrying out a third convolution operation on the face key point characteristics and further extracting the characteristics;
the concatenation layer T1 concatenates the outputs of the convolution layers C19, C20 and C21 along the feature dimension;
the convolution layer C22 performs the convolution operation and the activation operation on the output of the T1 for the fourth time;
the convolution layer C23 performs a fifth convolution operation and an activation operation on the output of the C22;
the convolution layer C24 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolutional layer C25 performs a seventh convolution and activation operation on the output of the C24;
the convolution layer C26 performs eighth convolution and activation operation on the final guide map features;
the convolutional layer C27 performs ninth convolution and activation operation on the output of the C26;
the residual layer J1 performs a subtraction operation on the output of the C25 layer and the output of the C27 layer;
the scale layer multiplies the output of the J1 by the output of the C23 layer;
the recovery layer F1 adds the output of the scale layer to the degradation map features and outputs the P1-level enhanced features;
the first to third convolution operations are 64 1×1 convolutions with stride 1;
the fourth to seventh convolution operations are 128 3×3 convolutions with stride 1;
the eighth and ninth convolution operations are 128 1×1 convolutions with stride 1;
when the P2 characteristics are fused:
the convolution layer C28 is used for carrying out convolution operation on the P1 enhanced degradation image characteristics for the first time and further extracting the characteristics;
the convolution layer C29 is used for carrying out a second convolution operation on the characteristics of the final guide map and further extracting the characteristics;
the convolution layer C30 is used for carrying out convolution operation for the third time on the face key point characteristics and further extracting the characteristics;
the concatenation layer T2 concatenates the outputs of the convolution layers C28, C29 and C30 along the feature dimension;
the convolution layer C31 carries out the convolution operation and activation operation for the fourth time on the output of the T2;
the convolutional layer C32 performs a fifth convolution operation and an activation operation on the output of the C31;
the convolution layer C33 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolutional layer C34 performs a seventh convolution and activation operation on the output of the C33;
the convolution layer C35 performs the eighth convolution and activation operation on the P1-enhanced degradation map features;
the convolutional layer C36 performs ninth convolution and activation operation on the output of the C35;
the residual layer J2 performs a subtraction operation on the output of the C34 layer and the output of the C36 layer;
the scale layer multiplies the output of the J2 layer with the output of the C32 layer;
the restoration layer F2 adds the output of the scale layer to the degradation map features and outputs the P2-level enhanced features;
the first to third convolution operations are 64 1×1 convolutions with stride 1;
the fourth to seventh convolution operations are 128 3×3 convolutions with stride 1;
the eighth and ninth convolution operations are 128 1×1 convolutions with stride 1;
when the P3 characteristics are fused:
the convolution layer C37 is used for carrying out convolution operation on the P2 enhanced degradation image characteristics for the first time and further extracting the characteristics;
the convolution layer C38 is used for carrying out second convolution operation on the final guide map features and further extracting the features;
the convolution layer C39 is used for carrying out convolution operation for the third time on the face key point characteristics and further extracting the characteristics;
the concatenation layer T3 concatenates the outputs of the convolution layers C37, C38 and C39 along the feature dimension;
the convolution layer C40 performs a fourth convolution operation and activation operation on the output of the T3;
the convolution layer C41 performs a fifth convolution operation and an activation operation on the output of the C40;
the convolution layer C42 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolution layer C43 performs a seventh convolution and activation operation on the output of the C42;
the convolution layer C45 performs an eighth convolution and activation operation on the P2-enhanced degradation map features;
the convolution layer C46 performs ninth convolution and activation operation on the output of the C45;
the residual layer J3 performs a subtraction operation on the output of the C43 layer and the output of the C46 layer;
the scale layer multiplies the output of the J3 layer by the output of the C41 layer;
the restoration layer F3 adds the output of the scale layer to the degradation map features and outputs the P3-level enhanced features;
the first to third convolution operations are 64 1×1 convolutions with stride 1;
the fourth to seventh convolution operations are 128 3×3 convolutions with stride 1;
the eighth and ninth convolution operations are 128 1×1 convolutions with stride 1;
when the P4 characteristics are fused:
the convolution layer C47 is used for carrying out convolution operation on the P3 enhanced degradation image characteristics for the first time and further extracting the characteristics;
the convolution layer C48 is used for carrying out the second convolution operation on the characteristics of the final guide map and further extracting the characteristics;
the convolution layer C49 is used for carrying out convolution operation for the third time on the human face key point characteristics and further extracting the characteristics;
the concatenation layer T4 concatenates the outputs of the convolution layers C47, C48 and C49 along the feature dimension;
the convolutional layer C50 performs a fourth convolution operation and activation operation on the output of the T4;
the convolutional layer C51 performs a fifth convolution operation and activation operation on the output of the C50;
the convolution layer C52 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolution layer C53 performs a seventh convolution and activation operation on the output of the C52;
the convolution layer C54 performs an eighth convolution and activation operation on the P3-enhanced degradation map features;
the convolutional layer C55 performs ninth convolution and activation operations on the output of the C54;
the residual layer J4 performs a subtraction operation on the output of the C53 layer and the output of the C55 layer;
the scale layer multiplies the output of the J4 layer by the output of the C51 layer;
the recovery layer F4 adds the output of the scale layer to the degradation map features and outputs the P4-level enhanced features;
the first to third convolution operations are 64 1×1 convolutions with stride 1;
the fourth to seventh convolution operations are 128 3×3 convolutions with stride 1;
the eighth and ninth convolution operations are 128 1×1 convolutions with stride 1;
and finally, outputting the enhanced degradation graph characteristics after the multi-stage self-adaptive characteristic fusion.
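A sketch of a single fusion unit (the P1 wiring above) follows; the channel widths follow the spec, while the input channel counts and LeakyReLU slope are assumptions.

```python
import torch
import torch.nn as nn

class AdaptiveFusionUnit(nn.Module):
    """One adaptive feature fusion unit (the P1 wiring above): a fusion mask
    is predicted from the degraded, guide and key point features, and the
    (guide - degraded) residual is gated by that mask and added back."""
    def __init__(self, ch=128):
        super().__init__()
        act = nn.LeakyReLU(0.2)
        self.pre_d = nn.Conv2d(ch, 64, 1)    # C19: 1x1 on degraded features
        self.pre_g = nn.Conv2d(ch, 64, 1)    # C20: 1x1 on guide features
        self.pre_l = nn.Conv2d(ch, 64, 1)    # C21: 1x1 on key point features
        self.mask = nn.Sequential(           # T1 concat -> C22, C23 (3x3)
            nn.Conv2d(192, 128, 3, 1, 1), act,
            nn.Conv2d(128, 128, 3, 1, 1), act)
        self.proj_d = nn.Sequential(         # C24, C25 (3x3)
            nn.Conv2d(ch, 128, 3, 1, 1), act,
            nn.Conv2d(128, 128, 3, 1, 1), act)
        self.proj_g = nn.Sequential(         # C26, C27 (1x1)
            nn.Conv2d(ch, 128, 1), act,
            nn.Conv2d(128, 128, 1), act)

    def forward(self, f_d, f_g, f_lm):
        m = self.mask(torch.cat(
            [self.pre_d(f_d), self.pre_g(f_g), self.pre_l(f_lm)], dim=1))
        res = self.proj_g(f_g) - self.proj_d(f_d)   # residual layer J1
        return f_d + m * res                        # scale layer + recovery F1
```

Units P2-P4 repeat this pattern, each taking the previous unit's enhanced features in place of the raw degradation map features.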
Furthermore, the number of the hole convolution residual units in the restoration result reconstruction module is 3, and the module executes the following steps:
inputting the progressively enhanced degradation map features;
the restoration result reconstruction module comprises convolution layers C56, C57, C58 and C59, hole convolution residual units D7, D8 and D9, and up-sampling layers S1 and S2;
the convolution layer C56 is used for carrying out first convolution operation and activation operation on the gradually enhanced degradation graph characteristics;
the hole convolution residual unit D7 is used for performing first hole convolution operation, activation operation, residual operation, second hole convolution operation, activation operation and residual operation on the output of the convolutional layer C56;
the up-sampling layer S1 up-samples the output of D7 by a factor of 2 using PixelShuffle;
the convolution layer C57 is used for sequentially performing the second convolution operation, normalization operation and activation operation on the output of the up-sampling layer S1;
the hole convolution residual unit D8 is used for performing the third hole convolution operation, activation operation, residual operation, fourth hole convolution operation, activation operation and residual operation on the output of the convolution layer C57;
the up-sampling layer S2 up-samples the output of D8 by a factor of 2 using PixelShuffle;
the convolution layer C58 is used for sequentially performing the third convolution operation, normalization operation and activation operation on the output of the up-sampling layer S2;
the hole convolution residual unit D9 is used for performing fifth hole convolution operation, activation operation, residual operation, sixth hole convolution operation, activation operation and residual operation on the output of the convolution layer C58;
the convolution layer C59 performs the fourth convolution operation on the output of the hole convolution residual unit D9;
the final activation operation is then applied to the output of C59;
the activated output is the final restoration result;
the activation operation adopts LReLU function, the last activation operation adopts Tanh,
the first convolution operation is 256 3×3 convolutions with stride 1;
the first hole convolution operation is 256 3×3 hole convolutions with stride 1 and hole rate 1;
the second hole convolution operation is 256 3×3 hole convolutions with stride 1 and hole rate 1;
the second convolution operation is 128 3×3 convolutions with stride 1;
the third hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 1;
the fourth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 1;
the third convolution operation is 64 3×3 convolutions with stride 1;
the fifth hole convolution operation is 64 3×3 hole convolutions with stride 1 and hole rate 1;
the sixth hole convolution operation is 64 3×3 hole convolutions with stride 1 and hole rate 1;
the fourth convolution operation is 32 3×3 convolutions with stride 1.
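A sketch of this reconstruction head follows, wiring C56-C59 with the two PixelShuffle 2× up-samplings; DilatedResUnit is the residual-unit sketch given earlier (here with dilation 1), the 128-channel input is an assumption, and the 32-channel output is kept exactly as the spec lists it.

```python
import torch.nn as nn

def reconstructor():
    """Reconstruction head C56..C59 with two PixelShuffle 2x up-samplings,
    following the layer spec above."""
    act = nn.LeakyReLU(0.2)
    return nn.Sequential(
        nn.Conv2d(128, 256, 3, 1, 1), act,                        # C56
        DilatedResUnit(256, 1, 1),                                # D7
        nn.PixelShuffle(2),                       # S1: 256 ch -> 64 ch, 2x size
        nn.Conv2d(64, 128, 3, 1, 1), nn.BatchNorm2d(128), act,    # C57
        DilatedResUnit(128, 1, 1),                                # D8
        nn.PixelShuffle(2),                       # S2: 128 ch -> 32 ch, 2x size
        nn.Conv2d(32, 64, 3, 1, 1), nn.BatchNorm2d(64), act,      # C58
        DilatedResUnit(64, 1, 1),                                 # D9
        nn.Conv2d(64, 32, 3, 1, 1), nn.Tanh(),                    # C59 + Tanh
    )
```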
Further, the system also comprises a training network which constrains the learning of the whole network through a reconstruction loss,
the reconstruction loss comprises a pixel-space loss between the restored face image \hat{I} and its corresponding undegraded high-definition image I, and a loss between their features in a face recognition network;
the Euclidean distance loss in pixel space is defined as:

\ell_r = \frac{1}{C \cdot H \cdot W} \| \hat{I} - I \|^2

wherein C, H and W denote the number of channels, the height and the width of the image;
the Euclidean distance loss in the feature space of the face recognition network is defined as:

\ell_f = \sum_u \frac{1}{C_u \cdot H_u \cdot W_u} \| \Psi_u(\hat{I}) - \Psi_u(I) \|^2

in the formula, \Psi_u is the u-th layer convolution feature obtained from the pre-trained face recognition network, and C_u, H_u and W_u are respectively the channel number, height and width of the u-th layer convolution feature of the input face picture, u ∈ [1, 2, 3, 4].
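These two terms can be sketched as below; recog_feats is a hypothetical callable returning the four recognition-network feature maps, and torch.mean already performs the 1/(C·H·W) normalization.

```python
import torch

def reconstruction_loss(restored, target, recog_feats):
    """Pixel-space MSE plus face-recognition feature-space MSE.
    recog_feats(x) is assumed to return [psi_1(x), ..., psi_4(x)] from a
    pre-trained face recognition network."""
    loss = torch.mean((restored - target) ** 2)          # pixel-space term
    for fr, ft in zip(recog_feats(restored), recog_feats(target)):
        loss = loss + torch.mean((fr - ft) ** 2)         # per-layer feature term
    return loss
```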
Further, the training network also constrains the feature distribution of the restoration result to be consistent with the feature distribution of the undegraded image, using a style loss function, in the following specific manner:
the style loss is obtained by calculating the Gram matrices of the restored image and the undegraded image in the face recognition network feature space, specifically:

\ell_s = \sum_u \frac{1}{C_u \cdot H_u \cdot W_u} \left\| G(\Psi_u(\hat{I})) - G(\Psi_u(I)) \right\|^2, \quad G(F) = F F^{\top}

in the formula, C_u, H_u and W_u are respectively the channel number, height and width of the u-th layer convolution feature of the input face picture in the pre-trained face recognition network, u ∈ [1, 2, 3, 4].
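A minimal sketch of the Gram-matrix style loss, under the same recog_feats assumption as above:

```python
import torch

def gram(f):
    """Gram matrix of a (N, C, H, W) feature map, normalized by C*H*W."""
    n, c, h, w = f.shape
    f = f.reshape(n, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(restored, target, recog_feats):
    """Sum of Gram-matrix distances over recognition-network layers."""
    return sum(torch.mean((gram(fr) - gram(ft)) ** 2)
               for fr, ft in zip(recog_feats(restored), recog_feats(target)))
```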
Further, the training network constrains the facial image restoration network through discriminant loss, and the specific method is as follows:
firstly, the training network obtains the discrimination loss through a self-attention-based discrimination network;
the features output by the discrimination network are then used to compute a hinge loss;
for learning of the discrimination network, the loss is defined as:

\ell_D = \mathbb{E}_I\left[ \max(0, 1 - D(I)) \right] + \mathbb{E}_{\hat{I}}\left[ \max(0, 1 + D(\hat{I})) \right]

for learning of the generation network under the discrimination network constraints, the loss is defined as:

\ell_G = - \mathbb{E}_{\hat{I}}\left[ D(\hat{I}) \right]
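These are standard hinge adversarial losses; a direct sketch:

```python
import torch

def d_hinge_loss(d_real, d_fake):
    """Discriminator hinge loss: push D(real) above 1, D(fake) below -1."""
    return torch.relu(1.0 - d_real).mean() + torch.relu(1.0 + d_fake).mean()

def g_hinge_loss(d_fake):
    """Generator loss under the discriminator constraint."""
    return -d_fake.mean()
```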
further, the training network performs end-to-end training on the image restoration overall network by adopting an Adam optimization algorithm.
Furthermore, the degradation map is obtained by sequentially blurring, down-sampling, adding noise to, and JPEG-compressing the high-definition face image,
wherein the blurring uses Gaussian blur and motion blur, with the standard deviation of the Gaussian blur kernel drawn from a predefined range;
the down-sampling uses a bicubic down-sampling method with sampling scale s ∈ {1.1, …};
the noise is additive white Gaussian noise with noise level n ∈ {0, 1, …};
and the JPEG compression quality parameter q ∈ {0, 10, …}.
The beneficial effects of the invention are:
the invention provides a multi-guide-map face restoration algorithm, and provides posture correction, illumination correction and step-by-step adaptive feature fusion operation, after the optimal guide map is subjected to posture expression and illumination correction, the optimal guide map can have high alignment degree with a degraded face, and missing identity detail information can be provided for the restoration of the degraded face image. Therefore, compared with the existing facial image restoration system based on the convolutional neural network, the facial image restoration system based on the fusion of the multi-guide graph and the self-adaptive feature can effectively enhance the low-quality facial image and effectively restore the real low-quality image.
Drawings
FIG. 1 is a flow chart of a face image restoration system according to the present invention.
Detailed Description
The first embodiment is as follows: with reference to fig. 1, the face image restoration system based on multi-guide map and adaptive feature fusion according to this embodiment comprises an optimal guide map selection module, an optimal guide map feature extraction module, a degradation map feature extraction module, a degraded map face key point feature extraction module, an optimal guide map feature posture correction module, an illumination distribution correction module, a step-by-step self-adaptive feature fusion module and a restoration result reconstruction module,
the optimal guide map selection module selects an optimal guide map with the most similar expression and posture to the degradation map from the plurality of guide maps by calculating the optimal weighted affine transformation distance of the key points of the human face between the degradation map and the guide map;
the optimal guide map feature extraction module is used for extracting features of an optimal guide map;
the degradation map feature extraction module is used for extracting features of a degradation map;
the degraded image face key point feature extraction module is used for obtaining face key point features according to key points of the degraded image, wherein the key points of the degraded image are obtained by performing key point detection on the degraded image through a face key point detection algorithm;
the optimal guide image feature posture correction module obtains deformation vectors by calculating face key point features and moving least squares between the optimal guide image key points, deforms the optimal guide image to the posture and expression of the degraded image, and obtains the deformed optimal guide image, wherein the optimal guide image key points are obtained by detecting key points of the optimal guide image through a face key point detection algorithm;
the illumination distribution correction module is used for carrying out self-adaptive example normalization operation on the characteristics of the deformed optimal guide image and the characteristics of the degraded image to obtain the characteristics of the final guide image;
the step-by-step self-adaptive feature fusion module is used for dynamically and self-adaptively adding the features of the final guide image into the features of the degraded image step by step to obtain the features of the enhanced degraded image;
and the restoration result reconstruction module is used for outputting the characteristics of the enhanced degradation image through a multilayer neural network to restore a human face image.
The optimal guide map feature extraction module comprises M1 hole convolution residual units, the degradation map feature extraction module comprises M2 hole convolution residual units, the degraded map face key point feature extraction module comprises M3 bias-free convolution layers, the step-by-step adaptive feature fusion module comprises M4 adaptive feature fusion units, and the multilayer neural network in the restoration result reconstruction module consists of M5 hole convolution residual units, where M1, M2, M3, M4 and M5 are all greater than or equal to 1.
The degradation map is obtained from the low-quality image to be restored.
The fine texture details of the face are provided by the guide image, and the coarse texture by the degraded image.
The optimal guide map, closest to the degradation map in posture and expression, is selected from the N guide maps.
Preferably, N ≥ 1.
An optimized weighted least-squares module based on face key point similarity is adopted, which executes the following steps: the guide map whose pose and expression are most similar to those of the degradation map is selected from the N guide maps. For the degradation map and all guide maps, a face key point detection algorithm first detects P key points per face image, where L^p = (x^p, y^p) denotes the horizontal and vertical coordinates of the p-th face key point. The key points of the degradation map are denoted L_d and, for all guide maps corresponding to the degradation map, the key points of the m-th guide map are denoted L_g^m, with augmented form \tilde{L}_g^{m,p} = (x^p, y^p, 1)^\top.

The weighted least squares objective is defined as:

k^* = \arg\min_m \min_{A_m} \sum_{p=1}^{P} w^p \| L_d^p - A_m \tilde{L}_g^{m,p} \|^2

wherein k^* is the sequence number of the optimal guide map selected from the N guide maps; A_m is an affine transformation matrix; \tilde{L}_g^m is the augmented matrix of L_g^m; p indexes the p-th key point; w^p is the weight of the p-th key point; and the minimized quantity is the affine transformation distance.

Given a degradation map and a guide map, the above formula has the closed-form solution:

A_m = L_d W \tilde{L}_g^{m\top} ( \tilde{L}_g^m W \tilde{L}_g^{m\top} )^{-1}

where W = Diag(w) is the diagonal matrix of the key point weights w.

For different key point weights, the weight w can be updated by the back-propagation algorithm of the network gradient:

w \leftarrow w - \eta \, \partial \ell / \partial w

The whole weight vector w can be updated by deriving with respect to w, thereby realizing the optimized weighted least squares algorithm and obtaining the optimal guide map.
Preferably, M1 = 3 in the optimal guide map feature extraction module;
inputting the selected optimal guide map;
the optimal guide map feature extraction module comprises convolution layers C1, C2, C3 and C4 and hole convolution residual units D1, D2 and D3; the convolution layer C1 is used for carrying out the first convolution operation and activation operation on the optimal guide map;
the hole convolution residual unit D1 is used for performing first hole convolution operation, activation operation, residual operation, second hole convolution operation, activation operation and residual operation on the output of the convolution layer C1;
the convolution layer C2 is used for sequentially carrying out second convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D1;
the hole convolution residual unit D2 is used for performing third hole convolution operation, activation operation, residual operation, fourth hole convolution operation, activation operation and residual operation on the output of the convolution layer C2;
the convolution layer C3 is used for sequentially performing third convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D2;
the hole convolution residual unit D3 is used for performing fifth hole convolution operation, activation operation, residual operation, sixth hole convolution operation, activation operation and residual operation on the output of the convolution layer C3;
the convolution layer C4 is used for sequentially carrying out fourth convolution operation and activation operation on the output of the hole convolution residual error unit D3;
the output of the convolutional layer C4 is the optimal guide map characteristic;
wherein, the activation operation adopts LReLU function, the normalization operation adopts BatchNorm,
the first convolution operation is 64 3×3 convolutions with stride 1;
the first hole convolution operation is 64 3×3 hole convolutions with stride 1 and hole rate 7;
the second hole convolution operation is 64 3×3 hole convolutions with stride 1 and hole rate 5;
the second convolution operation is 128 3×3 convolutions with stride 2;
the third hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 5;
the fourth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 3;
the third convolution operation is 128 3×3 convolutions with stride 2;
the fifth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 3;
the sixth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 1;
the fourth convolution operation is 128 3×3 convolutions with stride 1.
Preferably, M2 = 3 in the degradation map feature extraction module;
inputting a degradation map;
the degradation map feature extraction module comprises convolution layers C5, C6, C7 and C8 and hole convolution residual units D4, D5 and D6; the convolution layer C5 is used for carrying out the first convolution operation and activation operation on the degradation map;
the hole convolution residual block D4 is used for performing first hole convolution operation, activation operation, residual operation, second hole convolution operation, activation operation and residual operation on the output of the convolution layer C5;
the convolution layer C6 is used for outputting the cavity convolution residual error block D4 to sequentially carry out second convolution operation, normalization operation and activation operation;
the hole convolution residual error block D5 is used for performing third hole convolution operation, activation operation, residual error operation, fourth hole convolution operation, activation operation and residual error operation on the output of the convolution layer C6;
the convolution layer C7 is used for sequentially performing third convolution operation, normalization operation and activation operation on the output of the hole convolution residual error block D5;
the hole convolution residual block D6 is used for performing fifth hole convolution operation, activation operation, residual operation, sixth hole convolution operation, activation operation and residual operation on the output of the convolution layer C7;
the convolution layer C8 is used for sequentially carrying out fourth convolution operation and activation operation on the output of the void convolution residual block D6;
the output of convolutional layer C8 is a degradation map feature;
wherein, the activation operation adopts LReLU function, the normalization operation adopts BatchNorm,
the first convolution operation is 64 3×3 convolutions with stride 1;
the first hole convolution operation is 64 3×3 hole convolutions with stride 1 and hole rate 7;
the second hole convolution operation is 64 3×3 hole convolutions with stride 1 and hole rate 5;
the second convolution operation is 128 3×3 convolutions with stride 2;
the third hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 5;
the fourth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 3;
the third convolution operation is 128 3×3 convolutions with stride 2;
the fifth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 3;
the sixth hole convolution operation is 128 3×3 hole convolutions with stride 1 and hole rate 1;
the fourth convolution operation is 128 3×3 convolutions with stride 1.
Preferably, M3 = 10 in the degraded map face key point feature extraction module;
inputting a degraded image human face key point diagram;
the degraded graph face key point feature extraction module comprises a convolution layer C9, a convolution layer C10, a convolution layer C11, a convolution layer C12, a convolution layer C13, a convolution layer C14, a convolution layer C15, a convolution layer C16, a convolution layer C17 and a convolution layer C18;
the convolution layer C9 is used for performing first convolution operation and activation operation on the degraded image human face key point diagram;
the convolution layer C10 is used for carrying out the second convolution operation and activation operation on the output of C9;
the convolution layer C11 is used for carrying out the third convolution operation and activation operation on the output of C10;
the convolution layer C12 is used for carrying out the fourth convolution operation and activation operation on the output of C11;
the convolution layer C13 is used for carrying out the fifth convolution operation and activation operation on the output of C12;
the convolution layer C14 is used for carrying out the sixth convolution operation and activation operation on the output of C13;
the convolution layer C15 is used for carrying out the seventh convolution operation and activation operation on the output of C14;
the convolution layer C16 is used for carrying out the eighth convolution operation and activation operation on the output of C15;
the convolution layer C17 is used for carrying out the ninth convolution operation and activation operation on the output of C16;
the convolution layer C18 is used for carrying out the tenth convolution operation and activation operation on the output of C17;
the output of the convolutional layer C18 is the key point characteristics of the human face of the degraded image;
the activation operation employs the LReLU function,
the first convolution operation is 64 9×9 convolutions with stride 2 and bias 0;
the second convolution operation is 64 3×3 convolutions with stride 1 and bias 0;
the third convolution operation is 64 7×7 convolutions with stride 1 and bias 0;
the fourth convolution operation is 128 3×3 convolutions with stride 1 and bias 0;
the fifth convolution operation is 128 5×5 convolutions with stride 2 and bias 0;
the sixth to tenth convolution operations are 128 3×3 convolutions with stride 1 and bias 0.
Preferably, the optimal guide map feature posture correction operates as follows.
Inputting the optimal guide map features, the optimal guide map face key points and the degraded-image face key points;
the optimal guide map feature is defined as F_g, the optimal guide map face key points as L_g, and the degraded-image face key points as L_d. By the moving least squares deformation method, the affine transformation matrix at each point v is:

A_v = \arg\min_{A} \sum_p w_p \| A \tilde{L}_d^p - L_g^p \|^2

wherein the weights are w_p = 1 / | L_d^p - v |^2, p is a given key point position, and \tilde{L}_d is the augmented matrix of L_d.

According to the affine transformation matrix, the deformed optimal guide map feature is obtained by bilinear interpolation:

F_{gw}(v) = F_g( \tilde{v} A_v )

where the value at the transformed location is interpolated from its 4 neighbors. The above formula is differentiable, so the gradients can be back-propagated to the feature extraction module during training.
And outputting the characteristics of the deformed optimal guide map.
Preferably, the deformed optimal guide map feature illumination correction operates as follows:
inputting the deformed optimal guide map features and the degradation map features.
The deformed guide map features are normalized by adaptive instance normalization so that their illumination is consistent with the distribution of the degradation map:

F'_{gw} = \sigma(F_d) \cdot \frac{F_{gw} - \mu(F_{gw})}{\sigma(F_{gw})} + \mu(F_d)

wherein:
F_d is the degradation map feature;
F_{gw} is the deformed optimal guide map feature;
μ(·) denotes the mean over the spatial positions of each feature map channel;
σ(·) denotes the standard deviation over the spatial positions of each feature map channel.
The output is the final guide map features after illumination correction, which are now consistent with the degradation map features in posture, expression and illumination distribution.
Preferably, M4 = 4 in the step-by-step adaptive feature fusion module;
the module comprises adaptive feature fusion units P1, P2, P3 and P4;
inputting the final guide map features, the degradation map features and the degraded-image face key point features,
when the P1 characteristics are fused:
the convolution layer C19 is used for carrying out convolution operation on the characteristics of the degradation graph for the first time and further extracting the characteristics;
the convolution layer C20 is used for performing a second convolution operation on the final guide map features and further extracting the features;
the convolution layer C21 is used for performing a third convolution operation on the face key point features and further extracting the features;
the concatenation layer T1 concatenates the outputs of the convolution layers C19, C20 and C21 along the feature dimension;
the convolution layer C22 performs a fourth convolution operation and activation operation on the output of T1;
the convolution layer C23 performs a fifth convolution operation and an activation operation on the output of the C22;
the convolution layer C24 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolution layer C25 performs a seventh convolution and activation operation on the output of the C24;
the convolution layer C26 performs eighth convolution and activation operation on the final guide map features;
the convolutional layer C27 performs ninth convolution and activation operation on the output of the C26;
the residual layer J1 performs a subtraction operation on the output of the C25 layer and the output of the C27 layer;
the scale layer multiplies the output of the J1 by the output of the C23 layer;
the restoration layer F1 adds the output of the scale layer to the degradation map features and outputs the P1-level enhanced features;
the first to third convolution operations are all 64 convolution operations with 1 x 1 and step size of 1;
the fourth to seventh convolution operations are 128 convolution operations of 3 × 3 with a step size of 1;
the eighth and ninth convolution operations are 128 convolution operations of 1 × 1 with a step size of 1;
when the P2 characteristics are fused:
the convolution layer C28 is used for carrying out first convolution operation on the P1 enhanced degradation graph characteristics and further extracting the characteristics;
the convolution layer C29 is used for performing a second convolution operation on the final guide map features and further extracting the features;
the convolution layer C30 is used for performing a third convolution operation on the face key point features and further extracting the features;
the concatenation layer T2 concatenates the outputs of the convolution layers C28, C29 and C30 along the feature dimension;
the convolution layer C31 performs a fourth convolution operation and activation operation on the output of T2;
the convolutional layer C32 performs a fifth convolution operation and an activation operation on the output of the C31;
the convolution layer C33 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolutional layer C34 performs a seventh convolution and activation operation on the output of the C33;
the convolution layer C35 performs an eighth convolution and activation operation on the P1-enhanced degradation map features;
the convolutional layer C36 performs ninth convolution and activation operation on the output of the C35;
the residual layer J2 performs a subtraction operation on the output of the C34 layer and the output of the C36 layer;
the scale layer multiplies the output of the J2 layer by the output of the C32 layer;
the restoration layer F2 adds the output of the scale layer to the degradation map features and outputs the P2-level enhanced features;
the first to third convolution operations are all 64 convolution operations with 1 x 1 and step size of 1;
the fourth to seventh convolution operations are 128 convolution operations of 3 × 3 with a step size of 1;
the eighth and ninth convolution operations are 128 convolution operations of 1 × 1 with a step size of 1;
when the P3 characteristics are fused:
the convolution layer C37 is used for carrying out first convolution operation on the P2 enhanced degradation image characteristics and further extracting the characteristics;
the convolution layer C38 is used for performing a second convolution operation on the final guide map features and further extracting the features;
the convolution layer C39 is used for performing a third convolution operation on the face key point features and further extracting the features;
the concatenation layer T3 concatenates the outputs of the convolution layers C37, C38 and C39 along the feature dimension;
the convolution layer C40 performs a fourth convolution operation and activation operation on the output of the T3;
the convolution layer C41 performs a fifth convolution operation and an activation operation on the output of the C40;
the convolution layer C42 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolution layer C43 performs a seventh convolution and activation operation on the output of C42;
the convolution layer C45 performs an eighth convolution and activation operation on the P2-enhanced degradation map features;
the convolution layer C46 performs ninth convolution and activation operation on the output of the C45;
the residual layer J3 performs a subtraction operation on the output of the C43 layer and the output of the C46 layer;
the scale layer multiplies the output of the J3 layer by the output of the C41 layer;
the restoration layer F3 adds the output of the scale layer to the degradation map features and outputs the P3-level enhanced features;
the first to third convolution operations are all 64 convolution operations with 1 x 1 and step size of 1;
the fourth to seventh convolution operations are 128 convolution operations of 3 × 3 with a step size of 1;
the eighth and ninth convolution operations are 128 convolution operations of 1 × 1 with a step size of 1;
when the P4 characteristics are fused:
the convolution layer C47 is used for carrying out convolution operation on the P3 enhanced degradation image characteristics for the first time and further extracting the characteristics;
the convolution layer C48 is used for performing a second convolution operation on the final guide map features and further extracting the features;
the convolution layer C49 is used for performing a third convolution operation on the face key point features and further extracting the features;
the concatenation layer T4 concatenates the outputs of the convolution layers C47, C48 and C49 along the feature dimension;
the convolutional layer C50 performs a fourth convolution operation and activation operation on the output of the T4;
the convolutional layer C51 performs a fifth convolution operation and activation operation on the output of the C50;
the convolution layer C52 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolution layer C53 performs a seventh convolution and activation operation on the output of C52;
the convolution layer C54 performs an eighth convolution and activation operation on the P3-enhanced degradation map features;
the convolutional layer C55 performs ninth convolution and activation operations on the output of the C54;
the residual layer J4 performs a subtraction operation on the output of the C53 layer and the output of the C55 layer;
the scale layer multiplies the output of the J4 layer by the output of the C51 layer;
the restoration layer F4 adds the output of the scale layer to the degradation map features and outputs the P4-level enhanced features;
the first to third convolution operations are all 64 convolution operations with 1 x 1 and step size of 1;
the fourth to seventh convolution operations are 128 convolution operations of 3 × 3 with a step size of 1;
the eighth and ninth convolution operations are 128 convolution operations of 1 × 1 with a step size of 1;
and finally, outputting the enhanced degradation graph characteristics after the multi-stage self-adaptive characteristic fusion.
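For readers who prefer code, here is a hedged PyTorch sketch of one adaptive feature fusion unit Pk, following the 64/128-channel layer listing above. The subtraction order in the residual and the LReLU (rather than sigmoid) activation on the mask branch mirror the text literally; treat this as an interpretation, not the definitive structure.

```python
import torch
import torch.nn as nn

class AdaptiveFeatureFusion(nn.Module):
    """One fusion unit Pk: a mask computed from all three inputs gates the
    residual between the degraded and guide branches before it is added back."""
    def __init__(self, ch=128):
        super().__init__()
        # 1x1 convs (first to third operations) on each of the three inputs.
        self.md = nn.Conv2d(ch, 64, 1)
        self.mg = nn.Conv2d(ch, 64, 1)
        self.ml = nn.Conv2d(ch, 64, 1)
        # Mask branch (fourth/fifth operations): two 3x3 convs with LReLU.
        self.mask = nn.Sequential(
            nn.Conv2d(192, ch, 3, 1, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, ch, 3, 1, 1), nn.LeakyReLU(0.2, inplace=True))
        # Degraded branch (sixth/seventh): two 3x3 convs.
        self.deg = nn.Sequential(
            nn.Conv2d(ch, ch, 3, 1, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, ch, 3, 1, 1), nn.LeakyReLU(0.2, inplace=True))
        # Guide branch (eighth/ninth): two 1x1 convs.
        self.gui = nn.Sequential(
            nn.Conv2d(ch, ch, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, ch, 1), nn.LeakyReLU(0.2, inplace=True))

    def forward(self, f_deg, f_guide, f_lmk):
        m = self.mask(torch.cat([self.md(f_deg), self.mg(f_guide),
                                 self.ml(f_lmk)], dim=1))
        # Residual layer J: subtraction order follows the C25/C27 wording;
        # the sign convention is an assumption.
        residual = self.deg(f_deg) - self.gui(f_guide)
        return f_deg + m * residual   # restoration layer F
```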
Preferably, the enhanced degradation map feature reconstruction module comprises M5 hole convolution residual blocks, preferably M5 = 3;
inputting the degradation map features after progressive enhancement,
the characteristic reconstruction module comprises a convolution layer C56, a convolution layer C57, a convolution layer C58 and a convolution layer C59;
the characteristic reconstruction module comprises a hole convolution residual block D7, a hole convolution residual block D8 and a hole convolution residual block D9;
the convolution layer C56 is used for performing first convolution operation and activation operation on the gradually enhanced degradation graph characteristics;
the hole convolution residual error block D7 is used for performing first hole convolution operation, activation operation, residual error operation, second hole convolution operation, activation operation and residual error operation on the output of the convolution layer C56;
the upsampling layer S1 is used for performing 2× upsampling of the features on the output of D7 using PixelShuffle;
the convolution layer C57 is used for sequentially performing second convolution operation, normalization operation and activation operation on the output of the up-sampling layer S1;
the hole convolution residual block D8 is used for performing third hole convolution operation, activation operation, residual operation, fourth hole convolution operation, activation operation and residual operation on the output of the convolution layer C57;
the upsampling layer S2 is used for performing 2× upsampling of the features on the output of D8 using PixelShuffle;
the convolution layer C58 is used for sequentially performing a third convolution operation, normalization operation and activation operation on the output of the up-sampling layer S2;
the hole convolution residual block D9 is used for performing fifth hole convolution operation, activation operation, residual operation, sixth hole convolution operation, activation operation and residual operation on the output of the convolution layer C58;
the convolution layer C59 performs a fourth convolution operation on the output of the hole convolution residual block D9;
The output activation operation performs the last activation operation on the output of C59;
outputting the output of the activation operation as a final restoration result;
the activation operation adopts LReLU function, the last activation operation adopts Tanh,
the first convolution operation is 256 convolution operations with 3 x 3 and step size of 1;
the first hole convolution operation is 256 hole convolution operations with 3 × 3, step size of 1 and hole rate of 1;
the second hole convolution operation is 256 hole convolution operations with 3 × 3, step size of 1 and hole rate of 1;
the second convolution operation is 128 convolution operations with 3 x 3, step size of 1;
the third hole convolution operation is 128 hole convolution operations of 3 × 3 with a step size of 1 and a hole rate of 1;
the fourth hole convolution operation is 128 hole convolution operations with 3 × 3, step size of 1 and hole rate of 1;
the third convolution operation is 64 convolution operations of 3 × 3 with a step size of 1;
the fifth hole convolution operation is 64 hole convolution operations with 3 × 3, step size of 1 and hole rate of 1;
the sixth hole convolution operation is 64 hole convolution operations with 3 × 3, the step size of 1 and the hole rate of 1;
the fourth convolution operation is 32 convolution operations with 3 x 3 and step size of 1.
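The following PyTorch sketch assembles the reconstruction head just specified: hole (dilated) convolution residual blocks interleaved with PixelShuffle 2× upsampling, ending in Tanh. BatchNorm after C57/C58 follows the normalization used elsewhere in the document; the 32-channel C59 output is kept as written, and any final 3-channel RGB projection is not specified and is left out.

```python
import torch.nn as nn

class HoleResBlock(nn.Module):
    """Hole convolution residual block: conv + LReLU + residual addition, twice."""
    def __init__(self, ch, dilation=1):
        super().__init__()
        self.act = nn.LeakyReLU(0.2, inplace=True)
        self.c1 = nn.Conv2d(ch, ch, 3, 1, padding=dilation, dilation=dilation)
        self.c2 = nn.Conv2d(ch, ch, 3, 1, padding=dilation, dilation=dilation)

    def forward(self, x):
        x = x + self.act(self.c1(x))
        return x + self.act(self.c2(x))

def reconstruction_head():
    act = nn.LeakyReLU(0.2, inplace=True)
    return nn.Sequential(
        nn.Conv2d(128, 256, 3, 1, 1), act,                      # C56
        HoleResBlock(256),                                      # D7 (hole rate 1)
        nn.PixelShuffle(2),                                     # S1: 256 -> 64 ch
        nn.Conv2d(64, 128, 3, 1, 1), nn.BatchNorm2d(128), act,  # C57
        HoleResBlock(128),                                      # D8
        nn.PixelShuffle(2),                                     # S2: 128 -> 32 ch
        nn.Conv2d(32, 64, 3, 1, 1), nn.BatchNorm2d(64), act,    # C58
        HoleResBlock(64),                                       # D9
        nn.Conv2d(64, 32, 3, 1, 1), nn.Tanh())                  # C59 + final Tanh
```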
Preferably, the whole network learning is constrained by reconstruction loss, in the following specific manner:
the reconstruction loss comprises the loss between the restored face image $\hat{I}$ and the corresponding undegraded high-definition image $I$, both in pixel space and in the feature space of a face recognition network;
the Euclidean distance loss in pixel space is defined as:

$$\ell_{\mathrm{mse}} = \frac{1}{C \times H \times W} \left\| \hat{I} - I \right\|^2$$

in the formula, C, H and W represent the number of channels, the height and the width of the image;
the Euclidean distance loss in the feature space of the face recognition network is defined as:

$$\ell_{\mathrm{feat}} = \sum_{u=1}^{4} \frac{1}{C_u \times H_u \times W_u} \left\| \Psi_u(\hat{I}) - \Psi_u(I) \right\|^2$$

in the formula, $\Psi_u$ denotes the u-th layer convolution features of the pre-trained face recognition network, and $C_u$, $H_u$ and $W_u$ are respectively the number of channels, the height and the width of the u-th layer convolution features of the input face picture in the pre-trained face recognition network, $u \in \{1, 2, 3, 4\}$.
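A sketch of the two reconstruction terms follows; the `feature_layers` callable returning the list of per-layer activations Ψ_u is a hypothetical interface, not an API from the patent.

```python
import torch
import torch.nn.functional as F

def reconstruction_loss(restored, target, feature_layers):
    """Pixel-space MSE plus feature-space MSE over layers u = 1..4 of a
    pre-trained face recognition network; `feature_layers(img)` is assumed
    to return the list of feature tensors [Psi_1(img), ..., Psi_4(img)]."""
    loss = F.mse_loss(restored, target)                    # pixel-space term
    for psi_r, psi_t in zip(feature_layers(restored), feature_layers(target)):
        loss = loss + F.mse_loss(psi_r, psi_t)             # feature-space term
    return loss
```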
Preferably, the distribution of the restored image and the undegraded image is constrained in a specific manner as follows:
the style loss is obtained by calculating a Gram matrix of the restored image and the undegraded image in a face recognition network feature space, and the method specifically comprises the following steps:
$$\ell_{\mathrm{style}} = \sum_{u=1}^{4} \frac{1}{C_u \times H_u \times W_u} \left\| \mathrm{Gram}\left(\Psi_u(\hat{I})\right) - \mathrm{Gram}\left(\Psi_u(I)\right) \right\|^2$$

in the formula, $C_u$, $H_u$ and $W_u$ are respectively the number of channels, the height and the width of the u-th layer convolution features of the input face picture in the pre-trained face recognition network, $u \in \{1, 2, 3, 4\}$.
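A matching sketch of the Gram-matrix style term, reusing the same hypothetical `feature_layers` interface; the normalization constant is my reconstruction and should be treated as an assumption.

```python
import torch

def gram(feat):
    """Gram matrix of (N, C, H, W) PyTorch features, normalized by C*H*W."""
    n, c, h, w = feat.shape
    f = feat.reshape(n, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(restored, target, feature_layers):
    """Sum of squared Gram-matrix differences over the selected layers."""
    return sum(((gram(pr) - gram(pt)) ** 2).sum()
               for pr, pt in zip(feature_layers(restored), feature_layers(target)))
```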
Preferably, the face image restoration network is constrained by discrimination loss, and the specific method is as follows:
the training network obtains discrimination loss through a self-attention discrimination network;
and the hinge loss is calculated on the features output by the discrimination network.
For updating the discrimination network, the loss is defined as:

$$\ell_D = \mathbb{E}_{I}\left[\max\left(0,\ 1 - D(I)\right)\right] + \mathbb{E}_{\hat{I}}\left[\max\left(0,\ 1 + D(\hat{I})\right)\right]$$

for updating the generation network, the loss is defined as:

$$\ell_G = -\,\mathbb{E}_{\hat{I}}\left[D(\hat{I})\right]$$
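Assuming the hinge formulation above, the two adversarial terms reduce to a few lines; `d_real` and `d_fake` are discriminator scores on undegraded and restored images respectively.

```python
import torch

def d_hinge_loss(d_real, d_fake):
    """Discriminator hinge loss on real (undegraded) and restored images."""
    return torch.relu(1.0 - d_real).mean() + torch.relu(1.0 + d_fake).mean()

def g_hinge_loss(d_fake):
    """Generator term: raise the discriminator's score on restored images."""
    return -d_fake.mean()
```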
preferably, the training network trains the guide graph deformation network and the facial image reconstruction network end to end by using an Adam optimization algorithm.
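Putting the pieces together, a hedged sketch of one end-to-end Adam training step; the loss helpers are the sketches above, and the models, optimizers and data loader are assumed to be supplied by the caller (the learning rate and betas shown are illustrative, not from the patent).

```python
import torch

def train_step(generator, discriminator, g_opt, d_opt, batch, feature_layers):
    degraded, guides, target = batch
    restored = generator(degraded, guides)
    # Discriminator update with the hinge loss sketched above.
    d_opt.zero_grad()
    d_loss = d_hinge_loss(discriminator(target), discriminator(restored.detach()))
    d_loss.backward()
    d_opt.step()
    # Generator update: reconstruction + style + adversarial terms.
    g_opt.zero_grad()
    g_loss = (reconstruction_loss(restored, target, feature_layers)
              + style_loss(restored, target, feature_layers)
              + g_hinge_loss(discriminator(restored)))
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()

# e.g. g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.999))
```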
The face image restoration system based on multi-guide map and adaptive feature fusion introduces, on the basis of existing convolutional-neural-network-based face image restoration systems, a multi-guide map, a multi-guide map optimal selection operation, a moving least squares posture correction operation, an adaptive instance normalization illumination correction operation and a step-by-step adaptive feature fusion operation. For a degradation map, there are N high-definition guide maps of the same identity, and the guide maps can have different postures, illuminations, expressions and backgrounds; the optimal guide map selection chooses the guide map whose expression and posture are most similar to those of the degradation map by calculating the optimal weighted affine transformation distance between the face key points of the degradation map and those of the guide maps; the optimal guide map feature extraction module comprises M1 hole convolution residual modules and obtains the guide map features; the degradation map feature extraction module comprises M2 hole convolution residual modules and obtains the degradation map features; the degraded image face key point feature extraction module comprises M3 unbiased convolution operations and obtains the face key point features; the posture correction module of the optimal guide map features obtains deformation vectors by computing moving least squares between the face key points of the degradation map and the key points of the optimal guide map, deforms the optimal guide map to the posture and expression of the degradation map, and obtains the deformed optimal guide map; the illumination distribution correction module of the deformed optimal guide map features corrects, at the feature level, the illumination distribution of the deformed guide map to that of the degradation map through an adaptive instance normalization operation (adaptive instance norm), obtaining the final guide map features; the step-by-step adaptive feature fusion module comprises M4 adaptive feature fusion operations and adds the guide map features into the degradation map dynamically, adaptively and step by step, obtaining the enhanced degradation map features; and the restoration result reconstruction module passes the enhanced degradation map features through M5 hole convolution residual blocks and outputs the restored face image.
The invention can be applied to any face repairing scene with a plurality of guide pictures, and the invention comprises but is not limited to the following two scenes:
1. For photos grouped by identity in a mobile phone album, known high-quality pictures can be used to assist the shooting process or to enhance low-quality images.
2. In film restoration, high-definition images can be found according to the known lead actor information and used to effectively restore and enhance the lead actors' faces in the film.
The present invention is implemented by convolution operations and multilayer convolution operations in a neural network structure, but is not limited to neural networks.
Here, moving least squares refers to the moving least squares algorithm, the adaptive instance normalization operation refers to adaptive instance norm, and the weighted least squares module refers to weighted least squares.
The final guide map refers to the characteristics of the optimal guide map after characteristic extraction, deformation and illumination correction.
It should be noted that the detailed description is only for illustrating and explaining the technical solution of the present invention, and the scope of protection of the claims is not limited thereby. All modifications and variations are intended to be included within the scope of the invention as defined by the following claims and the description.

Claims (13)

1. The human face image restoration system based on multi-guide map and self-adaptive feature fusion is characterized by comprising an optimal guide map selection module, an optimal guide map feature extraction module, a degraded map human face key point feature extraction module, an optimal guide map feature posture correction module, an illumination distribution correction module, a step-by-step self-adaptive feature fusion module and a restoration result reconstruction module,
the optimal guide map selection module selects an optimal guide map with the most similar expression and posture to the degradation map from the plurality of guide maps by calculating the optimal weighted affine transformation distance of the key points of the human face between the degradation map and the guide map;
the optimal guide map feature extraction module is used for extracting features of an optimal guide map;
the degradation map feature extraction module is used for extracting features of a degradation map;
the degraded image face key point feature extraction module is used for obtaining face key point features according to key points of the degraded image, wherein the key points of the degraded image are obtained by performing key point detection on the degraded image through a face key point detection algorithm;
the optimal guide image feature posture correction module obtains deformation vectors by calculating face key point features and moving least squares between the optimal guide image key points, deforms the optimal guide image to the posture and expression of the degraded image, and obtains the deformed optimal guide image, wherein the optimal guide image key points are obtained by detecting key points of the optimal guide image through a face key point detection algorithm;
the illumination distribution correction module is used for carrying out self-adaptive example normalization operation on the characteristics of the deformed optimal guide image and the characteristics of the degraded image to obtain the characteristics of the final guide image;
the step-by-step self-adaptive feature fusion module is used for dynamically and self-adaptively adding the features of the final guide image into the features of the degraded image step by step to obtain the features of the enhanced degraded image;
the restoration result reconstruction module is used for outputting the characteristics of the enhanced degradation image to restore a human face image through a multilayer neural network;
the optimal guide map feature extraction module comprises M1 hole convolution residual error units, the degraded map feature extraction module comprises M2 hole convolution residual error units, the degraded map face key point feature extraction module comprises M3 unbiased convolution layers, the step-by-step adaptive feature fusion module comprises M4 adaptive feature fusion units, the multilayer neural network in the restoration result reconstruction module is M5 hole convolution residual error units, and M1, M2, M3, M4 and M5 are all larger than or equal to 1;
the number of the adaptive feature fusion units is 4, the adaptive feature fusion units comprise adaptive feature fusion units P1, P2, P3 and P4, and the following steps are executed:
inputting final guide image characteristics, degraded image characteristics and degraded image human face key point characteristics,
when the P1 characteristics are fused:
the convolution layer C19 is used for carrying out convolution operation on the characteristics of the degradation graph for the first time and further extracting the characteristics;
the convolution layer C20 is used for carrying out second convolution operation on the characteristics of the final guide image and further extracting the characteristics;
the convolutional layer C21 is used for carrying out a third convolution operation on the face key point characteristics and further extracting the characteristics;
the concatenation layer T1 concatenates the outputs of the convolution layers C19, C20 and C21 along the feature dimension;
the convolution layer C22 performs a fourth convolution operation and activation operation on the output of the T1;
the convolution layer C23 performs a fifth convolution operation and activation operation on the output of the C22;
the convolution layer C24 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolution layer C25 performs a seventh convolution and activation operation on the output of the C24;
the convolution layer C26 performs eighth convolution and activation operation on the final guide map features;
the convolution layer C27 performs ninth convolution and activation operation on the output of the C26;
the residual layer J1 performs a subtraction operation on the output of the C25 layer and the output of the C27 layer;
the scale layer multiplies the output of the J1 by the output of the C23 layer;
the restoration layer F1 adds the output of the scale layer to the degradation map features after two layers of 128-channel 3 × 3 convolution operations with a step size of 1, and outputs the P1-level enhanced features;
the first to third convolution operations are all 64 convolution operations with 1 x 1 and step size of 1;
the fourth to seventh convolution operations are 128 convolution operations of 3 × 3 with a step size of 1;
the eighth and ninth convolution operations are 128 convolution operations of 1 × 1 with a step size of 1;
when P2 features are fused:
the convolution layer C28 is used for carrying out first convolution operation on the P1 enhanced degradation graph characteristics and further extracting the characteristics;
the convolution layer C29 is used for carrying out a second convolution operation on the characteristics of the final guide map and further extracting the characteristics;
the convolution layer C30 is used for carrying out convolution operation for the third time on the key point features of the human face and further extracting the features;
the concatenation layer T2 concatenates the outputs of the convolution layers C28, C29 and C30 along the feature dimension;
the convolution layer C31 performs a fourth convolution operation and activation operation on the output of T2;
the convolution layer C32 performs a fifth convolution operation and an activation operation on the output of the C31;
the convolution layer C33 conducts sixth convolution and activation operation on the characteristics of the degradation graph;
the convolutional layer C34 performs a seventh convolution and activation operation on the output of the C33;
the convolution layer C35 performs an eighth convolution and activation operation on the P1-enhanced degradation map features;
the convolutional layer C36 performs ninth convolution and activation operation on the output of the C35;
the residual layer J2 performs a subtraction operation on the output of the C34 layer and the output of the C36 layer;
the scale layer multiplies the output of the J2 layer by the output of the C32 layer;
the restoration layer F2 adds the output of the scale layer to the degradation map features after two layers of 128-channel 3 × 3 convolution operations with a step size of 1, and outputs the P2-level enhanced features;
the first to third convolution operations are all 64 convolution operations with 1 x 1 and step size of 1;
the fourth to seventh convolution operations are 128 convolution operations of 3 × 3 with a step size of 1;
the eighth and ninth convolution operations are 128 convolution operations of 1 × 1 with a step size of 1;
when P3 characteristics are fused:
the convolution layer C37 is used for carrying out first convolution operation on the P2 enhanced degradation image characteristics and further extracting the characteristics;
the convolutional layer C38 is used for performing a second convolution operation on the final guide map features and further extracting the features;
the convolution layer C39 is used for carrying out convolution operation for the third time on the face key point characteristics and further extracting the characteristics;
the concatenation layer T3 concatenates the outputs of the convolution layers C37, C38 and C39 along the feature dimension;
the convolution layer C40 performs a fourth convolution operation and activation operation on the output of the T3;
the convolution layer C41 performs a fifth convolution operation and an activation operation on the output of the C40;
the convolution layer C42 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolution layer C43 performs a seventh convolution and activation operation on the output of C42;
the convolution layer C45 performs an eighth convolution and activation operation on the P2-enhanced degradation map features;
the convolution layer C46 performs ninth convolution and activation operation on the output of the C45;
the residual layer J3 performs a subtraction operation on the output of the C43 layer and the output of the C46 layer;
the scale layer multiplies the output of the J3 with the output of the C41 layer;
the restoration layer F3 adds the output of the scale layer to the degradation map features after two layers of 128-channel 3 × 3 convolution operations with a step size of 1, and outputs the P3-level enhanced features;
the first to third convolution operations are all 64 convolution operations with 1 x 1 and step size of 1;
the fourth to seventh convolution operations are 128 convolution operations of 3 × 3 with a step size of 1;
the eighth and ninth convolution operations are 128 convolution operations of 1 × 1 with a step size of 1;
when the P4 characteristics are fused:
the convolution layer C47 is used for carrying out first convolution operation on the P3 enhanced degradation graph characteristics and further extracting the characteristics;
the convolution layer C48 is used for carrying out the second convolution operation on the characteristics of the final guide map and further extracting the characteristics;
the convolution layer C49 is used for carrying out convolution operation for the third time on the human face key point characteristics and further extracting the characteristics;
the concatenation layer T4 concatenates the outputs of the convolution layers C47, C48 and C49 along the feature dimension;
the convolutional layer C50 performs a fourth convolution operation and activation operation on the output of the T4;
the convolutional layer C51 performs a fifth convolution operation and activation operation on the output of the C50;
the convolution layer C52 performs a sixth convolution and activation operation on the characteristics of the degradation graph;
the convolution layer C53 performs a seventh convolution and activation operation on the output of C52;
the convolution layer C54 performs an eighth convolution and activation operation on the P3-enhanced degradation map features;
the convolutional layer C55 performs ninth convolution and activation operations on the output of the C54;
the residual layer J4 performs a subtraction operation on the output of the C53 layer and the output of the C55 layer;
the scale layer multiplies the output of the J4 layer by the output of the C51 layer;
the restoration layer F4 adds the output of the scale layer to the degradation map features after two layers of 128-channel 3 × 3 convolution operations with a step size of 1, and outputs the P4-level enhanced features;
the first to third convolution operations are all 64 convolution operations with 1 x 1 and step size of 1;
the fourth to seventh convolution operations are 128 convolution operations of 3 × 3 with a step size of 1;
the eighth and ninth convolution operations are 128 convolution operations of 1 × 1 with a step size of 1;
finally, outputting the enhanced degradation graph characteristics after the multi-stage self-adaptive characteristic fusion;
the optimal guide map feature posture correction module executes the following steps:
inputting the characteristics of the optimal guide image, the human face key points of the optimal guide image and the human face key points of the degraded image;
defining the optimal guide map feature as $F_g$, the face key points of the optimal guide map as $L_g$, and the face key points of the degraded image as $L_d$; by the moving least squares deformation method, the affine transformation matrix of each point is:

$$A_p = \left( \hat{L}_d^{\top} W_p \hat{L}_d \right)^{-1} \hat{L}_d^{\top} W_p L_g$$

wherein:

$$W_p = \mathrm{Diag}(w), \qquad w_k = \frac{1}{\left\| L_d^k - p \right\|^{2\alpha}}$$

p is the given key point position, and $\hat{L}_d$ is the augmented matrix of $L_d$;
according to the affine transformation matrix, the deformed optimal guide map feature is obtained by bilinear interpolation:

$$F_{g,w}(p) = \sum_{q \in \mathcal{N}(\hat{p} A_p)} \omega_q\, F_g(q)$$

where $\mathcal{N}(\cdot)$ denotes the 4 nearest neighbors and $\omega_q$ are the bilinear interpolation weights;
and outputting the deformed optimal guide map features.
2. The facial image restoration system based on multi-guide map and adaptive feature fusion as claimed in claim 1, wherein the guide map comprises N high definition facial images with different poses, expressions and illuminations and the same identity, and N is greater than 1.
3. The system for restoring a human face image based on multi-guide map and adaptive feature fusion according to claim 2, wherein the optimal guide map selection module performs the following steps:
firstly, an optimized weighted least squares module based on face key point similarity is adopted, and the weighted least squares module executes the following steps: the guide map whose pose and expression are most similar to those of the degraded image is to be selected from the N guide maps; a face key point detection algorithm is adopted to detect P key points of each guide map, with $L_p = (x_p, y_p)$ representing the horizontal and vertical coordinates of the p-th face key point; the key points of the degradation map are defined as $L_d$, and for all the guide maps corresponding to the degradation map, the key points of the m-th guide map are defined as $\tilde{L}_m$;
the weighted least squares objective is defined as:

$$k^{*} = \arg\min_{m} \min_{A_m} \left\| W \left( L_d - \hat{\tilde{L}}_m A_m \right) \right\|^2$$

wherein $k^{*}$ is the sequence number of the optimal guide map selected from the N guide maps; $A$ represents the affine transformation matrix; $\hat{\tilde{L}}_m$ represents the augmented matrix of $\tilde{L}_m$; $k$ represents the k-th key point; $w_k$ represents the weight of the k-th key point; and the objective measures the weighted affine transformation distance;
given a degradation map and a guide map, the closed-form solution is:

$$A_m^{*} = \left( \hat{\tilde{L}}_m^{\top} W \hat{\tilde{L}}_m \right)^{-1} \hat{\tilde{L}}_m^{\top} W L_d$$

where $W = \mathrm{Diag}(w)$ is the diagonal matrix of the key point weights $w$;
the weight $w$ is updated by the network gradient back-propagation algorithm, with the selection loss defined as:

$$\ell_w = \min_{m} \left\| W \left( L_d - \hat{\tilde{L}}_m A_m^{*} \right) \right\|^2$$

and the whole weight is updated by taking the derivative of $\ell_w$, obtaining the optimal guide map.
4. The system for restoring a human face image based on multi-guide map and adaptive feature fusion according to claim 3, wherein the number of the hole convolution residual error units in the optimal guide map feature extraction module is 3, and the optimal guide map feature extraction module performs the following steps;
inputting the selected optimal guide map;
the optimal guide map feature extraction module comprises a convolution layer C1, a convolution layer C2, a convolution layer C3, a convolution layer C4, a hole convolution residual unit D1, a hole convolution residual unit D2 and a hole convolution residual unit D3;
the convolution layer C1 is used for carrying out a first convolution operation and an activation operation on the optimal guide map;
the hole convolution residual unit D1 is used for performing first hole convolution operation, activation operation, residual operation, second hole convolution operation, activation operation and residual operation on the output of the convolution layer C1;
the convolution layer C2 is used for sequentially performing second convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D1;
the hole convolution residual unit D2 is used for performing third hole convolution operation, activation operation, residual operation, fourth hole convolution operation, activation operation and residual operation on the output of the convolution layer C2;
the convolution layer C3 is used for sequentially performing third convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D2;
the hole convolution residual error unit D3 is used for performing fifth hole convolution operation, activation operation, residual error operation, sixth hole convolution operation, activation operation and residual error operation on the output of the convolution layer C3;
the convolution layer C4 is used for sequentially carrying out fourth convolution operation and activation operation on the output of the hole convolution residual error unit D3;
the output of the convolutional layer C4 is the optimal guide map characteristic;
wherein, the activation operation adopts LReLU function, the normalization operation adopts BatchNorm,
the first convolution operation is 64 convolution operations with 3 x 3 and step size of 1;
the first hole convolution operation is 64 hole convolution operations with 3 × 3, step size of 1 and hole rate of 7;
the second hole convolution operation is 64 hole convolution operations with 3 × 3, step size of 1 and hole rate of 5;
the second convolution operation is 128 convolution operations of 3 × 3 with a step size of 2;
the third hole convolution operation is a hole convolution operation with 128 holes 3 x 3, the step size is 1, and the hole rate is 5;
the fourth hole convolution operation is 128 hole convolution operations with 3 × 3, step size 1 and hole rate 3;
the third convolution operation is 128 convolution operations of 3 × 3 with a step size of 2;
the fifth hole convolution operation is 128 hole convolution operations with 3 × 3, step size 1 and hole rate 3;
the sixth hole convolution operation is 128 hole convolution operations with 3 × 3, step size of 1 and hole rate of 1;
the fourth convolution operation is 128 convolution operations of 3 × 3 with a step size of 1.
5. The system for restoring a human face image based on multi-guide map and adaptive feature fusion according to claim 4, wherein the number of the hole convolution residual error units in the degradation map feature extraction module is 3, and the following steps are performed:
inputting a degradation map;
the degenerate graph feature extraction module comprises a convolutional layer C5, a convolutional layer C6, a convolutional layer C7, a convolutional layer C8, a hole convolution residual unit D4, a hole convolution residual unit D5 and a hole convolution residual unit D6;
the convolution layer C5 is used for carrying out a first convolution operation and an activation operation on the degradation map;
the hole convolution residual unit D4 is used for performing first hole convolution operation, activation operation, residual operation, second hole convolution operation, activation operation and residual operation on the output of the convolution layer C5;
the convolution layer C6 is used for sequentially carrying out second convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D4;
the hole convolution residual unit D5 is used for performing third hole convolution operation, activation operation, residual operation, fourth hole convolution operation, activation operation and residual operation on the output of the convolution layer C6;
the convolution layer C7 is used for sequentially carrying out third convolution operation, normalization operation and activation operation on the output of the hole convolution residual error unit D5;
the hole convolution residual unit D6 is used for performing fifth hole convolution operation, activation operation, residual operation, sixth hole convolution operation, activation operation and residual operation on the output of the convolutional layer C7;
the convolutional layer C8 is used for sequentially performing fourth convolution operation and activation operation on the output of the hole convolution residual error unit D6;
the output of convolutional layer C8 is characteristic of the degradation map;
wherein, the activation operation adopts LReLU function, the normalization operation adopts BatchNorm,
the first convolution operation is 64 convolution operations with 3 x 3 and step size of 1;
the first hole convolution operation is 64 hole convolution operations with 3 × 3, step size of 1 and hole rate of 7;
the second hole convolution operation is 64 hole convolution operations with 3 × 3, the step size of 1 and the hole rate of 5;
the second convolution operation is 128 convolution operations of 3 × 3 with a step size of 2;
the third hole convolution operation is a hole convolution operation with 128 holes 3 x 3, the step size is 1, and the hole rate is 5;
the fourth hole convolution operation is 128 hole convolution operations with 3 × 3, step size 1 and hole rate 3;
the third convolution operation is 128 convolution operations with 3 x 3 and step size of 2;
the fifth hole convolution operation is 128 hole convolution operations with 3 × 3, step size 1 and hole rate 3;
the sixth hole convolution operation is a hole convolution operation with 128 holes 3 × 3, the step size is 1, and the hole rate is 1;
the fourth convolution operation is 128 convolution operations of 3 × 3 with a step size of 1.
6. The system for restoring a human face image based on multi-guide map and adaptive feature fusion according to claim 5, wherein the number of the unbiased convolution layers in the degraded map human face key point feature extraction module is 10, and the following steps are executed:
inputting a degradation image face key point;
the degraded graph face key point feature extraction module comprises a convolution layer C9, a convolution layer C10, a convolution layer C11, a convolution layer C12, a convolution layer C13, a convolution layer C14, a convolution layer C15, a convolution layer C16, a convolution layer C17 and a convolution layer C18;
the convolution layer C9 is used for performing first convolution operation and activation operation on the degraded image human face key point diagram;
the convolutional layer C10 is used for performing a second convolution operation and an activation operation on the output of the C9;
the convolution layer C11 is used for performing a third convolution operation and an activation operation on the output of the C10;
the convolutional layer C12 is used for performing a fourth convolution operation and an activation operation on the output of the C11;
the convolutional layer C13 is used for performing a fifth convolution operation and an activation operation on the output of the C12;
the convolutional layer C14 is used for performing a sixth convolution operation and an activation operation on the output of the C13;
the convolutional layer C15 is used for performing a seventh convolution operation and an activation operation on the output of the C14;
the convolution layer C16 is used for carrying out eighth convolution operation and activation operation on the output of the C15;
the convolutional layer C17 is used for performing a ninth convolution operation and an activation operation on the output of the C16;
convolutional layer C18 is used to perform the tenth convolution operation and activation operation on the output of C17;
the output of the convolutional layer C18 is the key point characteristics of the human face of the degraded image;
the activation operation employs the lretlu function,
the first convolution operation is 64 convolution operations of 9 × 9 with a step size of 2 and a bias of 0;
the second convolution operation is 64 convolution operations of 3 × 3 with a step size of 1 and a bias of 0;
the third convolution operation is 64 convolution operations of 7 × 7 with a step size of 1 and a bias of 0;
the fourth convolution operation is 128 convolution operations of 3 × 3 with a step size of 1 and a bias of 0;
the fifth convolution operation is 128 convolution operations of 5 × 5 with a step size of 2 and a bias of 0;
the sixth to tenth convolution operations are 128 convolution operations of 3 × 3 with a step size of 1 and a bias of 0.
7. The system for facial image restoration based on multi-guide map and adaptive feature fusion according to claim 6, wherein the illumination distribution correction module performs the following steps:
inputting the deformed optimal guide map features and the degradation map features,
the deformed guide map features are normalized through adaptive instance normalization, so that the illumination of the deformed guide map is consistent with the distribution of the degradation map, with the formula:

$$\hat{F}_{g,w} = \sigma(F_d)\,\frac{F_{g,w} - \mu(F_{g,w})}{\sigma(F_{g,w})} + \mu(F_d)$$

wherein:
$F_d$ is the degradation map feature;
$F_{g,w}$ is the deformed optimal guide map feature;
$\mu(\cdot)$ denotes the channel-wise mean over the spatial positions of the feature map;
$\sigma(\cdot)$ denotes the channel-wise standard deviation over the spatial positions of the feature map,
the output is the characteristics of the final guide map after the illumination correction.
8. The system for restoring a human face image based on multi-guide map and adaptive feature fusion according to claim 1, wherein the restoration result reconstruction module comprises 3 hole convolution residual units and performs the following steps:
Inputting the characteristics of the degradation map after progressive enhancement,
the feature reconstruction module comprises a convolution layer C56, a convolution layer C57, a convolution layer C58, a convolution layer C59, a hole convolution residual unit D7, a hole convolution residual unit D8 and a hole convolution residual unit D9;
the convolution layer C56 is used for carrying out first convolution operation and activation operation on the gradually enhanced degradation graph characteristics;
the hole convolution residual unit D7 is used for performing first hole convolution operation, activation operation, residual operation, second hole convolution operation, activation operation and residual operation on the output of the convolutional layer C56;
the upsampling layer S1 is used for performing 2× upsampling of the features on the output of D7 using PixelShuffle;
the convolution layer C57 is used for sequentially performing second convolution operation, normalization operation and activation operation on the output of the up-sampling layer S1;
the hole convolution residual unit D8 is used for performing third hole convolution operation, activation operation, residual operation, fourth hole convolution operation, activation operation and residual operation on the output of the convolutional layer C57;
the upsampling layer S2 is used for performing 2× upsampling of the features on the output of D8 using PixelShuffle;
the convolution layer C58 is used for sequentially performing a third convolution operation, normalization operation and activation operation on the output of the up-sampling layer S2;
the hole convolution residual unit D9 is used for performing fifth hole convolution operation, activation operation, residual operation, sixth hole convolution operation, activation operation and residual operation on the output of the convolution layer C58;
the convolution layer C59 is used for carrying out fourth convolution operation on the output of the hole convolution residual error unit D9;
the output activation operation performs the last activation operation on the output of C59;
outputting the output of the activation operation as a final restoration result;
the activation operation adopts LReLU function, the last activation operation adopts Tanh,
the first convolution operation is 256 convolution operations with 3 × 3 and step size of 1;
the first hole convolution operation is 256-by-3 hole convolution operations with the step size of 1 and the hole rate of 1;
the second hole convolution operation is 256 hole convolution operations with 3 × 3, step size of 1 and hole rate of 1;
the second convolution operation is 128 convolution operations of 3 × 3 with a step size of 1;
the third hole convolution operation is 128 hole convolution operations of 3 × 3 with a step size of 1 and a hole rate of 1;
the fourth hole convolution operation is 128 hole convolution operations of 3 × 3 with a step size of 1 and a hole rate of 1;
the third convolution operation is 64 convolution operations of 3 × 3 with a step size of 1;
the fifth hole convolution operation is 64 hole convolution operations with 3 × 3, the step size of 1 and the hole rate of 1;
the sixth hole convolution operation is 64 hole convolution operations with 3 × 3, step size of 1 and hole rate of 1;
the fourth convolution operation is 32 convolution operations with 3 x 3 and step size 1.
9. The system for facial image restoration based on multi-guide map and adaptive feature fusion according to claim 8, further comprising a training network, wherein the training network constrains the whole network learning by reconstruction loss,
the reconstruction loss comprises the loss between the restored face image $\hat{I}$ and the corresponding undegraded high-definition image $I$, both in pixel space and in the feature space of the face recognition network;
the Euclidean distance loss in pixel space is defined as:

$$\ell_{\mathrm{mse}} = \frac{1}{C \times H \times W} \left\| \hat{I} - I \right\|^2$$
wherein, C, H and W represent the channel number, height and width of the image;
the Euclidean distance loss in the feature space of the face recognition network is defined as:

$$\ell_{\mathrm{feat}} = \sum_{u=1}^{4} \frac{1}{C_u \times H_u \times W_u} \left\| \Psi_u(\hat{I}) - \Psi_u(I) \right\|^2$$

in the formula, $\Psi_u$ denotes the u-th layer convolution features obtained from the pre-trained face recognition network, and $C_u$, $H_u$ and $W_u$ are respectively the number of channels, the height and the width of the u-th layer convolution features of the input face picture in the pre-trained face recognition network, $u \in \{1, 2, 3, 4\}$.
10. The system for facial image restoration based on multi-guide map and adaptive feature fusion as claimed in claim 9, wherein the training network further comprises a step of constraining the feature distribution of the restoration result to be consistent with the feature distribution of the degraded image, the constraint using a style loss function in a specific manner:
the style loss is obtained by calculating a Gram matrix of the restored image and the undegraded image in a face recognition network feature space, and the method specifically comprises the following steps:
$$\ell_{\mathrm{style}} = \sum_{u=1}^{4} \frac{1}{C_u \times H_u \times W_u} \left\| \mathrm{Gram}\left(\Psi_u(\hat{I})\right) - \mathrm{Gram}\left(\Psi_u(I)\right) \right\|^2$$

in the formula, $C_u$, $H_u$ and $W_u$ are respectively the number of channels, the height and the width of the u-th layer convolution features of the input face picture in the pre-trained face recognition network, $u \in \{1, 2, 3, 4\}$.
11. The system for facial image restoration based on multi-guide map and adaptive feature fusion according to claim 10, wherein the training network constrains the facial image restoration network through discriminant loss by:
firstly, the training network obtains the discrimination loss through a self-attention-based discrimination network;
then the hinge loss is calculated on the features output by the discrimination network,
for learning of the discrimination network, the loss is defined as:

$$\ell_D = \mathbb{E}_{I}\left[\max\left(0,\ 1 - D(I)\right)\right] + \mathbb{E}_{\hat{I}}\left[\max\left(0,\ 1 + D(\hat{I})\right)\right]$$

for learning of the generation network under the discrimination network constraint, the loss is defined as:

$$\ell_G = -\,\mathbb{E}_{\hat{I}}\left[D(\hat{I})\right]$$
12. the system for facial image restoration based on multi-guide graph and adaptive feature fusion according to claim 11, wherein the training network adopts Adam optimization algorithm to train the whole network for image restoration end to end.
13. The system for facial image restoration based on multi-guide map and adaptive feature fusion according to claim 12, wherein the degradation map is obtained by sequentially performing blurring, down-sampling, noise adding and JPEG compression on a high-definition face image,
wherein the blurring adopts Gaussian blur and motion blur, with the Gaussian blur kernel standard deviation taken from a given range;
the down-sampling adopts the bicubic down-sampling method, with a sampling scale s ∈ {1.1, …};
the noise adding adopts Gaussian white noise, with a noise level n ∈ {0, 1, …};
and the JPEG compression quality parameter q ∈ {0, 10, …}.
CN202010039493.5A 2020-01-15 2020-01-15 Face image restoration system based on multi-guide image and self-adaptive feature fusion Active CN111260577B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010039493.5A CN111260577B (en) 2020-01-15 2020-01-15 Face image restoration system based on multi-guide image and self-adaptive feature fusion

Publications (2)

Publication Number Publication Date
CN111260577A CN111260577A (en) 2020-06-09
CN111260577B true CN111260577B (en) 2023-04-18

Family

ID=70945269

Country Status (1)

Country Link
CN (1) CN111260577B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111768354A (en) * 2020-08-05 2020-10-13 哈尔滨工业大学 Face image restoration system based on multi-scale face part feature dictionary
CN113554569B (en) * 2021-08-04 2022-03-08 哈尔滨工业大学 Face image restoration system based on double memory dictionaries
CN114170108B (en) * 2021-12-14 2024-04-12 哈尔滨工业大学 Natural scene image blind restoration system based on face degradation model migration

Citations (6)

Publication number Priority date Publication date Assignee Title
CN107103592A (en) * 2017-04-07 2017-08-29 南京邮电大学 A kind of Face Image with Pose Variations quality enhancement method based on double-core norm canonical
CN108537754A (en) * 2018-04-12 2018-09-14 哈尔滨工业大学 The facial image recovery system of figure is guided based on deformation
CN108932536A (en) * 2018-07-18 2018-12-04 电子科技大学 Human face posture method for reconstructing based on deep neural network
CN109740541A (en) * 2019-01-04 2019-05-10 重庆大学 A kind of pedestrian weight identifying system and method
CN110084775A (en) * 2019-05-09 2019-08-02 深圳市商汤科技有限公司 Image processing method and device, electronic equipment and storage medium
CN110570377A (en) * 2019-09-11 2019-12-13 辽宁工程技术大学 group normalization-based rapid image style migration method

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US8064712B2 (en) * 2007-01-24 2011-11-22 Utc Fire & Security Americas Corporation, Inc. System and method for reconstructing restored facial images from video
WO2010087162A1 (en) * 2009-01-27 2010-08-05 日本電気株式会社 Color image processing method, color image processing device and recording medium


Non-Patent Citations (5)

Title
Enhanced blind face restoration with multi-exemplar images and adaptive spatial feature fusion; Xiaoming Li et al.; 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2020-08-05; 2706-2715 *
Learning Warped Guidance for Blind Face Restoration; Xiaoming Li et al.; arXiv; 2018-04-13; 1-25 *
Research on restoration and recognition algorithms for face images; Zhang Yazhou; China Masters' Theses Full-text Database, Information Science and Technology; 2015-07-15 (No. 07); I138-1199 *
Research on face-guided inpainting based on deep neural networks; Ye Yuting; China Masters' Theses Full-text Database, Information Science and Technology; 2019-01-15 (No. 01); I138-3131 *
Theory, algorithms and applications of motion blur estimation; Pan Jinshan; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2018-08-15 (No. 08); I138-41 *


Similar Documents

Publication Publication Date Title
CN111260577B (en) Face image restoration system based on multi-guide image and self-adaptive feature fusion
CN108537754B (en) Face image restoration system based on deformation guide picture
CN113673307A (en) Light-weight video motion recognition method
CN112434655B (en) Gait recognition method based on adaptive confidence map convolution network
CN113096017B (en) Image super-resolution reconstruction method based on depth coordinate attention network model
CN112288627B (en) Recognition-oriented low-resolution face image super-resolution method
CN112347861A (en) Human body posture estimation method based on motion characteristic constraint
CN112507617B (en) Training method of SRFlow super-resolution model and face recognition method
CN112633220B (en) Human body posture estimation method based on bidirectional serialization modeling
CN112084952B (en) Video point location tracking method based on self-supervision training
CN112364838B (en) Method for improving handwriting OCR performance by utilizing synthesized online text image
CN111768354A (en) Face image restoration system based on multi-scale face part feature dictionary
CN111462274A (en) Human body image synthesis method and system based on SMPL model
CN114821764A (en) Gesture image recognition method and system based on KCF tracking detection
CN116030498A (en) Virtual garment running and showing oriented three-dimensional human body posture estimation method
CN113378949A (en) Dual-generation confrontation learning method based on capsule network and mixed attention
CN113989928A (en) Motion capturing and redirecting method
CN113421185A (en) StyleGAN-based mobile terminal face age editing method
CN114005046A (en) Remote sensing scene classification method based on Gabor filter and covariance pooling
CN110555379B (en) Human face pleasure degree estimation method capable of dynamically adjusting features according to gender
CN113222016B (en) Change detection method and device based on cross enhancement of high-level and low-level features
CN113869154B (en) Video actor segmentation method according to language description
CN111104929B (en) Multi-mode dynamic gesture recognition method based on 3D convolution and SPP
CN114240811A (en) Method for generating new image based on multiple images
CN114582002A (en) Facial expression recognition method combining attention module and second-order pooling mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant