CN111768354A - Face image restoration system based on multi-scale face part feature dictionary - Google Patents


Info

Publication number
CN111768354A
Authority
CN
China
Prior art keywords: face, scale, dictionary, features, feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010779169.7A
Other languages
Chinese (zh)
Inventor
左旺孟
李晓明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN202010779169.7A
Publication of CN111768354A
Legal status: Pending

Classifications

    • G06T 5/00: Image enhancement or restoration
    • G06T 5/70: Denoising; Smoothing
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G06F 18/23213: Non-hierarchical clustering techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/253: Fusion techniques of extracted features
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/08: Neural network learning methods
    • G06T 3/02: Affine transformations in the plane of the image
    • G06V 40/168: Human faces; feature extraction; face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)

Abstract

A face image restoration system based on a multi-scale face part feature dictionary belongs to the technical field of face image restoration. The invention addresses the problem that, in existing face image restoration technology, recovering a high-quality face image from a real low-quality face image requires guidance from a high-definition face image of the same person, which limits its application. The system comprises a face feature dictionary offline generation module, which extracts high-definition face part features from each sample image in a high-definition face image data set and obtains a face part feature dictionary by applying k-means clustering to the extraction results, and a face image restoration module, which extracts features from the degraded face image to be restored, fuses the feature extraction result with the face part feature dictionary to obtain part-enhanced face features to be restored, and reconstructs the face features to be restored to obtain the guided restoration result image. The invention is used for restoring low-quality images.

Description

Face image restoration system based on multi-scale face part feature dictionary
Technical Field
The invention relates to a face image restoration system based on a multi-scale face part feature dictionary, and belongs to the technical field of face image restoration.
Background
Face image restoration technology restores a low-quality face image (for example, one degraded by blur, heavy noise, or compression artifacts caused by long-distance shooting, low-quality capture equipment, or network transmission) into a high-quality face image. With the development of technology and equipment, people increasingly pursue high-definition images and videos, and mobile phone manufacturers likewise pursue high quality in the face images they capture. When a face image is unavoidably of low quality, the visual result is often unacceptable, and the real degradation of an image cannot be accurately simulated, so how to recover a high-quality image from a real low-quality image is a research hotspot for enterprises and researchers.
In recent years, deep learning has made breakthroughs in improving image quality and can significantly improve the visual quality of images. However, most current methods are limited to single-image restoration and learn a mapping from low-quality face images to high-quality images through a convolutional neural network. Because the degradation characteristics of real images cannot be accurately simulated, such methods cannot be applied to most real images and therefore do not achieve ideal robustness or effect.
To address these problems, some methods adopt one or more high-definition images of the same person as a guide to assist the restoration process of the network. Although this achieves a certain performance improvement, it requires that the identity of the degraded image is known in advance and that one or more high-definition guide images are provided, which greatly limits its range of application.
Disclosure of Invention
The invention provides a face image restoration system based on a multi-scale face part feature dictionary, aiming at the problem that, in existing face image restoration technology, obtaining a high-quality face image from a real low-quality face image requires guidance from a high-definition face image of the same person, which limits its application.
The invention discloses a face image restoration system based on a multi-scale face part feature dictionary, which comprises:
the face feature dictionary offline generation module: used for extracting high-definition face part features from each sample image in a high-definition face image data set, and obtaining a face part feature dictionary by applying k-means clustering to the extraction results;
the face image restoration module: used for extracting features from the degraded face image to be restored, and fusing the feature extraction result with the face part feature dictionary to obtain part-enhanced face features to be restored; and reconstructing the face features to be restored to obtain a guided restoration result image.
According to the face image restoration system based on the multi-scale face part feature dictionary, the face part feature dictionary obtained by the face feature dictionary off-line generation module comprises M scales, and M is an integer greater than or equal to 1.
According to the face image restoration system based on the multi-scale face part feature dictionary, when M is 4, the process of processing the sample image by adopting the Vggface model comprises the following steps:
sequentially performing convolution, activation, pooling, convolution, activation, convolution and activation operations on each sample image to obtain high-definition face part features with the scale of 1;
sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the high-definition face features with the scale of 1 to obtain high-definition face part features with the scale of 2;
sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the high-definition face features with the scale of 2 to obtain high-definition face part features with the scale of 3;
sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the high-definition face features with the scale of 3 to obtain high-definition face part features with the scale of 4;
the activation operation adopts a ReLU activation function, and the pooling operation adopts maximum pooling operation;
the first and second convolution operations are both 3 × 3 convolutions with 64 channels and a stride of 1;
the third and fourth convolution operations are both 3 × 3 convolutions with 128 channels and a stride of 1;
the fifth to ninth convolution operations are all 3 × 3 convolutions with 256 channels and a stride of 1;
the tenth to sixteenth convolution operations are all 3 × 3 convolutions with 512 channels and a stride of 1;
the high-definition face part feature with the scale of 1, the high-definition face part feature with the scale of 2, the high-definition face part feature with the scale of 3 and the high-definition face part feature with the scale of 4 are respectively processed through a dictionary generation module to obtain a face part feature dictionary with the corresponding scale;
the process of processing the input data by the dictionary generation module comprises the following steps:
acquiring high-definition face part features with the scale of 1, high-definition face part features with the scale of 2, high-definition face part features with the scale of 3 or high-definition face part features with the scale of 4;
then, carrying out a region alignment operation: acquiring the positions of the left eye, right eye, nose and mouth in the high-definition face part features by adopting a face key point detection algorithm; cutting the left eye, right eye, nose and mouth out of the corresponding high-definition face part features with RoIAlign according to the obtained part positions, to obtain the part features of the left eye, right eye, nose and mouth as the extraction result;
respectively obtaining K1 clustering centers of the left eye, K2 clustering centers of the right eye, K3 clustering centers of the nose and K4 clustering centers of the mouth of all the part characteristics of all the parts in the extraction result in a K-means clustering mode; wherein K1 cluster centers correspond to the left-eye dictionary, K2 cluster centers correspond to the right-eye dictionary, K3 cluster centers correspond to the nose dictionary, and K4 cluster centers correspond to the mouth dictionary; k1, K2, K3 and K4 are all greater than or equal to 1;
the face part feature dictionary with the scale of 1 is obtained corresponding to the high-definition face part feature with the scale of 1, the face part feature dictionary with the scale of 2 is obtained corresponding to the high-definition face part feature with the scale of 2, the face part feature dictionary with the scale of 3 is obtained corresponding to the high-definition face part feature with the scale of 3, and the face part feature dictionary with the scale of 4 is obtained corresponding to the high-definition face part feature with the scale of 4.
According to the facial image restoration system based on the multi-scale facial feature dictionary, the facial image restoration module comprises:
a primary face feature extraction module: sequentially performing convolution, activation, pooling, convolution, activation and convolution operations on a degraded face image to be restored to obtain degraded face features with the scale of 1;
a secondary face feature extraction module: sequentially performing pooling, activation, convolution, activation and convolution operations on the degraded human face features with the scale of 1 to obtain degraded human face features with the scale of 2;
the three-level face feature extraction module: sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the degraded human face features with the scale of 2 to obtain degraded human face features with the scale of 3;
four-level face feature extraction module: sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the degraded human face features with the scale of 3 to obtain degraded human face features with the scale of 4;
the activation operation adopts a ReLU activation function, and the pooling operation adopts maximum pooling operation;
the first and second convolution operations are both 3 × 3 convolutions with 64 channels and a stride of 1;
the third and fourth convolution operations are both 3 × 3 convolutions with 128 channels and a stride of 1;
the fifth to ninth convolution operations are all 3 × 3 convolutions with 256 channels and a stride of 1;
the tenth to sixteenth convolution operations are all 3 × 3 convolutions with 512 channels and a stride of 1;
the dictionary feature guidance enhancement module with the scale of 1: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 1 and the face part feature dictionary with the scale of 1 to obtain first-level face features to be restored after part enhancement;
scale 2 dictionary feature guidance enhancement module: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 2 and the face part feature dictionary with the scale of 2 to obtain second-level face features to be restored after part enhancement;
scale 3 dictionary feature guidance enhancement module: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 3 and the face part feature dictionary with the scale of 3 to obtain three-level face features to be restored after part enhancement;
the dictionary feature guidance enhancement module with the scale of 4: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 4 and the face part feature dictionary with the scale of 4 to obtain four-level face features to be restored after part enhancement;
a fourth-level reconstruction module: the system is used for carrying out affine transformation on the degraded human face features with the scale of 4 and four-level human face features to be restored, and inputting transformation results into network decoding features to obtain four-level reconstruction result features;
a third-level reconstruction module: the system is used for carrying out affine transformation on the four-level reconstruction result image and the three-level face features to be restored and inputting the transformation result into model network decoding features to obtain three-level reconstruction result features;
a secondary reconstruction module: the system is used for carrying out affine transformation on the three-level reconstruction result image and the second-level face feature to be restored and inputting the transformation result into the model network decoding feature to obtain the second-level reconstruction result feature;
a primary reconstruction module: used for performing affine transformation on the secondary reconstruction result image and the primary face features to be restored, and inputting the transformation result into the model network decoding features to obtain a primary reconstruction result image;
an output module: and the primary reconstruction result image is output as a guide restoration result image.
According to the face image restoration system based on the multi-scale face part feature dictionary, the processing process of the dictionary feature guidance enhancement module with the scale of 1, the dictionary feature guidance enhancement module with the scale of 2, the dictionary feature guidance enhancement module with the scale of 3 and the dictionary feature guidance enhancement module with the scale of 4 on input data is the same; taking the dictionary feature guidance enhancement module with the scale of 1 as an example for explanation:
the scale-1 dictionary feature guidance enhancement module comprises:
face part feature extraction module: used for obtaining the part features of the left eye, the right eye, the nose and the mouth from the degraded face features with the scale of 1 according to the face key points;
the dictionary feature self-adaptive normalization module: performing self-adaptive normalization operation on the part characteristics of the left eye, the right eye, the nose and the mouth in the face part characteristic dictionary with the scale of 1 by combining the left eye dictionary, the right eye dictionary, the nose dictionary and the mouth dictionary to obtain normalized dictionary characteristics;
and traversing the dictionary module: traversing in the normalized dictionary features to obtain corresponding features closest to the features of the left eye, the right eye, the nose and the mouth as matching dictionary features;
a confidence prediction module: the system is used for predicting confidence according to residual errors between the part features of the left eye, the right eye, the nose and the mouth and the corresponding matched dictionary features to obtain self-adaptive fusion features of all parts;
a restoration module: and the method is used for replacing the self-adaptive fusion features of all parts into the degraded human face image to be restored according to the human face key points to obtain the first-level human face features to be restored.
According to the face image restoration system based on the multi-scale face part feature dictionary, the method for acquiring the normalized dictionary features by the dictionary feature adaptive normalization module comprises the following steps:

performing an adaptive normalization operation on the part features of the left eye, the right eye, the nose and the mouth:

\hat{F}^{s,c}_{dic,k} = \sigma(F^{s,c}_{d}) \cdot \frac{F^{s,c}_{dic,k} - \mu(F^{s,c}_{dic,k})}{\sigma(F^{s,c}_{dic,k})} + \mu(F^{s,c}_{d})

where \hat{F}^{s,c}_{dic,k} is the normalized dictionary feature; F^{s,c}_{dic,k} is the k-th cluster center of the c-th part feature with scale s constructed offline; \sigma is the variance operation; \mu is the mean operation; F^{s,c}_{d} is the c-th part feature of the degraded face image to be restored with scale s, with c ∈ {left eye, right eye, nose, mouth} and s = 1 here.

The process of traversing the dictionary module to obtain the matching dictionary features comprises:

calculating, for each of the part features of the left eye, right eye, nose and mouth, the closest dictionary feature:

S^{s,c}_{k} = \langle F^{s,c}_{d}, \hat{F}^{s,c}_{dic,k} \rangle, \qquad k^{*} = \arg\max_{k} S^{s,c}_{k}

where S^{s,c}_{k} is the confidence of the matching dictionary feature and \langle \cdot, \cdot \rangle is the inner product operation; the cluster center \hat{F}^{s,c}_{dic,k^{*}} with the highest score is taken as the matching dictionary feature.
According to the face image restoration system based on the multi-scale face part feature dictionary, the process of obtaining the adaptive fusion features of each part by the confidence prediction module comprises:

F^{s,c}_{fuse} = F^{s,c}_{d} + \hat{F}^{s,c}_{dic,k^{*}} \odot C(\hat{F}^{s,c}_{dic,k^{*}} - F^{s,c}_{d}; \Theta_{C})

where F^{s,c}_{fuse} is the adaptive fusion feature; \hat{F}^{s,c}_{dic,k^{*}} is the matched closest dictionary feature; C is the confidence prediction network and \Theta_{C} its learnable parameters; the confidence prediction network comprises two layers of 3 × 3 convolution with a stride of 1.

The primary reconstruction module obtains a scale parameter \alpha and a shift parameter \beta from the secondary reconstruction result image and the primary face features to be restored through two layers of 3 × 3 convolution with a stride of 1, and computes the primary reconstruction result image SFT_{s} as:

SFT_{s} = \alpha \odot F^{s}_{dec} + \beta

where F^{s}_{dec} is the network-decoded feature with scale s.
According to the face image restoration system based on the multi-scale face part feature dictionary, the restoration model further comprises a constraint form of the training network. The training network constrains the learning of the whole network through a reconstruction loss, which comprises the pixel-space and feature-space losses between the primary reconstruction result image and its corresponding undegraded high-definition image:

\ell_{rec} = \lambda_{l2} \| \hat{I} - I^{h} \|_{2}^{2} + \sum_{m} \frac{\lambda_{p,m}}{C_{m} H_{m} W_{m}} \| \Psi_{m}(\hat{I}) - \Psi_{m}(I^{h}) \|_{2}^{2}

where \lambda_{l2} is the pixel-space loss weight; \lambda_{p,m} is the feature-space loss weight; \hat{I} is the primary reconstruction result image; I^{h} is the corresponding undegraded high-definition image; C_{m}, H_{m} and W_{m} are, in order, the channel number, height and width of the m-th layer features of the primary reconstruction result image; and \Psi_{m} extracts the m-th layer convolution features of a pre-trained face recognition network.

The training network further comprises a multi-scale discriminant loss function:

the guided restoration result image is down-sampled by factors r = {1, 2, 4, 8} to obtain 4 groups of images with different resolutions, and the loss is computed through four discriminant networks in a hinge-loss form. For the learning of the discriminant networks, the loss \ell_{adv,D} is defined as:

\ell_{adv,D} = \sum_{r=1}^{R} \Big( \mathbb{E}_{I^{h}\downarrow_{r} \sim P(I^{h}\downarrow_{r})} \big[ \max(0, 1 - D_{r}(I^{h}\downarrow_{r})) \big] + \mathbb{E}_{I^{d} \sim P(I^{d})} \big[ \max(0, 1 + D_{r}(\Phi(I^{d}; L_{d}, Dic, \Theta)\downarrow_{r})) \big] \Big)

where R is the upper limit of the scale; D_{r} is the discriminator with scale r; I^{h}\downarrow_{r} is the undegraded high-definition image down-sampled by a factor of r, with \downarrow_{r} denoting down-sampling by a factor of r; \mathbb{E} is the expectation; and P(I^{h}\downarrow_{r}) is the distribution of I^{h}\downarrow_{r}.

For the learning of the generative network under the discriminant network constraint, the loss \ell_{adv,G} is defined as:

\ell_{adv,G} = - \sum_{r=1}^{R} \lambda_{a,r} \, \mathbb{E}_{I^{d} \sim P(I^{d})} \big[ D_{r}(\Phi(I^{d}; L_{d}, Dic, \Theta)\downarrow_{r}) \big]

where \lambda_{a,r} is the weight of the discriminant network with scale r; L_{d} is the face key points; Dic is the constructed face part dictionary; \Theta is the learnable parameters of the model; I^{d} is the degraded face image to be restored; P(I^{d}) is the distribution of I^{d}; and \Phi is the restoration network.
according to the face image restoration system based on the multi-scale face part feature dictionary, the training network performs end-to-end training on other network structures except the first-level face feature extraction module, the second-level face feature extraction module, the third-level face feature extraction module and the fourth-level face feature extraction module by adopting an Adam optimization algorithm.
According to the face image restoration system based on the multi-scale face part feature dictionary, the degraded face image to be restored is obtained by sequentially blurring, down-sampling, adding noise to, and JPEG-compressing a high-definition face image. The blurring adopts Gaussian blur and motion blur, with the Gaussian blur kernel standard deviation taken from {1:0.1:P}; the down-sampling adopts bicubic interpolation with sampling scale s ∈ {1:0.1:S}; the noise adopts Gaussian white noise with noise level n ∈ {0, 1:0.1:N}; the JPEG compression adopts quality parameter q ∈ {0, 10:0.1:Q}; where P ≥ 5, S ≥ 8, N ≥ 15 and Q ≥ 80;
the whole network is trained through the constructed low-quality degraded human face image to be restored and the corresponding high-definition human face image, and the obtained trained network is used for restoring the low-quality image.
The invention has the following beneficial effects: by constructing a high-definition face part dictionary in place of a guide image, image restoration is no longer limited in its application range and can be applied to most face enhancement scenarios; compared with one or more guide images, the face part dictionary constructed by the invention allows high-quality features of higher similarity to be selected as guidance, greatly improving the quality of guided enhancement.
The invention provides a face part dictionary algorithm for assisting face image enhancement, aiming at the problem that real low-quality images cannot be effectively restored by the prior art and the problem that guide-image-based enhancement methods require one or more high-definition images of the same identity.
Drawings
FIG. 1 is a flow chart of a face image restoration system based on a multi-scale face part feature dictionary according to the present invention;
FIG. 2 is a block flow diagram of a face image restoration module;
FIG. 3 is a schematic diagram of a network structure for generating a face region feature dictionary;
fig. 4 is a schematic diagram of a network structure in which a face part feature dictionary is migrated to a face image restoration module to realize image restoration.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The invention is further described with reference to the following drawings and specific examples, which are not intended to be limiting.
In a first embodiment, as shown in fig. 1 to 4, the present invention provides a face image restoration system based on a multi-scale face part feature dictionary, including:
the face feature dictionary offline generation module 100: used for extracting high-definition face part features from each sample image in a high-definition face image data set, and obtaining a face part feature dictionary by applying k-means clustering to the extraction results;
face image restoration module 200: used for extracting features from the degraded face image to be restored, and fusing the feature extraction result with the face part feature dictionary to obtain part-enhanced face features to be restored; and reconstructing the face features to be restored to obtain a guided restoration result image.
In the embodiment, the sample images forming the high-definition face image data set have different postures, expressions, illumination conditions and the like, so that the diversity of the sample images is ensured.
In the embodiment, the face part feature dictionary is generated in an off-line mode, which is beneficial to greatly improving the efficiency of low-quality image restoration.
Further, the face feature dictionary obtained by the face feature dictionary offline generation module 100 includes M scales, where M is an integer greater than or equal to 1.
Still further, with reference to fig. 1 to 4, when M is 4, the processing of the sample image by using the VggFace model includes:
sequentially performing convolution, activation, pooling, convolution, activation, convolution and activation operations on each sample image to obtain high-definition face part features with the scale of 1;
sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the high-definition face features with the scale of 1 to obtain high-definition face part features with the scale of 2;
sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the high-definition face features with the scale of 2 to obtain high-definition face part features with the scale of 3;
sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the high-definition face features with the scale of 3 to obtain high-definition face part features with the scale of 4;
the activation operation adopts a ReLU activation function, and the pooling operation adopts maximum pooling operation;
the first and second convolution operations are both 3 × 3 convolutions with 64 channels and a stride of 1;
the third and fourth convolution operations are both 3 × 3 convolutions with 128 channels and a stride of 1;
the fifth to ninth convolution operations are all 3 × 3 convolutions with 256 channels and a stride of 1;
the tenth to sixteenth convolution operations are all 3 × 3 convolutions with 512 channels and a stride of 1;
the high-definition face part feature with the scale of 1, the high-definition face part feature with the scale of 2, the high-definition face part feature with the scale of 3 and the high-definition face part feature with the scale of 4 are respectively processed through a dictionary generation module to obtain a face part feature dictionary with the corresponding scale;
the process of processing the input data by the dictionary generation module comprises the following steps:
acquiring high-definition face part features with the scale of 1, high-definition face part features with the scale of 2, high-definition face part features with the scale of 3 or high-definition face part features with the scale of 4;
then, carrying out a region alignment operation: acquiring the positions of the left eye, right eye, nose and mouth in the high-definition face part features (extracted by the fixed-parameter VggFace model) by adopting a face key point detection algorithm; cutting the left eye, right eye, nose and mouth out of the corresponding high-definition face part features with RoIAlign according to the obtained part positions, to obtain the part features of the left eye, right eye, nose and mouth as the extraction result; in this way, a large number of high-quality features of the individual parts can be obtained (a sketch of this construction is given after this list);
respectively obtaining K1 clustering centers of the left eye, K2 clustering centers of the right eye, K3 clustering centers of the nose and K4 clustering centers of the mouth of all the part characteristics of all the parts in the extraction result in a K-means clustering mode; wherein K1 cluster centers correspond to the left-eye dictionary, K2 cluster centers correspond to the right-eye dictionary, K3 cluster centers correspond to the nose dictionary, and K4 cluster centers correspond to the mouth dictionary; k1, K2, K3 and K4 are all greater than or equal to 1;
the face part feature dictionary with the scale of 1 is obtained corresponding to the high-definition face part feature with the scale of 1, the face part feature dictionary with the scale of 2 is obtained corresponding to the high-definition face part feature with the scale of 2, the face part feature dictionary with the scale of 3 is obtained corresponding to the high-definition face part feature with the scale of 3, and the face part feature dictionary with the scale of 4 is obtained corresponding to the high-definition face part feature with the scale of 4.
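The following Python sketch illustrates the offline dictionary construction just described. It is illustrative only: the function names, tensor shapes and the choice of torchvision's roi_align and scikit-learn's KMeans are assumptions of this example, not part of the patent.

# Illustrative sketch of the offline part-dictionary construction.
import torch
from sklearn.cluster import KMeans
from torchvision.ops import roi_align

def crop_part(feat, box, out_size=8):
    """Crop one part (e.g. the left eye) from a feature map with RoIAlign.

    feat: (1, C, H, W) feature map of one sample image at one scale.
    box:  (x1, y1, x2, y2) part location from a face key point detector,
          given in the coordinate system of the feature map.
    """
    rois = torch.tensor([[0.0, *box]], dtype=feat.dtype)   # batch index 0
    return roi_align(feat, rois, output_size=out_size)     # (1, C, out, out)

def build_part_dictionary(crops, k):
    """Cluster N cropped part features into a K-entry dictionary."""
    n, c, h, w = crops.shape
    km = KMeans(n_clusters=k, n_init=10).fit(crops.reshape(n, -1).numpy())
    return torch.from_numpy(km.cluster_centers_).reshape(k, c, h, w)

# One dictionary per part and per scale, e.g.:
# dic[s]['left_eye'] = build_part_dictionary(left_eye_crops[s], K1)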
This embodiment extracts features at S scales, where S is 4; the network structure comprises convolutional layers C1 to C16 and pooling operations A1 to A4;
the convolution layer C1 performs a first convolution on a high-definition input sample image, and performs a first activation operation;
convolutional layer C2 performs a second convolution on the output of convolutional layer C1, a second activation operation;
pooling operation A1 performs a first pooling operation on the output of convolutional layer C2;
convolutional layer C3 performs a third convolution on the output of pooling operation A1, a third activation operation;
convolutional layer C4 performs a fourth convolution on the output of convolutional layer C3, a fourth activation operation;
the output of the convolutional layer C4 is the high-definition face part features with the scale of 1;
pooling layer A2 performs a fifth activation operation on the output of convolutional layer C4, a second pooling operation;
convolutional layer C5 performs a fifth convolution operation, a sixth activation operation, on the output of pooling layer A2;
convolutional layer C6 performs a sixth convolution operation, a seventh activation operation on the convolutional layer C5 output;
convolutional layer C7 performs the seventh convolution operation, the eighth activation operation on the convolutional layer C6 output;
convolutional layer C8 performs an eighth convolution operation on the output of convolutional layer C7;
the output of the convolutional layer C8 is the high-definition face part features with the scale of 2;
the pooling layer A3 performs a ninth activation operation and a third pooling operation on the output of the convolutional layer C8;
convolutional layer C9 performs the ninth convolution operation, the tenth activation operation on the output of pooling layer A3;
the convolutional layer C10 performs the tenth convolution operation, the eleventh activation operation on the output of the convolutional layer C9;
the convolutional layer C11 performs the eleventh convolution operation, the twelfth activation operation on the output of the convolutional layer C10;
convolutional layer C12 performs a twelfth convolution operation on the output of convolutional layer C11;
the output of the convolutional layer C12 is the high-definition face part features with the scale of 3;
pooling layer A4 performs a thirteenth activation operation, a fourth pooling operation on the output of convolutional layer C12;
convolutional layer C13 performs a thirteenth convolution operation, a fourteenth activation operation on the output of pooling layer A4;
convolutional layer C14 performs a fourteenth convolution operation, a fifteenth activation operation on the output of convolutional layer C13;
the convolutional layer C15 performs a fifteenth convolution operation, a sixteenth activation operation on the output of the convolutional layer C14;
convolutional layer C16 performs a sixteenth convolution operation on the output of convolutional layer C15;
the output of the convolutional layer C16 is the high-definition face part features with the scale of 4 (a condensed sketch of this extractor is given below).
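For reference, the C1 to C16 listing above follows the first four convolution stages of a VGG-style network. The sketch below is a hand-written stand-in with channel widths assumed where the text is ambiguous, not the pretrained VggFace model the patent relies on.

import torch
import torch.nn as nn

def vgg_block(cin, cout, n_convs):
    # n_convs 3x3 stride-1 convolutions, each followed by a ReLU
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(cin if i == 0 else cout, cout, 3, 1, 1),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

class MultiScaleEncoder(nn.Module):
    """Four-scale feature extractor in the spirit of the C1..C16 listing."""
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(vgg_block(3, 64, 2), nn.MaxPool2d(2),
                                    vgg_block(64, 128, 2))
        self.stage2 = nn.Sequential(nn.MaxPool2d(2), vgg_block(128, 256, 4))
        self.stage3 = nn.Sequential(nn.MaxPool2d(2), vgg_block(256, 512, 4))
        self.stage4 = nn.Sequential(nn.MaxPool2d(2), vgg_block(512, 512, 4))

    def forward(self, x):
        f1 = self.stage1(x)   # scale-1 features
        f2 = self.stage2(f1)  # scale-2 features
        f3 = self.stage3(f2)  # scale-3 features
        f4 = self.stage4(f3)  # scale-4 features
        return f1, f2, f3, f4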
Still further, as shown in fig. 2, the facial image restoration module 200 includes:
a primary face feature extraction module: sequentially performing convolution, activation, pooling, convolution, activation and convolution operations on a degraded face image to be restored to obtain degraded face features with the scale of 1;
a secondary face feature extraction module: sequentially performing pooling, activation, convolution, activation and convolution operations on the degraded human face features with the scale of 1 to obtain degraded human face features with the scale of 2;
the three-level face feature extraction module: sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the degraded human face features with the scale of 2 to obtain degraded human face features with the scale of 3;
four-level face feature extraction module: sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the degraded human face features with the scale of 3 to obtain degraded human face features with the scale of 4;
the activation operation adopts a ReLU activation function, and the pooling operation adopts maximum pooling operation;
the first and second convolution operations are both 3 × 3 convolutions with 64 channels and a stride of 1;
the third and fourth convolution operations are both 3 × 3 convolutions with 128 channels and a stride of 1;
the fifth to ninth convolution operations are all 3 × 3 convolutions with 256 channels and a stride of 1;
the tenth to sixteenth convolution operations are all 3 × 3 convolutions with 512 channels and a stride of 1;
the dictionary feature guidance enhancement module with the scale of 1: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 1 and the face part feature dictionary with the scale of 1 to obtain first-level face features to be restored after part enhancement;
scale 2 dictionary feature guidance enhancement module: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 2 and the face part feature dictionary with the scale of 2 to obtain second-level face features to be restored after part enhancement;
scale 3 dictionary feature guidance enhancement module: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 3 and the face part feature dictionary with the scale of 3 to obtain three-level face features to be restored after part enhancement;
the dictionary feature guidance enhancement module with the scale of 4: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 4 and the face part feature dictionary with the scale of 4 to obtain four-level face features to be restored after part enhancement;
a fourth-level reconstruction module: the system is used for carrying out affine transformation on the degraded human face features with the scale of 4 and four-level human face features to be restored, and inputting transformation results into network decoding features to obtain four-level reconstruction result features;
a third-level reconstruction module: the system is used for carrying out affine transformation on the four-level reconstruction result image and the three-level face features to be restored and inputting the transformation result into model network decoding features to obtain three-level reconstruction result features;
a secondary reconstruction module: the system is used for carrying out affine transformation on the three-level reconstruction result image and the second-level face feature to be restored and inputting the transformation result into the model network decoding feature to obtain the second-level reconstruction result feature;
a primary reconstruction module: used for performing affine transformation on the secondary reconstruction result image and the primary face features to be restored, and inputting the transformation result into the model network decoding features to obtain a primary reconstruction result image;
an output module: and the primary reconstruction result image is output as a guide restoration result image.
In this embodiment, the feature extraction performed by the first-level to fourth-level face feature extraction modules specifically comprises:
the first-level human face feature extraction module comprises the following steps:
the convolution layer C1 performs the first convolution on the degraded human face image to be restored, and performs the first activation operation;
convolutional layer C2 performs a second convolution on the output of convolutional layer C1, a second activation operation;
pooling operation A1 performs a first pooling operation on the output of convolutional layer C2;
convolutional layer C3 performs a third convolution on the output of pooling operation A1, a third activation operation;
convolutional layer C4 performs a fourth convolution on the output of convolutional layer C3, a fourth activation operation;
the output of convolutional layer C4 is the degraded face features with the scale of 1;
the secondary face feature extraction module comprises the following steps:
pooling layer A2 performs a fifth activation operation on the output of convolutional layer C4, a second pooling operation;
convolutional layer C5 performs a fifth convolution operation, a sixth activation operation, on the output of pooling layer A2;
convolutional layer C6 performs a sixth convolution operation, a seventh activation operation on the convolutional layer C5 output;
convolutional layer C7 performs the seventh convolution operation, the eighth activation operation on the convolutional layer C6 output;
convolutional layer C8 performs an eighth convolution operation on the output of convolutional layer C7;
the output of convolutional layer C8 is the degraded face features with the scale of 2;
the pooling layer A3 performs a ninth activation operation and a third pooling operation on the output of the convolutional layer C8;
convolutional layer C9 performs the ninth convolution operation, the tenth activation operation on the output of pooling layer A3;
the convolutional layer C10 performs the tenth convolution operation, the eleventh activation operation on the output of the convolutional layer C9;
the convolutional layer C11 performs the eleventh convolution operation, the twelfth activation operation on the output of the convolutional layer C10;
convolutional layer C12 performs a twelfth convolution operation on the output of convolutional layer C11;
the output of convolutional layer C12 is the degraded face features with the scale of 3;
pooling layer A4 performs a thirteenth activation operation, a fourth pooling operation on the output of convolutional layer C12;
convolutional layer C13 performs a thirteenth convolution operation, a fourteenth activation operation on the output of pooling layer A4;
convolutional layer C14 performs a fourteenth convolution operation, a fifteenth activation operation on the output of convolutional layer C13;
the convolutional layer C15 performs a fifteenth convolution operation, a sixteenth activation operation on the output of the convolutional layer C14;
convolutional layer C16 performs a sixteenth convolution operation on the output of convolutional layer C15;
the output of convolutional layer C16 is the degraded face features with the scale of 4.
Still further, with reference to fig. 2, the processing procedure of the dictionary feature guidance enhancement module with scale 1, the processing procedure of the dictionary feature guidance enhancement module with scale 2, the processing procedure of the dictionary feature guidance enhancement module with scale 3, and the processing procedure of the dictionary feature guidance enhancement module with scale 4 are the same; taking the dictionary feature guidance enhancement module with the scale of 1 as an example for explanation:
the scale-1 dictionary feature guidance enhancement module comprises:
face part feature extraction module: used for obtaining the part features of the left eye, the right eye, the nose and the mouth from the degraded face features with the scale of 1 according to the face key points;
the dictionary feature self-adaptive normalization module: performing self-adaptive normalization operation on the part characteristics of the left eye, the right eye, the nose and the mouth in the face part characteristic dictionary with the scale of 1 by combining the left eye dictionary, the right eye dictionary, the nose dictionary and the mouth dictionary to obtain normalized dictionary characteristics;
and traversing the dictionary module: traversing in the normalized dictionary features to obtain corresponding features closest to the features of the left eye, the right eye, the nose and the mouth as matching dictionary features;
a confidence prediction module: the system is used for predicting confidence according to residual errors between the part features of the left eye, the right eye, the nose and the mouth and the corresponding matched dictionary features to obtain self-adaptive fusion features of all parts;
a restoration module: used for replacing, according to the face key points, the corresponding regions of the degraded face features with the adaptive fusion features of each part, to obtain the primary face features to be restored. Specifically, the enhanced features of each part replace the corresponding features in the input image features at the positions detected by the face key points, yielding the enhanced face features to be restored for the next stage.
In this embodiment, the four dictionary feature guidance enhancement modules add the enhanced features to the network decoding features through feature affine transformation, and the four enhancement modules respectively operate in feature spaces of different scales, so that a coarse-to-fine guidance restoration result can be obtained.
Still further, the method for obtaining the normalized dictionary features by the dictionary feature adaptive normalization module comprises the following steps:

performing an adaptive normalization operation on the part features of the left eye, the right eye, the nose and the mouth:

\hat{F}^{s,c}_{dic,k} = \sigma(F^{s,c}_{d}) \cdot \frac{F^{s,c}_{dic,k} - \mu(F^{s,c}_{dic,k})}{\sigma(F^{s,c}_{dic,k})} + \mu(F^{s,c}_{d})

where \hat{F}^{s,c}_{dic,k} is the normalized dictionary feature; F^{s,c}_{dic,k} is the k-th cluster center of the c-th part feature with scale s constructed offline; \sigma is the variance operation; \mu is the mean operation; F^{s,c}_{d} is the c-th part feature of the degraded face image to be restored with scale s, with c ∈ {left eye, right eye, nose, mouth} and s = 1 here.

Based on the above operation, the adaptive normalization is applied to all dictionary features of each part c.

The process of traversing the dictionary module to obtain the matching dictionary features comprises:

calculating, for each of the part features of the left eye, right eye, nose and mouth, the closest dictionary feature:

S^{s,c}_{k} = \langle F^{s,c}_{d}, \hat{F}^{s,c}_{dic,k} \rangle, \qquad k^{*} = \arg\max_{k} S^{s,c}_{k}

where S^{s,c}_{k} is the confidence of the matching dictionary feature and \langle \cdot, \cdot \rangle is the inner product operation, which can be implemented as a convolution operation with the bias fixed to 0. This outputs the similarity between each part of the current input image and every cluster center of the dictionary, and the feature with the highest score, i.e. the most similar feature, is selected as the matching dictionary feature.
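A minimal sketch of the adaptive normalization and dictionary traversal just described, assuming the part feature and every dictionary entry share the same spatial size; the zero-bias convolution implements the inner product, as noted above.

import torch
import torch.nn.functional as F

def adain_normalize(dic, part, eps=1e-8):
    """Re-normalize dictionary entries to the statistics of the input part.

    dic:  (K, C, H, W) cluster centers of one part at one scale.
    part: (1, C, H, W) part feature cropped from the degraded input.
    """
    d_mu = dic.mean(dim=(2, 3), keepdim=True)
    d_sig = dic.std(dim=(2, 3), keepdim=True) + eps
    p_mu = part.mean(dim=(2, 3), keepdim=True)
    p_sig = part.std(dim=(2, 3), keepdim=True)
    return p_sig * (dic - d_mu) / d_sig + p_mu

def match_dictionary(dic_norm, part):
    """Score every normalized entry with a zero-bias convolution (inner
    product) and return the best-matching entry and its score."""
    scores = F.conv2d(part, weight=dic_norm, bias=None)   # (1, K, 1, 1)
    k_star = int(scores.flatten().argmax())
    return dic_norm[k_star:k_star + 1], float(scores.flatten()[k_star])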
Still further, the process of obtaining the adaptive fusion features of each part by the confidence prediction module comprises:

F^{s,c}_{fuse} = F^{s,c}_{d} + \hat{F}^{s,c}_{dic,k^{*}} \odot C(\hat{F}^{s,c}_{dic,k^{*}} - F^{s,c}_{d}; \Theta_{C})

where F^{s,c}_{fuse} is the adaptive fusion feature; \hat{F}^{s,c}_{dic,k^{*}} is the matched closest dictionary feature; C is the confidence prediction network and \Theta_{C} its learnable parameters; the confidence prediction network comprises two layers of 3 × 3 convolution with a stride of 1.

The primary reconstruction module obtains a scale parameter \alpha and a shift parameter \beta from the secondary reconstruction result image and the primary face features to be restored through two layers of 3 × 3 convolution with a stride of 1, and computes the primary reconstruction result image SFT_{s} as:

SFT_{s} = \alpha \odot F^{s}_{dec} + \beta

where F^{s}_{dec} is the network-decoded feature with scale s.
In the embodiment, for the matched dictionary features, a confidence coefficient is predicted according to a residual error between the dictionary features and the input part features, and the confidence coefficient is applied to the dictionary features and is added back to the input part features.
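The confidence-guided fusion and the affine (SFT-style) reconstruction step can be sketched as below. The two-layer 3 × 3 heads follow the text; the sigmoid on the confidence map and the concatenation of decoder and guidance features are assumptions of this example.

import torch
import torch.nn as nn

class ConfidenceFusion(nn.Module):
    """Predict a confidence map from the residual and blend the matched
    dictionary feature back into the degraded part feature."""
    def __init__(self, channels):
        super().__init__()
        self.conf = nn.Sequential(                 # two 3x3 stride-1 convs
            nn.Conv2d(channels, channels, 3, 1, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, 1, 1), nn.Sigmoid())

    def forward(self, part, matched):
        c = self.conf(matched - part)              # confidence from residual
        return part + matched * c

class SFT(nn.Module):
    """Spatial feature transform: alpha * decoded + beta, with alpha and
    beta predicted from the concatenated decoder and guidance features."""
    def __init__(self, channels):
        super().__init__()
        self.alpha = nn.Sequential(nn.Conv2d(channels * 2, channels, 3, 1, 1),
                                   nn.ReLU(inplace=True),
                                   nn.Conv2d(channels, channels, 3, 1, 1))
        self.beta = nn.Sequential(nn.Conv2d(channels * 2, channels, 3, 1, 1),
                                  nn.ReLU(inplace=True),
                                  nn.Conv2d(channels, channels, 3, 1, 1))

    def forward(self, decoded, guidance):
        g = torch.cat([decoded, guidance], dim=1)
        return self.alpha(g) * decoded + self.beta(g)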
Still further, the restoration model further comprises a constraint form of the training network. The training network constrains the learning of the whole network through a reconstruction loss, which comprises the pixel-space and feature-space losses between the primary reconstruction result image and its corresponding undegraded high-definition image:

\ell_{rec} = \lambda_{l2} \| \hat{I} - I^{h} \|_{2}^{2} + \sum_{m} \frac{\lambda_{p,m}}{C_{m} H_{m} W_{m}} \| \Psi_{m}(\hat{I}) - \Psi_{m}(I^{h}) \|_{2}^{2}

where \lambda_{l2} is the pixel-space loss weight; \lambda_{p,m} is the feature-space loss weight; \hat{I} is the primary reconstruction result image; I^{h} is the corresponding undegraded high-definition image; C_{m}, H_{m} and W_{m} are, in order, the channel number, height and width of the m-th layer features of the primary reconstruction result image; and \Psi_{m} extracts the m-th layer convolution features of a pre-trained face recognition network.

The training network further comprises a multi-scale discriminant loss function:

the guided restoration result image is down-sampled by factors r = {1, 2, 4, 8} to obtain 4 groups of images with different resolutions, and the loss is computed through four discriminant networks in a hinge-loss form. For the learning of the discriminant networks, the loss \ell_{adv,D} is defined as:

\ell_{adv,D} = \sum_{r=1}^{R} \Big( \mathbb{E}_{I^{h}\downarrow_{r} \sim P(I^{h}\downarrow_{r})} \big[ \max(0, 1 - D_{r}(I^{h}\downarrow_{r})) \big] + \mathbb{E}_{I^{d} \sim P(I^{d})} \big[ \max(0, 1 + D_{r}(\Phi(I^{d}; L_{d}, Dic, \Theta)\downarrow_{r})) \big] \Big)

where R is the upper limit of the scale; D_{r} is the discriminator with scale r; I^{h}\downarrow_{r} is the undegraded high-definition image down-sampled by a factor of r, with \downarrow_{r} denoting down-sampling by a factor of r; \mathbb{E} is the expectation; and P(I^{h}\downarrow_{r}) is the distribution of I^{h}\downarrow_{r}.

For the learning of the generative network under the discriminant network constraint, the loss \ell_{adv,G} is defined as:

\ell_{adv,G} = - \sum_{r=1}^{R} \lambda_{a,r} \, \mathbb{E}_{I^{d} \sim P(I^{d})} \big[ D_{r}(\Phi(I^{d}; L_{d}, Dic, \Theta)\downarrow_{r}) \big]

where \lambda_{a,r} is the weight of the discriminant network with scale r; L_{d} is the face key points; Dic is the constructed face part dictionary; \Theta is the learnable parameters of the model; I^{d} is the degraded face image to be restored; P(I^{d}) is the distribution of I^{d}; and \Phi is the restoration network.
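Assuming \Psi_{m} are frozen layers of a pre-trained face recognition network and D_{r} are the four multi-scale discriminators, the losses above could be assembled as in this sketch; the weighting and reduction details are illustrative.

import torch
import torch.nn.functional as F

def reconstruction_loss(pred, target, feat_layers, lambda_l2, lambda_p):
    """Pixel-space L2 plus feature-space L2 over the layers m (Psi_m)."""
    loss = lambda_l2 * F.mse_loss(pred, target, reduction='sum')
    for m, psi in enumerate(feat_layers):       # frozen recognition layers
        fp, ft = psi(pred), psi(target)
        _, c, h, w = fp.shape
        loss = loss + lambda_p[m] * F.mse_loss(fp, ft, reduction='sum') / (c * h * w)
    return loss

def hinge_d_loss(d_real, d_fake):
    # hinge loss for one discriminator D_r
    return F.relu(1.0 - d_real).mean() + F.relu(1.0 + d_fake).mean()

def adv_g_loss(discriminators, fake, weights):
    # generator loss: negative multi-scale discriminator scores
    loss = 0.0
    for r, (d_r, lam) in enumerate(zip(discriminators, weights)):
        img = F.avg_pool2d(fake, 2 ** r) if r > 0 else fake  # factors 1,2,4,8
        loss = loss - lam * d_r(img).mean()
    return loss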
and further, the training network performs end-to-end training on other network structures except the first-level human face feature extraction module, the second-level human face feature extraction module, the third-level human face feature extraction module and the fourth-level human face feature extraction module by adopting an Adam optimization algorithm.
Further, the degraded face image to be restored is obtained by sequentially blurring, down-sampling, adding noise to, and JPEG-compressing a high-definition face image. The blurring adopts Gaussian blur and motion blur, with the Gaussian blur kernel standard deviation taken from {1:0.1:P}; the down-sampling adopts bicubic interpolation with sampling scale s ∈ {1:0.1:S}; the noise adopts Gaussian white noise with noise level n ∈ {0, 1:0.1:N}; the JPEG compression adopts quality parameter q ∈ {0, 10:0.1:Q}; where P ≥ 5, S ≥ 8, N ≥ 15 and Q ≥ 80;
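A sketch of this synthetic degradation pipeline follows, with the parameter ranges taken from the text; the OpenCV/PIL implementation, the uniform sampling of the {a:0.1:b} grids, and the omission of motion blur and of the no-noise/no-compression options are assumptions of the example.

import io
import numpy as np
import cv2
from PIL import Image

def degrade(hq, P=5, S=8, N=15, Q=80, rng=np.random):
    """Blur -> bicubic downsample -> Gaussian white noise -> JPEG."""
    sigma = rng.uniform(1.0, P)                     # blur kernel std dev
    img = cv2.GaussianBlur(hq, (0, 0), sigma)
    scale = rng.uniform(1.0, S)                     # bicubic downsampling
    h, w = img.shape[:2]
    img = cv2.resize(img, (max(1, int(w / scale)), max(1, int(h / scale))),
                     interpolation=cv2.INTER_CUBIC)
    level = rng.uniform(0.0, N)                     # white-noise level
    img = np.clip(img + rng.normal(0.0, level, img.shape), 0, 255)
    img = img.astype(np.uint8)
    buf = io.BytesIO()                              # JPEG compression
    Image.fromarray(img).save(buf, format='JPEG',
                              quality=int(rng.uniform(10, Q)))
    return np.array(Image.open(buf))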
the whole network is trained through the constructed low-quality degraded human face image to be restored and the corresponding high-definition human face image, and the obtained trained network is used for restoring the low-quality image.
The restoration system of the invention adds, to the existing convolutional-neural-network-based face image restoration systems, the construction of a multi-scale face part feature dictionary and the operation of transferring that feature dictionary onto the degraded image. First, the face parts are extracted from a large number of high-definition face images, and the features of each part at different scales are obtained by k-means clustering. For each part of a degraded image, there are then K high-definition part dictionary entries that can serve as guidance for enhancement. At each scale, a part-adaptive normalization operation first normalizes the dictionary features to handle the inconsistent distributions of the degraded features and the dictionary features. The whole normalized dictionary is then traversed to obtain the part feature most similar to the input as guidance. Because different degraded inputs need the dictionary to different degrees, a confidence is predicted from the residual between the matched feature and the input feature, so that the dictionary features are applied in a targeted manner. Finally, the multi-scale dictionary fusion features help the network learn detail features from coarse to fine.
The method is applicable to any face restoration scenario.
The face feature extraction module adopted by the invention includes, but is not limited to, the VggFace model.
The invention assists image enhancement in a convolutional neural network by constructing a high-quality dictionary, including obtaining multiple high-definition part features by k-means clustering.
The invention is implemented with convolution operations and multi-layer convolution operations in a neural network structure, but is not limited to neural networks.
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims. It should be understood that features described in different dependent claims and herein may be combined in ways different from those described in the original claims. It is also to be understood that features described in connection with individual embodiments may be used in other described embodiments.

Claims (10)

1. A face image restoration system based on a multi-scale face part feature dictionary is characterized by comprising:
face feature dictionary offline generation module (100): used for extracting high-definition face part features from each sample image in a high-definition face image data set, and obtaining a face part feature dictionary from the extraction results by k-means clustering;
face image restoration module (200): used for extracting features of the degraded face image to be restored, fusing the feature extraction results with the face part feature dictionary to obtain part-enhanced face features to be restored, and reconstructing the face features to be restored to obtain a guided restoration result image.
2. The system for restoring a human face image based on a multi-scale human face feature dictionary according to claim 1, wherein the human face feature dictionary obtained by the human face feature dictionary off-line generation module (100) comprises M scales, and M is an integer greater than or equal to 1.
3. The system for restoring a human face image based on a multi-scale human face feature dictionary according to claim 2, wherein when M is 4, the process of processing the sample image by using the VggFace model comprises:
sequentially performing convolution, activation, pooling, convolution, activation, convolution and activation operations on each sample image to obtain high-definition face part features with the scale of 1;
sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the high-definition face features with the scale of 1 to obtain high-definition face part features with the scale of 2;
sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the high-definition face features with the scale of 2 to obtain high-definition face part features with the scale of 3;
sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the high-definition face features with the scale of 3 to obtain high-definition face part features with the scale of 4;
the activation operation adopts a ReLU activation function, and the pooling operation adopts maximum pooling operation;
the first and second convolution operations each use 64 kernels of size 3 × 3 with stride 1;
the third and fourth convolution operations each use 128 kernels of size 3 × 3 with stride 1;
the fifth through ninth convolution operations each use 256 kernels of size 3 × 3 with stride 1;
the tenth through sixteenth convolution operations each use 512 kernels of size 3 × 3 with stride 1;
the high-definition face part feature with the scale of 1, the high-definition face part feature with the scale of 2, the high-definition face part feature with the scale of 3 and the high-definition face part feature with the scale of 4 are respectively processed through a dictionary generation module to obtain a face part feature dictionary with the corresponding scale;
the process of processing the input data by the dictionary generation module comprises the following steps:
acquiring high-definition face part features with the scale of 1, high-definition face part features with the scale of 2, high-definition face part features with the scale of 3 or high-definition face part features with the scale of 4;
then, carrying out region alignment operation: acquiring positions of a left eye, a right eye, a nose and a mouth of the high-definition face position characteristics by adopting a face key point detection algorithm; cutting the left eye, the right eye, the nose and the mouth from the corresponding high-definition human face part features in a RoIAlign mode according to the obtained positions of all parts to obtain the part features of the left eye, the right eye, the nose and the mouth as extraction results;
obtaining, by k-means clustering over all the part features in the extraction results, K1 cluster centers for the left eye, K2 cluster centers for the right eye, K3 cluster centers for the nose and K4 cluster centers for the mouth; the K1 cluster centers form the left-eye dictionary, the K2 cluster centers the right-eye dictionary, the K3 cluster centers the nose dictionary, and the K4 cluster centers the mouth dictionary; K1, K2, K3 and K4 are all greater than or equal to 1;
the face part feature dictionary with the scale of 1 is obtained corresponding to the high-definition face part feature with the scale of 1, the face part feature dictionary with the scale of 2 is obtained corresponding to the high-definition face part feature with the scale of 2, the face part feature dictionary with the scale of 3 is obtained corresponding to the high-definition face part feature with the scale of 3, and the face part feature dictionary with the scale of 4 is obtained corresponding to the high-definition face part feature with the scale of 4.
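A hedged sketch of the part cropping used above: face landmarks give one box per part, and torchvision's roi_align crops each part from the feature map of every scale; the per-scale stride, the box format and the 8 × 8 output size are assumptions of the example:

import torch
from torchvision.ops import roi_align

def crop_part_features(features, boxes, out_size=8):
    # features: dict scale -> (1, C_s, H_s, W_s) map; boxes: (4, 4) float
    # x1,y1,x2,y2 boxes (left eye, right eye, nose, mouth) at scale-1 resolution.
    rois = torch.cat([torch.zeros(boxes.size(0), 1), boxes], dim=1)  # prepend batch index
    parts = {}
    for s, fmap in features.items():
        stride = 2 ** (s - 1)                 # assumed: each scale halves the resolution
        parts[s] = roi_align(fmap, rois, output_size=(out_size, out_size),
                             spatial_scale=1.0 / stride, aligned=True)
    return parts                              # parts[s]: (4, C_s, out_size, out_size)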
4. The multi-scale face region feature dictionary based face image restoration system according to claim 3, wherein the face image restoration module (200) comprises:
a primary face feature extraction module: sequentially performing convolution, activation, pooling, convolution, activation and convolution operations on a degraded face image to be restored to obtain degraded face features with the scale of 1;
a secondary face feature extraction module: sequentially performing pooling, activation, convolution, activation and convolution operations on the degraded human face features with the scale of 1 to obtain degraded human face features with the scale of 2;
the three-level face feature extraction module: sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the degraded human face features with the scale of 2 to obtain degraded human face features with the scale of 3;
four-level face feature extraction module: sequentially performing activation, pooling, convolution, activation, convolution, activation and convolution operations on the degraded human face features with the scale of 3 to obtain degraded human face features with the scale of 4;
the activation operation adopts a ReLU activation function, and the pooling operation adopts maximum pooling operation;
the first and second convolution operations each use 64 kernels of size 3 × 3 with stride 1;
the third and fourth convolution operations each use 128 kernels of size 3 × 3 with stride 1;
the fifth through ninth convolution operations each use 256 kernels of size 3 × 3 with stride 1;
the tenth through sixteenth convolution operations each use 512 kernels of size 3 × 3 with stride 1;
the dictionary feature guidance enhancement module with the scale of 1: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 1 and the face part feature dictionary with the scale of 1 to obtain first-level face features to be restored after part enhancement;
scale 2 dictionary feature guidance enhancement module: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 2 and the face part feature dictionary with the scale of 2 to obtain second-level face features to be restored after part enhancement;
scale 3 dictionary feature guidance enhancement module: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 3 and the face part feature dictionary with the scale of 3 to obtain three-level face features to be restored after part enhancement;
the dictionary feature guidance enhancement module with the scale of 4: the face feature dictionary fusion method is used for fusing the degraded face features with the scale of 4 and the face part feature dictionary with the scale of 4 to obtain four-level face features to be restored after part enhancement;
a fourth-level reconstruction module: used for performing an affine transformation on the scale-4 degraded face features and the four-level face features to be restored, and inputting the transformation result into the network decoding features to obtain four-level reconstruction result features;
a third-level reconstruction module: used for performing an affine transformation on the four-level reconstruction result features and the three-level face features to be restored, and inputting the transformation result into the model network decoding features to obtain three-level reconstruction result features;
a secondary reconstruction module: used for performing an affine transformation on the three-level reconstruction result features and the second-level face features to be restored, and inputting the transformation result into the model network decoding features to obtain secondary reconstruction result features;
a primary reconstruction module: used for performing an affine transformation on the secondary reconstruction result features and the first-level face features to be restored, and inputting the transformation result into the model network decoding features to obtain a primary reconstruction result image;
an output module: used for outputting the primary reconstruction result image as the guided restoration result image.
5. The system for restoring a human face image based on a multi-scale human face part feature dictionary according to claim 4, wherein the processing procedures of the dictionary feature guidance enhancement module with the scale of 1, the dictionary feature guidance enhancement module with the scale of 2, the dictionary feature guidance enhancement module with the scale of 3 and the dictionary feature guidance enhancement module with the scale of 4 on the input data are the same; taking the dictionary feature guidance enhancement module with the scale of 1 as an example for explanation:
the scale-1 dictionary feature guidance enhancement module comprises:
face part feature extraction module: used for obtaining the part features of the left eye, right eye, nose and mouth from the scale-1 degraded face features according to the face key points;
the dictionary feature self-adaptive normalization module: performing self-adaptive normalization operation on the part characteristics of the left eye, the right eye, the nose and the mouth in the face part characteristic dictionary with the scale of 1 by combining the left eye dictionary, the right eye dictionary, the nose dictionary and the mouth dictionary to obtain normalized dictionary characteristics;
and traversing the dictionary module: traversing in the normalized dictionary features to obtain corresponding features closest to the features of the left eye, the right eye, the nose and the mouth as matching dictionary features;
a confidence prediction module: the system is used for predicting confidence according to residual errors between the part features of the left eye, the right eye, the nose and the mouth and the corresponding matched dictionary features to obtain self-adaptive fusion features of all parts;
a restoration module: used for replacing the adaptive fusion features of the respective parts back into the features of the degraded face image to be restored according to the face key points, obtaining the first-level face features to be restored.
6. The face image restoration system based on a multi-scale face part feature dictionary according to claim 5, wherein the dictionary feature adaptive normalization module obtains the normalized dictionary features by applying the adaptive normalization operation to the part features of the left eye, right eye, nose and mouth:

$$\hat{Dic}^s_{c,k} = \sigma(F^s_c)\,\frac{Dic^s_{c,k} - \mu(Dic^s_{c,k})}{\sigma(Dic^s_{c,k})} + \mu(F^s_c)$$

where $\hat{Dic}^s_{c,k}$ is the normalized dictionary feature; $Dic^s_{c,k}$ is the $k$-th cluster center of the $c$-th part feature at scale $s$ in the offline-constructed dictionary; $\sigma$ is the standard-deviation operation; $\mu$ is the mean operation; $F^s_c$ is the $c$-th part feature of the degraded face image to be restored at scale $s$; $c \in$ {left eye, right eye, nose, mouth}; and here $s = 1$;

the process by which the dictionary traversal module obtains the matching dictionary features comprises:
calculating, for the part features of the left eye, right eye, nose and mouth, the closest dictionary feature via the inner-product similarity

$$S^s_{c,k} = \left\langle F^s_c,\; \hat{Dic}^s_{c,k} \right\rangle, \qquad k^* = \arg\max_{k} S^s_{c,k}$$

where $S^s_{c,k}$ is the matching confidence of the $k$-th dictionary feature and $\langle \cdot,\cdot \rangle$ denotes the inner product; the normalized dictionary feature with the highest confidence, $\hat{Dic}^s_{c,k^*}$, is taken as the matching dictionary feature.
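The adaptive normalization and dictionary traversal of claims 5 and 6 can be sketched as follows; the tensor shapes and the eps stabilizer are assumptions of the example:

import torch

def match_dictionary(f, dic, eps=1e-5):
    # f: one degraded part feature (C, H, W); dic: K offline cluster
    # centers (K, C, H, W) for the same part and scale.
    mu_f, sigma_f = f.mean(), f.std()
    # Part AdaIN: give every dictionary atom the input feature's statistics.
    dic_norm = (dic - dic.mean(dim=(1, 2, 3), keepdim=True)) \
        / (dic.std(dim=(1, 2, 3), keepdim=True) + eps)
    dic_norm = dic_norm * sigma_f + mu_f
    # Traverse the dictionary: inner-product similarity against each atom.
    scores = (dic_norm * f.unsqueeze(0)).sum(dim=(1, 2, 3))
    k_star = torch.argmax(scores)
    return dic_norm[k_star], scores[k_star]   # matched atom and its confidence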
7. The face image restoration system based on a multi-scale face part feature dictionary according to claim 6, wherein the process by which the confidence prediction module obtains the adaptive fusion feature of each part is:

$$\hat{F}^s_c = F^s_c + \hat{Dic}^s_{c,k^*} \odot \mathcal{C}\!\left(F^s_c - \hat{Dic}^s_{c,k^*};\, \Theta_C\right)$$

where $\hat{F}^s_c$ is the adaptive fusion feature, $\hat{Dic}^s_{c,k^*}$ is the matched closest dictionary feature, $\mathcal{C}$ is the confidence prediction network and $\Theta_C$ its learnable parameters; the confidence prediction network comprises two layers of 3 × 3 convolution with stride 1;

the primary reconstruction module obtains a scale parameter $\alpha$ and a shift parameter $\beta$ from the secondary reconstruction result features and the first-level face features to be restored through two layers of 3 × 3 convolution with stride 1, and computes the primary reconstruction result image $SFT^s$ as

$$SFT^s = \alpha \odot F^s_{dec} + \beta$$

where $F^s_{dec}$ is the feature decoded by the network at scale $s$.
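An illustrative sketch of the confidence-weighted fusion and the spatial feature transform of this claim; the channel width c, the LeakyReLU activations and the Sigmoid on the confidence branch are assumptions not fixed by the claim:

import torch
import torch.nn as nn

class ConfidenceFusion(nn.Module):
    # Two 3 x 3 stride-1 convolutions predict a confidence map from the
    # residual between the input feature and the matched dictionary atom.
    def __init__(self, c):
        super().__init__()
        self.conf = nn.Sequential(nn.Conv2d(c, c, 3, 1, 1), nn.LeakyReLU(0.2),
                                  nn.Conv2d(c, c, 3, 1, 1), nn.Sigmoid())

    def forward(self, f, matched):
        return f + matched * self.conf(f - matched)

class SFT(nn.Module):
    # Two 3 x 3 stride-1 convolutions per branch predict the scale (alpha)
    # and shift (beta) that modulate the decoded feature.
    def __init__(self, c):
        super().__init__()
        self.alpha = nn.Sequential(nn.Conv2d(2 * c, c, 3, 1, 1), nn.LeakyReLU(0.2),
                                   nn.Conv2d(c, c, 3, 1, 1))
        self.beta = nn.Sequential(nn.Conv2d(2 * c, c, 3, 1, 1), nn.LeakyReLU(0.2),
                                  nn.Conv2d(c, c, 3, 1, 1))

    def forward(self, f_dec, f_restore):
        cond = torch.cat([f_dec, f_restore], dim=1)
        return self.alpha(cond) * f_dec + self.beta(cond)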
8. The face image restoration system based on a multi-scale face part feature dictionary according to claim 7, wherein the restoration model further comprises a constraint form of the training network; the training network constrains the learning of the whole network through a reconstruction loss $\ell_{rec}$, which comprises pixel-space and feature-space losses between the primary reconstruction result image and its corresponding non-degraded high-definition image:

$$\ell_{rec} = \lambda_{l2}\,\big\|\hat{I} - I_h\big\|_2^2 + \sum_{m} \frac{\lambda_{p,m}}{C_m H_m W_m}\,\big\|\Psi_m(\hat{I}) - \Psi_m(I_h)\big\|_2^2$$

where $\lambda_{l2}$ is the pixel-space loss weight, $\lambda_{p,m}$ is the feature-space loss weight, $\hat{I}$ is the primary reconstruction result image, $I_h$ is the corresponding non-degraded high-definition image, $C_m$, $H_m$ and $W_m$ are respectively the channel number, height and width of the $m$-th layer features, and $\Psi_m$ extracts the $m$-th layer convolution features of a pre-trained face recognition network;

the training network further comprises a multi-scale discriminator loss function:
the guided restoration result image is down-sampled by $r = \{1, 2, 4, 8\}$ to obtain 4 groups of images with different resolutions, and the loss is computed by four discriminator networks in a hinge-loss manner;
for learning of the discriminator networks, the loss $\ell_{adv,D}$ is defined as:

$$\ell_{adv,D} = -\sum_{r=1}^{R} \lambda_{a,r}\left( \mathbb{E}_{I_h^{\downarrow r} \sim P(I_h^{\downarrow r})}\!\left[\min\!\left(0,\, -1 + D_r(I_h^{\downarrow r})\right)\right] + \mathbb{E}_{I_d \sim P(I_d)}\!\left[\min\!\left(0,\, -1 - D_r(\hat{I}^{\downarrow r})\right)\right]\right)$$

where $R$ is the upper limit of the scale; $D_r$ is the discriminator at scale $r$; $I_h^{\downarrow r}$ is the image obtained by down-sampling the non-degraded high-definition image by a factor of $r$, with $\downarrow r$ denoting down-sampling by a factor of $r$; $\hat{I}^{\downarrow r}$ is the restoration result down-sampled by a factor of $r$; $\mathbb{E}$ denotes expectation; and $P(I_h^{\downarrow r})$ is the distribution of $I_h^{\downarrow r}$;

for learning of the generative network under the discriminator constraint, the loss $\ell_{adv,G}$ is defined as:

$$\ell_{adv,G} = -\sum_{r=1}^{R} \lambda_{a,r}\, \mathbb{E}_{I_d \sim P(I_d)}\!\left[ D_r\!\left( \mathcal{F}(I_d, L_d, Dic;\, \Theta)^{\downarrow r} \right) \right]$$

where $\lambda_{a,r}$ is the weight of the discriminator network at scale $r$; $L_d$ denotes the face key points; $Dic$ is the constructed face dictionary; $\Theta$ are the learnable model parameters; $I_d$ is the degraded face image to be restored; $P(I_d)$ is the distribution of $I_d$; and $\mathcal{F}$ is the restoration network.
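For illustration, the reconstruction loss of this claim can be sketched as follows; psi is assumed to return a list of feature maps from a pre-trained face recognition network, and the loss weights are placeholders:

import torch

def reconstruction_loss(restored, high_def, psi, lambda_l2=1.0, lambda_p=(1.0, 1.0, 1.0)):
    # Pixel-space L2 term.
    loss = lambda_l2 * torch.mean((restored - high_def) ** 2)
    # Feature-space terms; the element-wise mean implements the
    # 1 / (C_m * H_m * W_m) normalisation (averaged over the batch as well).
    for lam, fr, fh in zip(lambda_p, psi(restored), psi(high_def)):
        loss = loss + lam * torch.mean((fr - fh) ** 2)
    return loss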
9. The face image restoration system based on a multi-scale face part feature dictionary according to claim 8, wherein the training network uses the Adam optimization algorithm to perform end-to-end training of all network structures other than the first-level, second-level, third-level and fourth-level face feature extraction modules.
10. The face image restoration system based on a multi-scale face part feature dictionary according to claim 9, wherein the degraded face image to be restored is obtained by sequentially blurring, down-sampling, adding noise to, and JPEG-compressing a high-definition face image; the blurring step uses Gaussian blur and motion blur, with the Gaussian blur kernel standard deviation sampled from $\{1{:}0.1{:}P\}$; the down-sampling step uses bicubic down-sampling with sampling scale $s \in \{1{:}0.1{:}S\}$; the noise step adds Gaussian white noise with noise level $n \in \{0,\, 1{:}0.1{:}N\}$; the JPEG compression quality parameter is $q \in \{0,\, 10{:}0.1{:}Q\}$; where $P \geq 5$, $S \geq 8$, $N \geq 15$ and $Q \geq 80$;
the whole network is trained on the constructed low-quality degraded face images to be restored and their corresponding high-definition face images, and the trained network is used to restore low-quality images.
CN202010779169.7A 2020-08-05 2020-08-05 Face image restoration system based on multi-scale face part feature dictionary Pending CN111768354A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010779169.7A CN111768354A (en) 2020-08-05 2020-08-05 Face image restoration system based on multi-scale face part feature dictionary


Publications (1)

Publication Number Publication Date
CN111768354A true CN111768354A (en) 2020-10-13

Family

ID=72729707

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010779169.7A Pending CN111768354A (en) 2020-08-05 2020-08-05 Face image restoration system based on multi-scale face part feature dictionary

Country Status (1)

Country Link
CN (1) CN111768354A (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103996024A (en) * 2014-05-13 2014-08-20 南京信息工程大学 Bayesian estimation sparse representation face recognition method based on dictionary reconstruction
US20180268203A1 (en) * 2017-03-17 2018-09-20 Nec Laboratories America, Inc. Face recognition system for face recognition in unlabeled videos with domain adversarial learning and knowledge distillation
CN110288697A (en) * 2019-06-24 2019-09-27 天津大学 3D face representation and method for reconstructing based on multiple dimensioned figure convolutional neural networks
CN111260577A (en) * 2020-01-15 2020-06-09 哈尔滨工业大学 Face image restoration system based on multi-guide image and self-adaptive feature fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIAOMING LI et al.: "Blind Face Restoration via Deep Multi-scale Component Dictionaries", https://arxiv.org/pdf/2008.00418.pdf, 2 August 2020 (2020-08-02), pages 6-10 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113128624A (en) * 2021-05-11 2021-07-16 山东财经大学 Graph network face recovery method based on multi-scale dictionary
CN113554569A (en) * 2021-08-04 2021-10-26 哈尔滨工业大学 Face image restoration system based on double memory dictionaries
CN113554569B (en) * 2021-08-04 2022-03-08 哈尔滨工业大学 Face image restoration system based on double memory dictionaries
CN113688752A (en) * 2021-08-30 2021-11-23 厦门美图宜肤科技有限公司 Face pigment detection model training method, device, equipment and storage medium
WO2023029233A1 (en) * 2021-08-30 2023-03-09 厦门美图宜肤科技有限公司 Face pigment detection model training method and apparatus, device, and storage medium
CN113688752B (en) * 2021-08-30 2024-02-02 厦门美图宜肤科技有限公司 Training method, device, equipment and storage medium for face color detection model
CN114170108A (en) * 2021-12-14 2022-03-11 哈尔滨工业大学 Natural scene image blind restoration system based on human face degradation model migration
CN114170108B (en) * 2021-12-14 2024-04-12 哈尔滨工业大学 Natural scene image blind restoration system based on face degradation model migration
CN116452466A (en) * 2023-06-14 2023-07-18 荣耀终端有限公司 Image processing method, device, equipment and computer readable storage medium
CN116452466B (en) * 2023-06-14 2023-10-20 荣耀终端有限公司 Image processing method, device, equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN111768354A (en) Face image restoration system based on multi-scale face part feature dictionary
CN108520503B (en) Face defect image restoration method based on self-encoder and generation countermeasure network
CN108537754B (en) Face image restoration system based on deformation guide picture
CN110136062B (en) Super-resolution reconstruction method combining semantic segmentation
CN110378208B (en) Behavior identification method based on deep residual error network
CN112507617B (en) Training method of SRFlow super-resolution model and face recognition method
CN112291570B (en) Real-time video enhancement method based on lightweight deformable convolutional neural network
WO2024040973A1 (en) Multi-scale fused dehazing method based on stacked hourglass network
CN115393396B (en) Unmanned aerial vehicle target tracking method based on mask pre-training
CN111639564A (en) Video pedestrian re-identification method based on multi-attention heterogeneous network
CN109949217B (en) Video super-resolution reconstruction method based on residual learning and implicit motion compensation
CN103577813A (en) Information fusion method for heterogeneous iris recognition
CN112036260A (en) Expression recognition method and system for multi-scale sub-block aggregation in natural environment
CN112241939A (en) Light-weight rain removing method based on multi-scale and non-local
CN111260577B (en) Face image restoration system based on multi-guide image and self-adaptive feature fusion
CN112598604A (en) Blind face restoration method and system
CN113421186A (en) Apparatus and method for unsupervised video super-resolution using a generation countermeasure network
CN116630369A (en) Unmanned aerial vehicle target tracking method based on space-time memory network
US20230072445A1 (en) Self-supervised video representation learning by exploring spatiotemporal continuity
Zhao et al. Adaptive Dual-Stream Sparse Transformer Network for Salient Object Detection in Optical Remote Sensing Images
CN117333908A (en) Cross-modal pedestrian re-recognition method based on attitude feature alignment
Sun et al. Learning discrete representations from reference images for large scale factor image super-resolution
CN115147457B (en) Memory enhanced self-supervision tracking method and device based on space-time perception
CN113128624B (en) Graph network face recovery method based on multi-scale dictionary
CN113554569B (en) Face image restoration system based on double memory dictionaries

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201013