CN113379606A - Face super-resolution method based on pre-training generation model - Google Patents

Face super-resolution method based on pre-training generation model Download PDF

Info

Publication number
CN113379606A
CN113379606A
Authority
CN
China
Prior art keywords
convolution
resolution
training
module
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110934749.3A
Other languages
Chinese (zh)
Other versions
CN113379606B (en)
Inventor
孙立剑
王军
徐晓刚
曹卫强
朱岳江
虞舒敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202110934749.3A priority Critical patent/CN113379606B/en
Publication of CN113379606A publication Critical patent/CN113379606A/en
Application granted granted Critical
Publication of CN113379606B publication Critical patent/CN113379606B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046 Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of computer vision and image processing, and relates to a face super-resolution method based on a pre-trained generative model, which comprises the following steps. Step one: collect low-resolution images and input them to a feature extraction module E to extract feature information. Step two: input the feature information to an encoder to obtain an implicit matrix whose channel number is 8 times the input size; after feature decomposition by a separation module, the implicit matrix yields implicit vectors z, which are input, cascaded with the face label data, to a pre-trained generative model to obtain generated features. Step three: transmit the generated features to a decoder, fuse them with the feature information extracted by the feature extraction module E, and output the target high-resolution image after the decoding operation. The invention can upscale low-resolution faces at high magnification, obtaining up to a 64× super-resolution result that retains good fidelity, so that the enlarged image improves in both fidelity and texture realism.

Description

Face super-resolution method based on pre-training generation model
Technical Field
The invention belongs to the field of computer vision and image processing, and relates to a face super-resolution method based on a pre-trained generative model.
Background
Image resolution is directly related to image quality: higher resolution means more detailed information and hence greater application potential. In practice, however, many images suffer from low resolution, which hampers subsequent high-level visual processing. With the continuous development of computer vision technology, especially deep learning, image quality enhancement methods have multiplied, and super-resolution is an effective means of enhancing image quality that can significantly improve image resolution. Image super-resolution technology upsamples a low-resolution image to a high-resolution one by algorithmic means and has very important application value in fields such as security monitoring, medical detection, and criminal investigation. For example, in a security monitoring scene, factors such as the camera and the surrounding environment can blur the photographed target so that it cannot be recognized; super-resolution can reconstruct a clear picture and improve the resolution of the target face, thereby helping to quickly locate the target person. As a low-level image processing method, image super-resolution technology can therefore provide effective support for subsequent high-level processing such as target detection and recognition.
Many networks now exist for image super-resolution, with clear improvements across various scenes and objects, but networks dedicated to face super-resolution remain few. Many methods construct corresponding face data and then train with existing networks; although some progress has been made, the super-resolution effect on very low-resolution faces is still poor. Generative adversarial networks are now widely applied to super-resolution tasks with the aim of enriching the texture details of the restored image. However, common generative adversarial methods limit the ability to approximate the natural image manifold, or, because low-dimensional latent codes and constraints in image space are insufficient to guide the recovery process, these methods often produce artifacts and unnatural textures on low-fidelity faces.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a face super-resolution method based on a pre-trained generative model. A large pre-trained face generation model is introduced to provide rich facial detail features, which are embedded into an encoding-decoding module based on residual attention; the information extracted by the encoding module can guide the pre-trained generative model to enhance toward the features of the input face, and the decoder fuses the features of the pre-trained generative model with the original input features to further improve the quality of face image restoration. The specific technical scheme is as follows:
a face super-resolution method based on a pre-training generated model comprises the following steps:
step one, collecting and inputting low-resolution images to a feature extraction module
Figure 444755DEST_PATH_IMAGE001
Extracting characteristic information;
step two, inputting the characteristic information into a coder to obtain an implicit matrix with the channel number being 8 times of the input size, and obtaining the implicit matrix after characteristic decomposition of a separation moduleDeriving implicit vectors
Figure 84815DEST_PATH_IMAGE002
Respectively inputting the generated feature and the face label data into a pre-training generation model in a cascading mode to obtain a generation feature;
step three, transmitting the generated features to a decoder and fusing the feature extraction module
Figure 712105DEST_PATH_IMAGE001
And outputting the target high-resolution image after decoding operation of the extracted characteristic information.
Further, the feature extraction module E consists of 2 convolutional layers of size 3×3×64×1 and 6 serially connected residual channel attention units, where in 3×3×64×1 the 3×3 denotes the convolution kernel size, 64 the number of convolution kernels, and the final 1 the stride of the kernel; each residual channel attention unit comprises a residual unit and a channel attention unit, the residual unit extracting features of the input low-resolution image and feeding them to the channel attention unit to obtain a channel calibration coefficient vector β, which recalibrates the input features of the channel attention unit to give the output of the residual channel attention unit.
Further, the channel attention unit comprises a global average pooling layer, a ReLU nonlinear transformation layer, two convolution layers and a Sigmoid nonlinear transformation layer.
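As an illustration only, the following is a minimal PyTorch sketch of such a channel attention unit; PyTorch itself, the 64-channel default width, and the reduction ratio of 16 are assumptions not fixed by the text:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Pool -> conv -> ReLU -> conv -> Sigmoid, yielding the calibration vector beta."""
    def __init__(self, channels=64, reduction=16):   # reduction ratio is an assumption
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # global average pooling layer
        self.down = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)            # ReLU nonlinear transformation layer
        self.up = nn.Conv2d(channels // reduction, channels, kernel_size=1)
        self.gate = nn.Sigmoid()                     # Sigmoid nonlinear transformation layer

    def forward(self, x):
        beta = self.gate(self.up(self.relu(self.down(self.pool(x)))))
        return x * beta                              # recalibrate the input features with beta
```

Here beta has shape (N, C, 1, 1), so the multiplication rescales each feature channel independently.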
Further, the second step specifically comprises: the feature information is input to the 3 convolution modules adopted by the encoder, each convolution module containing two convolution layers (one of stride 1 and one of stride 2) and an activation layer. The first two convolution modules each comprise a 3×3×64×2 convolutional layer, an LReLU activation layer, and a 3×3×64×1 convolutional layer; the last convolution module comprises a 3×3×128×2 convolutional layer, an LReLU activation layer, and three (input size/8)×128×1 convolutional layers. A 3×128 implicit matrix is finally output and decomposed by feature decomposition into three implicit vectors z1, z2, z3, which are input, each cascaded with the face label data, to the residual modules of the pre-trained generative model to obtain the corresponding generated features G1, G2, G3.
Furthermore, the pre-trained generative model adopts a pre-trained BigGAN model; each residual module of the model contains an upsampling convolution and outputs the corresponding generated features G1, G2, G3.
Further, the third step specifically comprises: the decoder comprises decoding modules D1, D2, D3, and D4. The feature information extracted by the feature extraction module E is input to decoding module D1; the output of D1 and the generated feature G1 are input to decoding module D2; the output of D2 and the generated feature G2 are input to decoding module D3; the output of D3 and the generated feature G3 are input to decoding module D4, finally obtaining the face image at the target resolution.
Further, the first three decoding modules D1, D2, D3 in the decoder each comprise a 3×3×64×1 convolutional layer, an LReLU nonlinear transformation layer, two residual units, and a 2× upsampling sub-pixel convolutional layer; each residual unit comprises a first branch and a second branch, the first branch passing the input sequentially through a 3×3×64×1 convolution, an LReLU nonlinear transformation layer, and another 3×3×64×1 convolution, the second branch adding the input directly to the output of the first branch; the final decoding module D4 comprises a 3×3 convolutional layer with stride 1.
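For concreteness, a minimal PyTorch sketch of one such decoding module follows; the use of nn.PixelShuffle for the 2× sub-pixel convolution is standard, while the LReLU slope of 0.2 and the variable input channel width (to accommodate fused generated features) are assumptions:

```python
import torch.nn as nn

class ResidualUnit(nn.Module):
    """Two branches: conv -> LReLU -> conv, plus an identity shortcut."""
    def __init__(self, channels=64):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, stride=1, padding=1),
            nn.LeakyReLU(0.2, inplace=True),          # slope 0.2 is an assumption
            nn.Conv2d(channels, channels, 3, stride=1, padding=1),
        )

    def forward(self, x):
        return x + self.branch(x)

class DecodingModule(nn.Module):
    """Sketch of D1-D3: conv, LReLU, two residual units, 2x sub-pixel upsampling."""
    def __init__(self, in_ch=64, channels=64):        # in_ch grows when generated features are fused
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, channels, 3, stride=1, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            ResidualUnit(channels),
            ResidualUnit(channels),
            nn.Conv2d(channels, channels * 4, 3, padding=1),  # expand channels for the shuffle
            nn.PixelShuffle(2),                               # 2x upsampled sub-pixel convolution
        )

    def forward(self, x):
        return self.body(x)
```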
According to the invention, an encoding and decoding network based on a residual structure and channel attention convolution is used, with a pre-trained generative model embedded in the middle of the encoder-decoder structure. The encoding network produces implicit vectors that guide the pre-trained generator to generate rich high-frequency face information, providing a prior for texture and detail generation, so that low-resolution faces can be upscaled at high magnification. Through the setting of the number of residual modules in the pre-trained generative model and the number of upsampling convolutions in the decoder, up to a 64× super-resolution result can be obtained while retaining good fidelity, so the enlarged image improves in both fidelity and texture realism; the diversified loss functions and the introduced LPIPS evaluation index help enhance visual perceptual quality.
Drawings
FIG. 1 is an overall flow chart of a high-magnification face super-resolution method based on a pre-training generated model according to the present invention;
FIG. 2 is a structure diagram of the feature extraction module E of the present invention;
fig. 3 is a diagram of the residual channel attention unit structure of the present invention.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and examples.
The embodiment of the invention is explained taking 8× image super-resolution as an example. As shown in FIG. 1, a face super-resolution method based on a pre-trained generative model comprises the following steps:
step one, inputting a face image with the resolution of 16 multiplied by 16, and adopting a feature extraction module consisting of a plurality of residual error channel attention units
Figure 947564DEST_PATH_IMAGE001
Extracting feature information, including: contour features and texture features;
as shown in fig. 2 and 3, the feature extraction module
Figure 122194DEST_PATH_IMAGE001
The method comprises the steps that 2 convolution layers of 3 x 64 x 1 and 6 residual channel attention units connected in series are formed, the concerned convolution layers Conv are 3 x 64 x 1, 3 x 3 represents the size of a convolution kernel, 64 represents the number of the convolution kernels, the last bit represents the motion step of the convolution kernel, each residual channel attention unit comprises a residual unit and a channel attention unit, the features of an input image are extracted through the residual units, the features are input into the channel attention units to obtain channel calibration coefficient vectors beta, the channel calibration coefficient vectors beta and the input features of the channel attention units are recalibrated to serve as the output of the residual channel attention units, and the channel attention units comprise a global average pooling layer, a ReLU nonlinear transformation layer, two convolution layers and a Sigmoid nonlinear lamination transformation layer.
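Building on the ChannelAttention sketch above, the residual channel attention unit and the feature extraction module E could be assembled as follows; the placement of the two 3×3×64×1 convolutions before and after the six units, the conv → LReLU → conv layout of the residual unit, and the skip addition are assumptions:

```python
import torch.nn as nn

class ResidualChannelAttentionUnit(nn.Module):
    """Residual unit whose output is recalibrated by the ChannelAttention sketch above."""
    def __init__(self, channels=64):
        super().__init__()
        self.residual = nn.Sequential(                # assumed conv -> LReLU -> conv layout
            nn.Conv2d(channels, channels, 3, stride=1, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, channels, 3, stride=1, padding=1),
        )
        self.attention = ChannelAttention(channels)

    def forward(self, x):
        return x + self.attention(self.residual(x))   # recalibrated features plus skip path

class FeatureExtractor(nn.Module):
    """Module E: one 3x3x64x1 conv, six serial units, one 3x3x64x1 conv (ordering assumed)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=1, padding=1),
            *[ResidualChannelAttentionUnit(64) for _ in range(6)],
            nn.Conv2d(64, 64, 3, stride=1, padding=1),
        )

    def forward(self, x):                             # x: (N, 3, 16, 16) low-resolution face
        return self.body(x)
```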
Step two, the features extracted in step one are input to the encoder structure, which adopts 3 convolution modules, each containing two convolution layers (one of stride 1 and one of stride 2) and an activation layer; passing through the convolution modules in turn yields intermediate features and finally an implicit matrix Z whose channel number is 8 times the input size. The implicit matrix Z passes through a separation module to obtain implicit vectors z, which are input jointly with the face label data, in cascaded form, to the pre-trained generative model. The model uses the pre-trained high-resolution image generation model BigGAN to provide rich texture and detail prior knowledge for generating the high-resolution image; the implicit vectors required by the pre-trained generative model supply high-level information, and the face label data guides the pre-trained model to generate more high-resolution face textures and detail features.
the encoder structure adopts 3 convolution modules, specifically, the first two convolution modules comprise a 3 × 3 × 64 × 02 convolution layer, an LReLU active layer and a 3 × 13 × 264 × 1 convolution layer, the last convolution module comprises a 3 × 3 × 128 × 2 convolution layer, an LReLU active layer and three (input size/8) × 128 × 1 convolution layers, and finally a 3 × 128 implicit matrix is output, and the implicit matrix is subjected to feature decomposition to obtain three implicit vectors
Figure 256612DEST_PATH_IMAGE002
Respectively inputting the residual error signals into a residual error module in a pre-training generation model, and in addition, because the generation module adopts a pre-training BigGAN model, in order to ensure that the model develops towards a high-resolution face direction, a face label and an implicit vector are cascaded and are jointly input into the residual error module;
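A possible PyTorch reading of this encoder is sketched below for the 16×16 embodiment; interpreting the three (input size/8)×128×1 layers as parallel heads that produce the three rows of the 3×128 implicit matrix is an assumption, as are the LReLU slope and the stride-2-first ordering:

```python
import torch
import torch.nn as nn

def conv_module(in_ch, out_ch):
    """One encoder convolution module: stride-2 conv, LReLU, stride-1 conv."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
        nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1),
    )

class Encoder(nn.Module):
    """Maps 16x16 features from module E to three 128-dimensional implicit vectors."""
    def __init__(self):
        super().__init__()
        self.m1 = conv_module(64, 64)                 # 16x16 -> 8x8
        self.m2 = conv_module(64, 64)                 # 8x8 -> 4x4
        self.m3 = nn.Sequential(
            nn.Conv2d(64, 128, 3, stride=2, padding=1),   # 4x4 -> 2x2
            nn.LeakyReLU(0.2, inplace=True),
        )
        # three (input size/8) x 128 x 1 layers: kernel 16/8 = 2 collapses 2x2 maps to 1x1
        self.heads = nn.ModuleList(nn.Conv2d(128, 128, kernel_size=2) for _ in range(3))

    def forward(self, feats):
        h = self.m3(self.m2(self.m1(feats)))
        z1, z2, z3 = (head(h).flatten(1) for head in self.heads)  # the separation module
        return z1, z2, z3                             # rows of the 3x128 implicit matrix
```

Usage: `z1, z2, z3 = Encoder()(torch.randn(1, 64, 16, 16))` yields three latent vectors of length 128.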
saidThe structure of the pre-training generation model is the structure of a BigGAN model, and different from the BigGAN model, the method mainly utilizes the high-resolution detail generation capability of the BigGAN, each residual error module comprises an up-sampling convolution, and corresponding generation characteristics are output
Figure 405834DEST_PATH_IMAGE005
And input to the final decoder, i.e., the decoding module.
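The following schematic block illustrates how a latent vector cascaded with a face-label embedding could condition one upsampling residual module; this is a simplified stand-in for BigGAN's actual class-conditional residual blocks, not the BigGAN code itself, and the conditioning dimensions and injection mechanism are assumptions:

```python
import torch
import torch.nn as nn

class UpResBlock(nn.Module):
    """Schematic upsampling residual module conditioned on (latent, label)."""
    def __init__(self, channels=128, cond_dim=128 + 64):  # 128-d latent + 64-d label embedding (assumed)
        super().__init__()
        self.inject = nn.Linear(cond_dim, channels)   # project the condition onto the channels
        self.up = nn.Upsample(scale_factor=2, mode='nearest')
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x, z, label_emb):
        cond = torch.cat([z, label_emb], dim=1)       # cascade the latent with the face label
        h = x + self.inject(cond)[:, :, None, None]   # broadcast the condition over the map
        h = self.conv2(self.act(self.conv1(self.up(self.act(h)))))
        return self.up(x) + h                         # one generated feature G_i
```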
Step three, the output features G1, G2, G3 of the pre-trained generation module are transferred to the decoder and fused with the feature information extracted by the feature extraction module E; after the decoder's operations, the image at the target high resolution is finally output.
For the decoder, the feature information extracted by the feature extraction module E is input to decoding module D1; the output of D1 and the generated feature G1 are input to decoding module D2; the output of D2 and the generated feature G2 are input to decoding module D3; the output of D3 and the generated feature G3 are input to decoding module D4, finally obtaining the face image at the target resolution. The first three decoding modules D1, D2, D3 in the decoder each comprise a 3×3×64×1 convolutional layer, an LReLU nonlinear transformation layer, two residual units, and a 2× upsampling sub-pixel convolutional layer. Each residual unit contains two branches: one branch passes the input sequentially through a 3×3×64×1 convolution, an LReLU nonlinear transformation layer, and a 3×3×64×1 convolution; the other branch leaves the input unchanged and adds it directly to the output of the first branch. The final decoding module D4 comprises a 3×3 convolutional layer with stride 1.
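Putting the pieces together, the decoder wiring of step three might look like the sketch below, reusing the DecodingModule sketch given earlier; fusing by channel concatenation, the 3-channel RGB output of D4, and the spatial sizes of G1-G3 matching each stage are all assumptions:

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Step-three wiring: D1 takes the module-E features, D2-D4 fuse in G1-G3."""
    def __init__(self, g_channels=128):
        super().__init__()
        self.d1 = DecodingModule(in_ch=64)                      # 16x16 -> 32x32
        self.d2 = DecodingModule(in_ch=64 + g_channels)         # 32x32 -> 64x64
        self.d3 = DecodingModule(in_ch=64 + g_channels)         # 64x64 -> 128x128
        self.d4 = nn.Conv2d(64 + g_channels, 3, 3, padding=1)   # final 3x3 conv (RGB output assumed)

    def forward(self, feats, g1, g2, g3):
        h = self.d1(feats)
        h = self.d2(torch.cat([h, g1], dim=1))        # fuse generated feature G1
        h = self.d3(torch.cat([h, g2], dim=1))        # fuse generated feature G2
        return self.d4(torch.cat([h, g3], dim=1))     # face image at the target resolution
```

Three 2× sub-pixel stages take the 16×16 input to 128×128, consistent with the 8× embodiment.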
The networks involved in steps one to three together form the face image super-resolution network, whose training process is specifically as follows:
the loss function consists of three parts: content perception based on LPIPSKnown loss, pixel loss, i.e. smoothing
Figure 48800DEST_PATH_IMAGE017
And (4) loss, updating the network by using a back propagation strategy, wherein the pre-training generation model and the network parameters for calculating the content perception loss are fixed and do not participate in the training process. Using PSNR: peak Signal to Noise Ratio, Peak Signal-to-Noise Ratio, and SSIM: structural similarity index, LPIPS is used as an evaluation index of picture quality, a high-resolution face data set CelebA is selected, then the image is cut, only the face part is cut, the influence of hair hat clothes on the face is avoided, the cut picture is down-sampled to 128 x 128 by using the imresize in matlab as a high-resolution image and is down-sampled to 16 x 16 as a corresponding low-resolution image, the high-resolution face image and the low-resolution face image are used as a training set, a verification set and a test set, the whole training process is divided into two stages, the first stage adopts pixel loss for training, RMSprop is used for training, and the learning rate is set to be 0.0005; and in the second stage, content loss is introduced to carry out model fine adjustment, the learning rate is set to be 0.0001, the network is updated by using a back propagation strategy, and if the network is converged, the trained network model is stored and used as a final reasoning. Using this generator network as the final inference, 100 additional pictures of low resolution were selected as the test set. In addition, training and testing were performed on the hellen data set in the same manner, with the test results shown in table 1:
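A compact sketch of this two-stage objective is given below; it assumes the lpips pip package for the LPIPS content perceptual loss and a `model` variable holding the full super-resolution network of FIG. 1, and the optimizer would be re-created with lr=1e-4 when entering stage two:

```python
import torch
import lpips  # LPIPS implementation released by its authors (pip install lpips)

pixel_loss = torch.nn.SmoothL1Loss()            # the smooth L1 pixel loss
perceptual = lpips.LPIPS(net='alex').eval()     # content-loss network: fixed, not trained
for p in perceptual.parameters():
    p.requires_grad = False

optimizer = torch.optim.RMSprop(model.parameters(), lr=5e-4)  # stage one: lr 0.0005

def training_step(lr_img, hr_img, stage):
    """One update; image tensors are assumed scaled to [-1, 1] as LPIPS expects."""
    sr = model(lr_img)
    loss = pixel_loss(sr, hr_img)
    if stage == 2:                               # stage two: fine-tune with the content loss
        loss = loss + perceptual(sr, hr_img).mean()
    optimizer.zero_grad()
    loss.backward()                              # back-propagation strategy
    optimizer.step()
    return loss.item()
```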
TABLE 1. Performance comparison of the present invention with other methods on different datasets at 8× magnification (PSNR/SSIM/LPIPS)
Table 1 reports tests on both Helen and CelebA. Compared with existing super-resolution methods, including bicubic upsampling, ESRGAN, RCAN, RDN, and FSRNet, all trained and tested on the same datasets, the present invention achieves higher average PSNR and SSIM over the 100 test pictures, and additionally the lowest LPIPS, maintaining the best visual perceptual quality; the overall picture sharpness is also the best.
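For reference, the three indices could be computed per test picture roughly as follows, assuming scikit-image ≥ 0.19 (for the channel_axis argument) and the lpips package; the uint8 input convention is an assumption:

```python
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_fn = lpips.LPIPS(net='alex')

def evaluate_pair(sr, hr):
    """sr, hr: HxWx3 uint8 numpy arrays of the super-resolved and ground-truth images."""
    psnr = peak_signal_noise_ratio(hr, sr)                       # higher is better
    ssim = structural_similarity(hr, sr, channel_axis=2)         # higher is better
    to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1)[None].float() / 127.5 - 1.0
    lp = lpips_fn(to_tensor(sr), to_tensor(hr)).item()           # lower is better
    return psnr, ssim, lp
```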

Claims (7)

1. A face super-resolution method based on a pre-trained generative model, characterized by comprising the following steps:
step one, collecting low-resolution images and inputting them to a feature extraction module E to extract feature information;
step two, inputting the feature information to an encoder to obtain an implicit matrix whose channel number is 8 times the input size, the implicit matrix yielding implicit vectors z after feature decomposition by a separation module, the implicit vectors being input, each cascaded with the face label data, to a pre-trained generative model to obtain generated features;
step three, transmitting the generated features to a decoder, fusing them with the feature information extracted by the feature extraction module E, and outputting the target high-resolution image after the decoding operation.
2. The face super-resolution method based on a pre-trained generative model according to claim 1, wherein the feature extraction module E consists of 2 convolutional layers of size 3×3×64×1 and 6 serially connected residual channel attention units, where 3×3 denotes the convolution kernel size, 64 the number of convolution kernels, and the final 1 the stride of the kernel; each residual channel attention unit comprises a residual unit and a channel attention unit, the residual unit extracting features of the input low-resolution image and feeding them to the channel attention unit, which obtains a channel calibration coefficient vector β; the channel calibration coefficient vector β recalibrates the input features of the channel attention unit, and the result is the output of the residual channel attention unit.
3. The method of claim 2, wherein the channel attention unit comprises a global average pooling layer, a ReLU nonlinear transformation layer, two convolution layers and a Sigmoid nonlinear transformation layer.
4. The face super-resolution method based on a pre-trained generative model according to claim 1, wherein the second step specifically comprises: the feature information is input to the 3 convolution modules adopted by the encoder, each convolution module containing two convolution layers (one of stride 1 and one of stride 2) and an activation layer; the first two convolution modules each comprise a 3×3×64×2 convolutional layer, an LReLU activation layer, and a 3×3×64×1 convolutional layer, and the last convolution module comprises a 3×3×128×2 convolutional layer, an LReLU activation layer, and three (input size/8)×128×1 convolutional layers; a 3×128 implicit matrix is finally output and decomposed by feature decomposition into three implicit vectors z1, z2, z3, which are input, each cascaded with the face label data, to the residual modules of the pre-trained generative model to obtain the corresponding generated features G1, G2, G3.
5. The face super-resolution method based on a pre-trained generative model according to claim 1, wherein the pre-trained generative model is a pre-trained BigGAN model, each residual module of which contains an upsampling convolution and outputs the corresponding generated features G1, G2, G3.
6. The face super-resolution method based on a pre-trained generative model according to claim 5, wherein the third step specifically comprises: the decoder comprises decoding modules D1, D2, D3, and D4; the feature information extracted by the feature extraction module E is input to decoding module D1; the output of D1 and the generated feature G1 are input to decoding module D2; the output of D2 and the generated feature G2 are input to decoding module D3; the output of D3 and the generated feature G3 are input to decoding module D4, finally obtaining the face image at the target resolution.
7. The method as claimed in claim 6, wherein the first three decoding modules D1, D2, D3 in the decoder each comprise a 3×3×64×1 convolutional layer, an LReLU nonlinear transformation layer, two residual units, and a 2× upsampling sub-pixel convolutional layer; each residual unit comprises a first branch and a second branch, the first branch passing the input sequentially through a 3×3×64×1 convolution, an LReLU nonlinear transformation layer, and another 3×3×64×1 convolution, the second branch adding the input directly to the output of the first branch; the final decoding module D4 comprises a 3×3 convolutional layer with stride 1.
CN202110934749.3A 2021-08-16 2021-08-16 Face super-resolution method based on pre-training generation model Active CN113379606B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110934749.3A CN113379606B (en) 2021-08-16 2021-08-16 Face super-resolution method based on pre-training generation model


Publications (2)

Publication Number Publication Date
CN113379606A true CN113379606A (en) 2021-09-10
CN113379606B CN113379606B (en) 2021-12-07

Family

ID=77577259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110934749.3A Active CN113379606B (en) 2021-08-16 2021-08-16 Face super-resolution method based on pre-training generation model

Country Status (1)

Country Link
CN (1) CN113379606B (en)


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107958246A (en) * 2018-01-17 2018-04-24 深圳市唯特视科技有限公司 A kind of image alignment method based on new end-to-end human face super-resolution network
CN109255831A (en) * 2018-09-21 2019-01-22 南京大学 The method that single-view face three-dimensional reconstruction and texture based on multi-task learning generate
WO2020099957A1 (en) * 2018-11-12 2020-05-22 Sony Corporation Semantic segmentation with soft cross-entropy loss
CN110148085A (en) * 2019-04-22 2019-08-20 智慧眼科技股份有限公司 Face image super-resolution reconstruction method and computer-readable storage medium
CN110288537A (en) * 2019-05-20 2019-09-27 湖南大学 Facial image complementing method based on the depth production confrontation network from attention
CN110378979A (en) * 2019-07-04 2019-10-25 公安部第三研究所 The method automatically generated based on the generation confrontation customized high-resolution human face picture of network implementations
US20210118099A1 (en) * 2019-10-18 2021-04-22 Retrace Labs Generative Adversarial Network for Dental Image Super-Resolution, Image Sharpening, and Denoising
CN110889332A (en) * 2019-10-30 2020-03-17 中国科学院自动化研究所南京人工智能芯片创新研究院 Lie detection method based on micro expression in interview
CN111080527A (en) * 2019-12-20 2020-04-28 北京金山云网络技术有限公司 Image super-resolution method and device, electronic equipment and storage medium
CN112488923A (en) * 2020-12-10 2021-03-12 Oppo广东移动通信有限公司 Image super-resolution reconstruction method and device, storage medium and electronic equipment
CN112507997A (en) * 2021-02-08 2021-03-16 之江实验室 Face super-resolution system based on multi-scale convolution and receptive field feature fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KELVIN C.K. CHAN ET AL: "GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution", arXiv.org *
HUANG HUAIBO: "Face Image Synthesis and Analysis Based on Generative Models", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114610861A (en) * 2022-05-11 2022-06-10 之江实验室 End-to-end dialogue method for integrating knowledge and emotion based on variational self-encoder
CN114610861B (en) * 2022-05-11 2022-08-26 之江实验室 End-to-end dialogue method integrating knowledge and emotion based on variational self-encoder
CN115311720A (en) * 2022-08-11 2022-11-08 山东省人工智能研究院 Deepfake generation method based on Transformer
CN115311720B (en) * 2022-08-11 2023-06-06 山东省人工智能研究院 Method for generating deepfake based on Transformer

Also Published As

Publication number Publication date
CN113379606B (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN112507997B (en) Face super-resolution system based on multi-scale convolution and receptive field feature fusion
WO2022267641A1 (en) Image defogging method and system based on cyclic generative adversarial network
CN113284051B (en) Face super-resolution method based on frequency decomposition multi-attention machine system
CN113191953B (en) Transformer-based face image super-resolution method
CN109685716B (en) Image super-resolution reconstruction method for generating countermeasure network based on Gaussian coding feedback
Zhao et al. Invertible image decolorization
CN109636721B (en) Video super-resolution method based on countermeasure learning and attention mechanism
CN109035267B (en) Image target matting method based on deep learning
CN110363068B (en) High-resolution pedestrian image generation method based on multiscale circulation generation type countermeasure network
CN113379606B (en) Face super-resolution method based on pre-training generation model
Luo et al. Lattice network for lightweight image restoration
CN112381716B (en) Image enhancement method based on generation type countermeasure network
CN109949217B (en) Video super-resolution reconstruction method based on residual learning and implicit motion compensation
CN112949636B (en) License plate super-resolution recognition method, system and computer readable medium
CN110047038B (en) Single-image super-resolution reconstruction method based on hierarchical progressive network
CN115713462A (en) Super-resolution model training method, image recognition method, device and equipment
CN117408924A (en) Low-light image enhancement method based on multiple semantic feature fusion network
CN116797541A (en) Transformer-based lung CT image super-resolution reconstruction method
CN116433516A (en) Low-illumination image denoising and enhancing method based on attention mechanism
CN113205005B (en) Low-illumination low-resolution face image reconstruction method
CN111861877A (en) Method and apparatus for video hyper-resolution
CN116266336A (en) Video super-resolution reconstruction method, device, computing equipment and storage medium
CN115018733A (en) High dynamic range imaging and ghost image removing method based on generation countermeasure network
CN114331894A (en) Face image restoration method based on potential feature reconstruction and mask perception
CN115311149A (en) Image denoising method, model, computer-readable storage medium and terminal device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant