CN113139915A - Portrait restoration model training method and device and electronic equipment - Google Patents

Portrait restoration model training method and device and electronic equipment

Info

Publication number
CN113139915A
Authority
CN
China
Prior art keywords
training
data set
portrait
local
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110395414.9A
Other languages
Chinese (zh)
Inventor
杨健榜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202110395414.9A
Publication of CN113139915A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/77 Retouching; Inpainting; Scratch removal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application relates to a portrait restoration model training method, a portrait restoration method and device, an electronic device and a computer readable storage medium, comprising: acquiring a training data set, wherein the training data set comprises a low-quality picture data set and a corresponding label high-quality picture data set; inputting the low-quality image data set into a portrait restoration model to be trained to obtain a training generation image data set; inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result; inputting a part training image corresponding to the training generated picture data set and a part label image in a corresponding label high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result; and training the portrait restoration model, the global discriminator and the local discriminator based on the global discrimination result, the local discrimination result and the preset target loss function to obtain the trained portrait restoration model, wherein the restored image quality is high.

Description

Portrait restoration model training method and device and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a portrait restoration model training method, a portrait restoration method and apparatus, an electronic device, and a computer-readable storage medium.
Background
Old photos commonly suffer from problems such as low resolution, heavy noise and compression distortion, generally caused by factors such as the limitations of the shooting device, the age of the photo, and compression during network transmission; these problems are collectively classified as the low image quality of old photos. Restoring a portrait photo that is low-resolution, noisy or blurred into a high-definition image is therefore an important technology.
Existing portrait restoration algorithms have unsatisfactory restoration effects, and the quality of the restored images is low.
Disclosure of Invention
The embodiment of the application provides a portrait restoration model training method, a portrait restoration method and device, electronic equipment and a computer-readable storage medium.
A training method of a portrait restoration model comprises the following steps:
acquiring a training data set, wherein the training data set comprises a low-quality picture data set and a corresponding label high-quality picture data set;
inputting the low-quality image data set into a portrait restoration model to be trained to obtain an output training generation image data set;
inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result;
acquiring a part training image corresponding to each training generation picture in a training generation picture data set;
inputting the part training image and the part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result;
and training the portrait restoration model, the global discriminator and the local discriminator based on the global discrimination result, the local discrimination result and a preset target loss function until the training is finished to obtain a trained portrait restoration model, wherein the trained portrait restoration model is used for restoring the portrait of the low-quality picture.
A training apparatus for a portrait restoration model, comprising:
an acquisition module, configured to acquire a training data set, where the training data set includes a low-quality picture data set and a corresponding tagged high-quality picture data set;
the image generation module is used for inputting the low-quality image data set into a portrait restoration model to be trained to obtain an output training generation image data set;
the global discrimination module is used for inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result;
the local judgment module is used for acquiring a part training image corresponding to each training generation picture in the training generation picture data set, inputting the part training image and a part label image in the corresponding label high-quality picture data set into a local judger corresponding to the human face part, and obtaining a local judgment result;
and the training module is used for training the portrait restoration model, the global discriminator and the local discriminator based on the global discrimination result, the local discrimination result and a preset target loss function until the training is finished to obtain a trained portrait restoration model, and the trained portrait restoration model is used for restoring the portrait of the low-quality picture.
An electronic device comprising a memory and a processor, the memory having stored therein a computer program that, when executed by the processor, causes the processor to perform the steps of:
acquiring a training data set, wherein the training data set comprises a low-quality picture data set and a corresponding label high-quality picture data set;
inputting the low-quality image data set into a portrait restoration model to be trained to obtain an output training generation image data set;
inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result;
acquiring a part training image corresponding to each training generation picture in a training generation picture data set;
inputting the part training image and the part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result;
and training the portrait restoration model, the global discriminator and the local discriminator based on the global discrimination result, the local discrimination result and a preset target loss function until the training is finished to obtain a trained portrait restoration model, wherein the trained portrait restoration model is used for restoring the portrait of the low-quality picture.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring a training data set, wherein the training data set comprises a low-quality picture data set and a corresponding label high-quality picture data set;
inputting the low-quality image data set into a portrait restoration model to be trained to obtain an output training generation image data set;
inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result;
acquiring a part training image corresponding to each training generation picture in a training generation picture data set;
inputting the part training image and the part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result;
and training the portrait restoration model, the global discriminator and the local discriminator based on the global discrimination result, the local discrimination result and a preset target loss function until the training is finished to obtain a trained portrait restoration model, wherein the trained portrait restoration model is used for restoring the portrait of the low-quality picture.
According to the portrait restoration model training method, the portrait restoration model training device, the electronic equipment and the computer-readable storage medium, a training data set is obtained, and the training data set comprises a low-quality picture data set and a corresponding label high-quality picture data set; inputting the low-quality image data set into a portrait restoration model to be trained to obtain an output training generation image data set; inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result; acquiring a part training image corresponding to each training generation picture in a training generation picture data set; inputting the part training image and the part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result; training a portrait restoration model, a global discriminator and a local discriminator based on a global discrimination result, a local discrimination result and a preset target loss function until the training is completed to obtain a trained portrait restoration model, wherein the trained portrait restoration model is used for restoring the portrait of the low-quality picture, and the local discriminator corresponding to the face part is used for enhancing the detail information of the key part of the restored face, so that the restored image has rich and vivid face detail information, a clear image and high restored image quality.
A portrait restoration method, comprising:
acquiring a portrait picture to be restored and inputting the portrait picture to be restored into a trained portrait restoration model, wherein the trained portrait restoration model is obtained by inputting a low-quality picture data set into a portrait restoration model to be trained to obtain a training generated picture data set, and training the portrait restoration model, a global discriminator and a local discriminator according to a global discrimination result, a local discrimination result and a preset target loss function until the training is finished; the global discrimination result is obtained by inputting the training generated picture data set and the corresponding label high-quality picture data set into the global discriminator, and the local discrimination result is obtained by inputting a part training image corresponding to each training generated picture in the training generated picture data set and a part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part;
and the portrait restoration model restores the face area in the portrait picture to be restored and outputs a corresponding portrait picture with high image quality.
A portrait restoration apparatus comprising:
an acquisition module, configured to acquire a training data set, where the training data set includes a low-quality picture data set and a corresponding tagged high-quality picture data set;
the image generation module is used for inputting the low-quality image data set into a portrait restoration model to be trained to obtain an output training generation image data set;
the global discrimination module is used for inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result;
the local judgment module is used for acquiring a part training image corresponding to each training generation picture in the training generation picture data set, inputting the part training image and a part label image in the corresponding label high-quality picture data set into a local judger corresponding to the human face part, and obtaining a local judgment result;
and the training module is used for training the portrait restoration model, the global discriminator and the local discriminator based on the global discrimination result, the local discrimination result and a preset target loss function until the training is finished to obtain a trained portrait restoration model, and the trained portrait restoration model is used for restoring the portrait of the low-quality picture.
An electronic device comprising a memory and a processor, the memory having stored therein a computer program that, when executed by the processor, causes the processor to perform the steps of:
acquiring a portrait picture to be restored and inputting the portrait picture to be restored into a trained portrait restoration model, wherein the trained portrait restoration model is obtained by inputting a low-quality picture data set into a portrait restoration model to be trained to obtain a training generated picture data set, and training the portrait restoration model, a global discriminator and a local discriminator according to a global discrimination result, a local discrimination result and a preset target loss function until the training is finished; the global discrimination result is obtained by inputting the training generated picture data set and the corresponding label high-quality picture data set into the global discriminator, and the local discrimination result is obtained by inputting a part training image corresponding to each training generated picture in the training generated picture data set and a part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part;
and the portrait restoration model restores the face area in the portrait picture to be restored and outputs a corresponding portrait picture with high image quality.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, causes the processor to perform the steps of:
acquiring a portrait picture to be restored and inputting the portrait picture to be restored into a trained portrait restoration model, wherein the trained portrait restoration model is obtained by inputting a low-quality picture data set into a portrait restoration model to be trained to obtain a training generated picture data set, and training the portrait restoration model, a global discriminator and a local discriminator according to a global discrimination result, a local discrimination result and a preset target loss function until the training is finished; the global discrimination result is obtained by inputting the training generated picture data set and the corresponding label high-quality picture data set into the global discriminator, and the local discrimination result is obtained by inputting a part training image corresponding to each training generated picture in the training generated picture data set and a part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part;
and the portrait restoration model restores the face area in the portrait picture to be restored and outputs a corresponding portrait picture with high image quality.
According to the portrait restoration method and apparatus, the electronic device and the computer-readable storage medium, the face region in the portrait picture to be restored is restored through the trained portrait restoration model, and a corresponding portrait picture with high image quality is output. Because the trained portrait restoration model enhances the detail information of key parts of the restored face through the local discriminators, adding the locally generated countermeasure losses helps generate vivid details at the corresponding portrait parts. During training, the network parameters of the portrait restoration model are adjusted while the network parameters of the global discriminator and the local discriminators are adjusted synchronously for countermeasure learning, so that the trained model restores images with rich and vivid face detail information, clearer pictures and high restored-image quality.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained from them by those skilled in the art without creative effort.
FIG. 1 is a diagram of an application environment of a portrait restoration model training method and a portrait restoration method in one embodiment;
FIG. 2 is a schematic flow chart illustrating a method for training a portrait restoration model according to an embodiment;
FIG. 3 is a schematic flow chart illustrating the process of obtaining a part training image corresponding to each training generated picture in a training generated picture dataset according to an embodiment;
FIG. 4 is a schematic diagram of a segmentation spectrum obtained in one embodiment;
FIG. 5 is a schematic diagram of a process for training network parameters of a portrait restoration model in one embodiment;
FIG. 6 is a schematic diagram of a process for training the global and local discriminators in an embodiment;
FIG. 7 is a data flow diagram of the portrait restoration model in one embodiment;
FIG. 8 is a schematic diagram of a generator in one embodiment;
FIG. 9 is a diagram illustrating the structure of a global discriminator in one embodiment;
FIG. 10 is a diagram illustrating the structure of a local discriminator in one embodiment;
FIG. 11 is a flowchart illustrating a portrait restoration method in one embodiment;
FIG. 12 is a schematic diagram illustrating a partial low-quality image restoration result according to an embodiment;
FIG. 13 is a block diagram showing the construction of an apparatus for training a portrait restoration model according to an embodiment;
FIG. 14 is a block diagram showing the construction of a portrait restoration apparatus according to one embodiment;
fig. 15 is a block diagram showing an internal configuration of an electronic device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Fig. 1 is an application environment diagram of the portrait restoration model training method and the portrait restoration method in one embodiment. As shown in fig. 1, the application environment includes a terminal 110 and a server 120, where the terminal 110 and the server 120 may each independently complete the portrait restoration model training method and the portrait restoration method. The terminal 110 or the server 120 obtains a training data set, where the training data set includes a low-quality picture data set and a corresponding tagged high-quality picture data set; inputs the low-quality picture data set into a portrait restoration model to be trained to obtain an output training generated picture data set; inputs the training generated picture data set and the corresponding tagged high-quality picture data set into a global discriminator to obtain a global discrimination result; acquires a part training image corresponding to each training generated picture in the training generated picture data set; inputs the part training image and the part label image in the corresponding tagged high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result; and trains the portrait restoration model, the global discriminator and the local discriminator based on the global discrimination result, the local discrimination result and the preset target loss function until the training is finished to obtain a trained portrait restoration model, where the trained portrait restoration model is used for restoring the portrait of a low-quality picture. The terminal 110 may be a terminal device including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a vehicle-mounted computer, a wearable device, and the like. The server 120 may be a single server or a server cluster. The server 120 may also obtain a portrait picture to be restored from the terminal 110, and return the restored target image to the terminal 110 for display.
FIG. 2 is a flowchart of a method for training a portrait restoration model in one embodiment. The method for training the portrait restoration model shown in fig. 2 can be applied to the terminal 110 or the server 120, and includes:
step 202, a training data set is obtained, wherein the training data set comprises a low-quality picture data set and a corresponding label high-quality picture data set.
The tag high-quality picture data set comprises a plurality of high-quality pictures containing human faces, corresponding low-quality pictures can be obtained by carrying out image quality degradation processing on the high-quality pictures, and each low-quality picture forms a low-quality picture data set. The image degradation processing mainly involves operations such as noise adding and blurring. In one embodiment, a high-quality portrait picture is obtained, a face region in the high-quality portrait picture is extracted to obtain a tagged high-quality picture, different tagged high-quality pictures form a tagged high-quality picture data set, it is ensured that both a low-quality picture and a corresponding high-quality picture in a training data set only include image content corresponding to the face region, and effectiveness of the training data set on subsequent model training is improved.
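As a minimal sketch of this degradation step, assuming OpenCV and NumPy are used (the blur kernel, noise level and JPEG quality below are illustrative values, not parameters specified by this application):

    import cv2
    import numpy as np

    def degrade(high_quality: np.ndarray) -> np.ndarray:
        """Produce a low-quality training input from a label high-quality
        picture by blurring, adding noise and re-compressing. All parameter
        values are illustrative assumptions."""
        # Gaussian blur simulates defocus / device limitations.
        low = cv2.GaussianBlur(high_quality, (7, 7), 2.0)
        # Additive Gaussian noise simulates sensor noise.
        noise = np.random.normal(0.0, 10.0, low.shape)
        low = np.clip(low.astype(np.float32) + noise, 0, 255).astype(np.uint8)
        # JPEG re-encoding simulates compression distortion.
        _, buf = cv2.imencode(".jpg", low, [int(cv2.IMWRITE_JPEG_QUALITY), 30])
        return cv2.imdecode(buf, cv2.IMREAD_COLOR)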
Specifically, the low-quality pictures in the low-quality picture data set are input as training data into the portrait restoration model to be trained, and the high-quality pictures in the high-quality picture data set are provided to the global discriminator, which discriminates the pictures reconstructed by the portrait restoration model, i.e. the generator, from the corresponding high-quality pictures in the high-quality picture data set. Meanwhile, part label images corresponding to the parts of the human face are extracted from the high-quality pictures and provided to the local discriminators, which discriminate the part training images of the face parts in the pictures reconstructed by the portrait restoration model from the part label images of the corresponding high-quality pictures. The portrait restoration model is optimized in combination with the target loss function, and the trained portrait restoration model is finally obtained.
And step 204, inputting the low-quality image data set into a portrait restoration model to be trained to obtain an output training generation image data set.
Specifically, the portrait restoration model is the generator in a generation countermeasure network (GAN) and includes a coding network and a decoding network, where the coding network extracts image features and the decoding network restores the image. The encoder module may use backbones such as, but not limited to, MobileNet, ResNet and VGG to implement encoding; the encoding module, i.e. the feature extraction module, performs feature extraction, and the decoding module processes the image features to obtain a training generated picture. The portrait restoration model can be built with deep learning algorithms such as CNN (Convolutional Neural Networks), the U-Net algorithm, FCN (Fully Convolutional Networks), and the like. The low-quality picture data set is input into the portrait restoration model to be trained; through the processing of the coding network and the decoding network, the model restores the portrait to obtain a training generated picture corresponding to each low-quality picture, forming the training generated picture data set. The portrait restoration quality of the training generated pictures is continuously improved by continuously adjusting the network parameters of the portrait restoration model.
The counterstudy is to learn by making two machine learning models game with each other, so as to obtain the expected machine learning model. The portrait restoration model is confrontational-learned with a global discriminator and a local discriminator, and the goal of the portrait restoration model is to obtain expected output according to input. The goal of the global and local discriminators is to distinguish the output of the portrait restoration model from the real image as much as possible. The input of the global arbiter and the local arbiter comprises the output of the portrait restoration model and the tag high-quality picture dataset. The two networks resist against each other to learn and continuously adjust parameters, and the final purpose is that the portrait restoration model cheats the global discriminator and the local discriminator as much as possible, so that the global discriminator and the local discriminator cannot judge whether the output result of the portrait restoration model is real or not.
And step 206, inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result.
Specifically, the global discriminator is configured to discriminate authenticity of a global picture, discriminate each training generated picture in the training generated picture data set from a corresponding tagged high-quality picture in the tagged high-quality picture data set, obtain a global discrimination result, and output a probability value indicating that the input picture is true, where the probability value is between 0 and 1. The parameters of the global discriminator are adjusted in such a direction that the loss value of the global discriminator becomes smaller, so that the discrimination capability of the global discriminator becomes stronger.
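A plausible PyTorch sketch of such a global discriminator is given below; the depth and channel widths are assumptions, chosen so that five stride-2 convolutions yield the 32-fold down-sampling mentioned for fig. 9 in the specific embodiment:

    import torch
    import torch.nn as nn

    class GlobalDiscriminator(nn.Module):
        """Down-samples the whole picture and outputs the probability
        (between 0 and 1) that the input is a real high-quality picture."""
        def __init__(self, in_ch: int = 3):
            super().__init__()
            chs = [in_ch, 64, 128, 256, 512, 512]
            layers = []
            for c_in, c_out in zip(chs[:-1], chs[1:]):
                layers += [nn.Conv2d(c_in, c_out, 4, stride=2, padding=1),
                           nn.LeakyReLU(0.2, inplace=True)]
            layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                       nn.Linear(512, 1), nn.Sigmoid()]
            self.net = nn.Sequential(*layers)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.net(x)  # shape (N, 1): probability that x is real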
And step 208, acquiring a part training image corresponding to each training generation picture in the training generation picture data set, and inputting the part training image and the part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result.
Different face parts correspond to different local discriminators, and the categories of face parts can be user-defined, such as an eye part, a face part, a mouth part and a nose part. Each local discriminator is used to discriminate the authenticity of the local picture of its corresponding part. For example, the eye part corresponds to an eye local discriminator used to judge the authenticity of the eye-region picture, and the mouth part corresponds to a mouth discriminator used to judge the authenticity of the mouth-region picture.
Specifically, a component region corresponding to the training generated picture can be acquired according to the position of the face component of at least one category in the tag high-quality picture, so as to obtain a component training image corresponding to the component of at least one category. The local discriminator discriminates the part training image corresponding to the training generated picture and the part label image in the corresponding label high-quality picture data set to obtain a local discrimination result, and the output is a probability value which represents that the input picture is true and is between 0 and 1. The parameter of the local discriminator is adjusted in such a direction that the loss value of the local discriminator becomes smaller, so that the discrimination capability of the local discriminator becomes stronger.
And step 210, training a portrait restoration model, a global discriminator and a local discriminator based on the global discrimination result, the local discrimination result and a preset target loss function until the training is finished to obtain a trained portrait restoration model, wherein the trained portrait restoration model is used for restoring the portrait of the low-quality picture.
Specifically, the global discrimination result and the local discrimination results are used to obtain, respectively, the global generation countermeasure loss and the local generation countermeasure losses of the training generated pictures relative to the label high-quality pictures, so that the corresponding target generation countermeasure loss can be obtained from the global and local generation countermeasure losses. Other types of loss, such as image-content pixel loss and perception loss, can be calculated according to the loss types contained in the target loss function, so that the target loss is computed by the target loss function and used for back-propagation to adjust the network parameters of the portrait restoration model. The target loss function may be customized, and besides the generation countermeasure loss, the other loss categories it includes may also be customized.
In the training process, besides adjusting the network parameters of the portrait restoration model, the network parameters of the global discriminator and the local discriminators are also adjusted synchronously. The loss functions corresponding to the global discriminator and the local discriminators can each be customized; in one embodiment, the losses corresponding to the global discriminator and the local discriminators are obtained through cross-entropy calculation, and the network parameters of the global discriminator and the local discriminators are adjusted respectively. Parameter iteration proceeds in this loop until the training is finished; the condition for training completion can be user-defined, for example reaching a preset number of iterations, or the quality of the training generated pictures no longer improving. The trained portrait restoration model can take a low-quality picture as input and perform portrait restoration to obtain a picture with a high-quality portrait region, restoring the details of the human face and improving the definition of the picture.
In the training method of the portrait restoration model in this embodiment, a training data set is obtained, where the training data set includes a low-quality picture data set and a corresponding tagged high-quality picture data set; inputting the low-quality image data set into a portrait restoration model to be trained to obtain an output training generation image data set; inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result; acquiring a part training image corresponding to each training generation picture in a training generation picture data set; inputting the part training image and the part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result; training a portrait restoration model, a global discriminator and a local discriminator based on a global discrimination result, a local discrimination result and a preset target loss function until the training is completed to obtain a trained portrait restoration model, wherein the trained portrait restoration model is used for restoring the portrait of the low-quality picture, and the local discriminator corresponding to the face part is used for enhancing the detail information of the key part of the restored face, so that the restored image has rich and vivid face detail information, a clear image and high restored image quality.
In one embodiment, step 204 includes: extracting a plurality of coding features with different scales corresponding to the low-quality image data set through a coding network of a portrait restoration model; extracting a plurality of decoding features with different scales corresponding to the low-quality picture data set through a decoding network of a portrait restoration model; and fusing the coding features and the decoding features with the same scale, and obtaining an output training generation picture data set according to a fusion result.
Specifically, the network structures of the coding network and the decoding network can be designed as needed, and the number of different scales can be user-defined, for example extracting coding features at 5 different scales. Because coding features and decoding features of the same scale need to be fused, the number of scales is the same for the coding network and the decoding network: if the coding network extracts coding features at 5 different scales for the low-quality picture data set, the decoding network also extracts decoding features at 5 corresponding scales. With the combination of the coding network and the decoding network, skip connections fuse feature spectra of the same scale for information transmission; the specific fusion mode can be customized, and fusing shallow features into deep features through skip connections increases the generalization capability of the algorithm model. In one embodiment, a feature spectrum a of the coding network is superimposed on a feature spectrum b of the same scale in the decoding network, i.e. a + b, to obtain a new feature spectrum.
In the embodiment, information transmission can be performed by extracting coding features and decoding features of different scales for fusion, a training generated picture data set is generated, and the generation quality of the training generated picture is improved.
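The following PyTorch sketch illustrates the additive same-scale fusion (a + b) described above with only two scales; a real implementation would use more scales (for example the five mentioned above) and deeper convolution blocks:

    import torch
    import torch.nn as nn

    class TinyUNetGenerator(nn.Module):
        """Minimal encoder-decoder with additive skip connections; input
        height and width are assumed divisible by 2."""
        def __init__(self, ch: int = 32):
            super().__init__()
            self.enc1 = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
            self.enc2 = nn.Sequential(
                nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU())
            self.dec2 = nn.Sequential(
                nn.Conv2d(ch * 2, ch * 2, 3, padding=1), nn.ReLU())
            self.up = nn.ConvTranspose2d(ch * 2, ch, 2, stride=2)
            self.out = nn.Conv2d(ch, 3, 3, padding=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            a1 = self.enc1(x)        # encoder feature spectrum, full scale
            a2 = self.enc2(a1)       # encoder feature spectrum, 1/2 scale
            b2 = self.dec2(a2) + a2  # fuse same-scale spectra: a + b
            b1 = self.up(b2) + a1    # fuse again at full scale
            return torch.sigmoid(self.out(b1))  # restored picture in [0, 1]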
In one embodiment, as shown in fig. 3, acquiring a part training image corresponding to each training generation picture in the training generation picture data set in step 208 includes:
and step 208A, inputting each tagged high-quality picture in the tagged high-quality picture data set into a face analysis network to obtain the segmentation spectrum of each corresponding component in different categories, wherein the components in different categories comprise an eye component, a face component and a mouth component.
The face analysis network is used for identifying an input picture to obtain key components in a face, and can identify a plurality of components in different categories. The type and number of components can be customized as desired, and in this embodiment, the different types of components include an eye component, a face component, and a mouth component. In other embodiments, more or fewer components may be included, such as an eyebrow component, an ear component, etc., may also be included. The segmentation spectrum is used to describe the regions of different classes of components on the picture, and the different regions can be distinguished by different labels, as shown in fig. 4, which is a schematic diagram of the segmentation spectrum obtained in one embodiment, in which the regions where the eye component, the face component, and the mouth component are located are marked.
Specifically, each tagged high-quality picture is input into the face analysis network to obtain a segmentation spectrum of each corresponding component of different types, and the region position of the component of different types corresponding to each tagged high-quality picture in the training generated picture corresponding to each tagged high-quality picture can be further obtained through the region position of the component of different types in the segmentation spectrum in the tagged high-quality picture on the picture.
And step 208B, applying the segmentation spectrum corresponding to each label high-quality picture to the corresponding training generated picture to obtain a part training image corresponding to each different type of part of each training generated picture.
Specifically, component training images corresponding to the corresponding components of the respective categories in the training generated picture corresponding to the respective labeled high-quality pictures are obtained from the regions marked in the segmentation spectrum. If the areas of the eye part, the face part and the mouth part are obtained, the areas of the parts are respectively segmented from the picture to form corresponding part training images, such as an eye part training image, a face part training image and a mouth part training image.
In step 208, inputting the part training image and the part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the face part, and obtaining a local discrimination result includes: and respectively inputting the part training images into the local discriminators with matched classes to obtain local discrimination results output by the local discriminators corresponding to the parts with different classes.
Specifically, different types of components respectively correspond to different local discriminators, and each local discriminator has a corresponding loss function, so that the accuracy of distinguishing different components is improved. The output of the local discriminator is a probability value for judging whether the part label image of the original label high-quality picture and the generated part training image are true, and the probability value is between 0 and 1.
In the embodiment, the segmentation spectrum is obtained by labeling the high-quality picture, so that the position of each type of component can be more accurately obtained, the segmentation spectrum acts on the training generated picture, the obtained component training image is more accurate, the local judgment results of different types of components are respectively output by the corresponding local judgers, and the reliability of the local judgment results is improved.
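A minimal sketch of applying a segmentation spectrum to cut out one component region; the integer class map and the class ids for eyes, face and mouth are assumptions about the face analysis network's output format:

    import torch

    def crop_component(picture: torch.Tensor, seg: torch.Tensor,
                       class_id: int) -> torch.Tensor:
        """Cut the bounding box of one component class out of a generated or
        label picture. `picture` is (C, H, W); `seg` is an (H, W) integer
        class map. Empty masks and batching are not handled in this sketch."""
        ys, xs = torch.nonzero(seg == class_id, as_tuple=True)
        y0, y1 = ys.min().item(), ys.max().item() + 1
        x0, x1 = xs.min().item(), xs.max().item() + 1
        return picture[:, y0:y1, x0:x1]  # component training / label image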
In one embodiment, as shown in FIG. 5, step 210 comprises:
and step 210A, calculating to obtain global generation countermeasure loss based on the global discrimination result, and calculating to obtain local generation countermeasure loss based on the local discrimination result.
Specifically, the generation countermeasure loss is used to ensure that the generated image and the high-quality image have similar distribution, ensuring that the generated image is sufficiently realistic. Global generation opposition loss is used to guarantee the fidelity of the generated overall image, and local generation opposition loss is used to guarantee the fidelity of the generated local image. Algorithms for calculating the globally generated opposition loss and calculating the locally generated opposition loss can be customized and, in one embodiment, are calculated by the following equations, respectively.
min_G max_D L_gan(G, D) = E_x[log D(x)] + E_z[log(1 - D(G(z)))]
where G denotes the generator, G(·) denotes the output of the generator, z denotes a low-quality picture, x denotes a label high-quality picture, and D(·) denotes the output of the discriminator. When the global generation countermeasure loss is calculated, z represents the global low-quality picture, D(·) the global discrimination result output by the global discriminator, and x the global label high-quality picture. When a local generation countermeasure loss is calculated, z represents the part training image of one category of part, x represents the corresponding part label image in the label high-quality picture data set, and D(·) the local discrimination result output by the corresponding local discriminator. When there are several different categories of parts, there are several corresponding local discriminators and local discrimination results, giving several corresponding locally generated countermeasure losses.
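In code, the generator-side term is commonly computed as binary cross-entropy against a "real" target, i.e. the non-saturating -log D(G(z)) form; this is a usual stand-in for the minimax formula above rather than necessarily this application's exact choice:

    import torch
    import torch.nn.functional as F

    def generation_countermeasure_loss(d_out_fake: torch.Tensor) -> torch.Tensor:
        """Generator-side loss for one discriminator (global or local):
        push D(G(z)) toward 'real'. d_out_fake holds probabilities in (0, 1)."""
        return F.binary_cross_entropy(d_out_fake, torch.ones_like(d_out_fake))

    # One such term is computed per discriminator: the global one on whole
    # pictures, and one per face part (eyes, face, mouth) on the crops.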
And step 210B, obtaining target generation countermeasure loss according to the global generation countermeasure loss and the local generation countermeasure loss.
Specifically, a specific algorithm for obtaining the target generation countermeasure loss can be customized, the global generation countermeasure loss and the local generation countermeasure loss can be weighted to obtain the target generation countermeasure loss, and the weighting weight can be customized according to needs and specific scenes. The local generation countermeasure loss is added into the target generation countermeasure loss, so that vivid details can be generated at the corresponding portrait part, and the image quality is improved. In one embodiment, the calculation formula for the target generation countermeasure loss is as follows:
L_gan = a * L_gan_global + b * L_gan_face + c * L_gan_eyes + d * L_gan_mouth
where L_gan represents the target generation countermeasure loss; a, b, c and d are weighting weights; L_gan_global represents the globally generated countermeasure loss; L_gan_face represents the locally generated countermeasure loss of the face component; L_gan_eyes represents the locally generated countermeasure loss of the eye component; and L_gan_mouth represents the locally generated countermeasure loss of the mouth component.
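Expressed directly in code (the weight values are illustrative; the application leaves a, b, c and d user-defined):

    import torch

    def target_countermeasure_loss(l_global: torch.Tensor, l_face: torch.Tensor,
                                   l_eyes: torch.Tensor, l_mouth: torch.Tensor,
                                   a: float = 1.0, b: float = 0.5,
                                   c: float = 0.5, d: float = 0.5) -> torch.Tensor:
        # L_gan = a*L_gan_global + b*L_gan_face + c*L_gan_eyes + d*L_gan_mouth
        return a * l_global + b * l_face + c * l_eyes + d * l_mouth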
Step 210C, calculating pixel loss based on the training generated picture data set and the corresponding label high-quality picture data set.
Specifically, the pixel loss can ensure that the generated image is similar to the high-quality image in pixel level, the pixel loss is calculated according to the content pixels of the image, the specific calculation method can be customized, and in one embodiment, the pixel value difference of the matching position of the training generated image and the corresponding label high-quality image can be directly used as the pixel loss.
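For instance, a mean-squared difference over matching pixel positions (L1 would be an equally plausible custom choice):

    import torch
    import torch.nn.functional as F

    def pixel_loss(generated: torch.Tensor, label: torch.Tensor) -> torch.Tensor:
        """Pixel-level loss between the training generated picture and the
        corresponding label high-quality picture."""
        return F.mse_loss(generated, label)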
And step 210D, respectively extracting the features of the training generated picture data sets and the corresponding label high-quality picture data sets through a pre-training perception network to obtain feature maps, and calculating the distance between the feature maps to obtain the perception loss.
Specifically, the pre-training perception network refers to a trained network capable of perceiving image quality, for example a VGG-19 pre-trained network; through the perception loss, the image generation result can be made closer to the label high-quality picture. In one embodiment, the L2 distance between the training generated picture and the corresponding label high-quality picture is calculated on a VGG-19 network pre-trained on the ImageNet data set: the label high-quality picture is fed into VGG-19 to obtain a feature spectrum a, the training generated picture is fed into VGG-19 to obtain a feature spectrum b, and (a - b)^2 is the L2 distance, which gives the perception loss.
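A sketch of this perception loss using torchvision's ImageNet-pretrained VGG-19; which layer's feature spectrum to compare, and the omitted input normalization, are assumptions, since only the network itself is named:

    import torch
    import torch.nn as nn
    from torchvision.models import vgg19

    class PerceptualLoss(nn.Module):
        """Mean squared ((a - b)^2) distance between VGG-19 feature spectra of
        the label picture (a) and the training generated picture (b)."""
        def __init__(self, layers: int = 27):  # up to relu4_4, illustrative
            super().__init__()
            self.features = vgg19(weights="IMAGENET1K_V1").features[:layers].eval()
            for p in self.features.parameters():
                p.requires_grad_(False)

        def forward(self, generated: torch.Tensor,
                    label: torch.Tensor) -> torch.Tensor:
            a = self.features(label)      # feature spectrum a
            b = self.features(generated)  # feature spectrum b
            return ((a - b) ** 2).mean()  # L2 distance -> perception loss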
And step 210E, substituting the target generation countermeasure loss, the pixel loss and the perception loss into a preset target loss function to calculate the target loss, and adjusting the network parameters of the portrait restoration model according to the target loss.
Specifically, the specific formula of the target loss function can be customized, in one embodiment, different weighting weights are set for the target generation countermeasure loss, the pixel loss and the perceptual loss, the target generation countermeasure loss, the pixel loss and the perceptual loss are weighted and calculated by the weighting weights to obtain the target loss, and the target loss is calculated by the following formula:
L_total = α * L_gan + β * L_pixel + λ * L_perceptual
where L_total represents the target loss, L_gan represents the target generation countermeasure loss, L_pixel represents the pixel loss, and L_perceptual represents the perception loss.
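As a one-line sketch (α, β and λ are user-defined weighting weights; the defaults here are placeholders):

    def total_loss(l_gan, l_pixel, l_perceptual,
                   alpha: float = 1.0, beta: float = 1.0, lam: float = 1.0):
        # L_total = alpha*L_gan + beta*L_pixel + lambda*L_perceptual
        return alpha * l_gan + beta * l_pixel + lam * l_perceptual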
Network parameters of the portrait restoration model are adjusted through target loss, so that the target loss is smaller and smaller, and the image quality of the training generated pictures is improved continuously.
In the embodiment, the target loss is obtained through loss calculation of various different types, the generated image and the high-quality image are similar in pixel level and are vivid enough, the image quality is higher, and the addition of the local loss is favorable for generating vivid details at the corresponding portrait part.
In one embodiment, as shown in FIG. 6, step 210 comprises:
and step 210F, calculating to obtain global discrimination loss based on the global discrimination result and the corresponding global discrimination label, and adjusting the global discriminator according to the global discrimination loss.
Specifically, the global discrimination label is a real discrimination result of the global picture, if the picture is a training generated picture, the global discrimination label is an image which is false, and if the picture is a label high-quality picture, the global discrimination label is an image which is true. The global discrimination loss can be obtained by calculation according to the global discrimination result and the corresponding global discrimination label in a cross entropy mode, and the global discriminator is adjusted according to the global discrimination loss, so that the discrimination result of the global discriminator is adjusted towards the direction with high correct probability, and the discrimination capability of the global discriminator is continuously improved.
And step 210G, calculating local discrimination losses corresponding to the components of the different categories based on the local discrimination results corresponding to the components of the different categories and the corresponding local discrimination labels, and adjusting the matched local discriminators according to the local discrimination losses corresponding to the components of the different categories.
Specifically, the local discrimination label is a real discrimination result of a local picture corresponding to the component, and if the local picture corresponds to the component for training to generate the picture, the local discrimination label is that the image is false, and if the local picture corresponds to the component for labeling the high-quality picture, the local discrimination label is that the image is true. The local discrimination loss corresponding to each different type of component can be obtained by calculation according to the local discrimination result corresponding to each different type of component and the corresponding local discrimination label in a cross entropy mode, and the corresponding local discriminator is adjusted according to the local discrimination loss, so that the discrimination result of the local discriminator is adjusted towards the direction with high correct probability, and the discrimination capability of the local discriminator is continuously improved.
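Both the global and the local discriminator updates can share one cross-entropy helper, sketched below; real pictures or crops carry the discrimination label "true" (1) and generated ones "false" (0), matching the description above:

    import torch
    import torch.nn.functional as F

    def discrimination_loss(d_out: torch.Tensor, is_real: bool) -> torch.Tensor:
        """Cross-entropy discrimination loss; d_out holds probabilities."""
        target = torch.ones_like(d_out) if is_real else torch.zeros_like(d_out)
        return F.binary_cross_entropy(d_out, target)

    # Typical discriminator step (fake detached so only D is updated here):
    # d_loss = discrimination_loss(D(real), True) \
    #        + discrimination_loss(D(fake.detach()), False)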
In this embodiment, when the network parameters of the portrait restoration model are adjusted through the target loss, the global discriminator is adjusted through the global discrimination loss, and the matched local discriminator is adjusted through the local discrimination loss, so that the overall network formed by the portrait restoration model, the global discriminator and the local discriminator achieves more effective training, and the image restoration quality of the portrait restoration model obtained through training is higher.
In a specific embodiment, a method for training a portrait restoration model is provided, as shown in fig. 7, which is a data flow chart of the portrait restoration model, and specifically includes the following processes:
1. The tagged high-quality pictures in the tagged high-quality picture data set are degraded, mainly involving operations such as noise addition and blurring, to obtain corresponding low-quality pictures that form a low-quality picture data set.
2. The low-quality pictures in the low-quality picture data set are sent to the generator, namely the portrait restoration model to be trained; the generator restores the low-quality pictures into generated high-quality pictures, obtaining training generated pictures that form a training generated picture data set. As shown in fig. 8, a structural schematic diagram of the generator, a combination of an Encoder and a Decoder is used, and skip connections fuse feature spectra of the same scale for information transmission.
3. The training generated picture data set and the corresponding tagged high-quality picture data set are sent to the global discriminator to discriminate true from false, obtaining a global discrimination result. As shown in fig. 9, a structural schematic diagram of the global discriminator, the global discrimination result is obtained after 32-fold down-sampling.
4. The tagged high-quality pictures in the tagged high-quality picture data set are sent into a human face analysis network to obtain segmentation spectra of the eyes, face and mouth.
5. The segmentation spectrum is applied to the training generated pictures in the training generated picture data set to obtain part training images of the three parts. The part training images and the corresponding part label images in the tagged high-quality picture data set are input into the local discriminators corresponding to the human face parts, namely the local discriminators of the eye, face and mouth parts respectively, to obtain corresponding local discrimination results. As shown in fig. 10, a structural schematic diagram of a local discriminator, the scale of a segmented portrait part is greatly reduced compared with the original image, so the local discrimination result is obtained after 16-fold down-sampling.
6. The global generation countermeasure loss is calculated based on the global discrimination result, the local generation countermeasure losses are calculated based on the local discrimination results, and the target generation countermeasure loss is obtained from the global and local generation countermeasure losses. The pixel loss is calculated based on the training generated picture data set and the corresponding tagged high-quality picture data set, and the perception loss is obtained by calculating the L2 distance between each training generated picture and the corresponding tagged high-quality picture on a VGG-19 network pre-trained on the ImageNet data set.
7. The target generation countermeasure loss, the pixel loss and the perception loss are substituted into the preset target loss function to calculate the target loss, and the network parameters of the portrait restoration model are adjusted according to the target loss.
8. The global discrimination loss is calculated based on the global discrimination result and the corresponding global discrimination label, and the global discriminator is adjusted according to the global discrimination loss.
9. The local discrimination losses corresponding to the parts of different categories are calculated based on the local discrimination results corresponding to the parts of different categories and the corresponding local discrimination labels, and the matched local discriminators are adjusted according to these local discrimination losses.
10. When the training is finished, a trained portrait restoration model is obtained; the trained portrait restoration model is used for restoring the portrait of a low-quality picture.
In this embodiment, the detail information of the key parts of the repaired face is enhanced through three different local discriminators, and adding the three locally generated countermeasure losses helps generate vivid details at the corresponding portrait parts. The network parameters of the portrait restoration model are adjusted while the network parameters of the global discriminator and the local discriminators are synchronously adjusted for countermeasure learning, so that the trained portrait restoration model produces restored images with rich and vivid face detail information, clearer pictures and high restored-image quality.
In one embodiment, as shown in fig. 11, there is provided a portrait restoration method, including the steps of:
Step 302: obtain a portrait picture to be restored and input it into a trained portrait restoration model. The trained portrait restoration model is obtained by inputting a low-quality picture data set into a portrait restoration model to be trained to generate a training generated picture data set, and then training the portrait restoration model, the global discriminator and the local discriminators according to the global discrimination result, the local discrimination result and a preset target loss function until training is finished. The global discrimination result is obtained by inputting the training generated picture data set and the corresponding labeled high-quality picture data set into the global discriminator, and the local discrimination result is obtained by inputting the part training image corresponding to each training generated picture in the training generated picture data set, together with the part label image from the corresponding labeled high-quality picture data set, into the local discriminator corresponding to the face part.
The portrait picture to be restored is a low-quality picture containing a human face, for example one suffering from face blur, noise and the like. The picture can be collected in real time or downloaded from a picture library.
Specifically, the portrait picture to be restored is input into the trained portrait restoration model, and then the face area in the portrait picture to be restored can be restored. The trained portrait restoration model is obtained by training through the training method in the above embodiment, and the specific training process can be seen in the above embodiment.
Step 304: the portrait restoration model restores the face region in the portrait picture to be restored and outputs a corresponding high-quality portrait picture.
Specifically, the portrait restoration model restores the face region in the portrait picture to be restored, recovering the image details of each part of the face, such as the face, mouth, eyes, ears, and the like, to obtain a high-quality portrait picture. When a plurality of different face regions exist in the portrait picture to be restored, each face region can be restored to obtain the high-quality portrait picture.
In this embodiment, the face region in the portrait picture to be restored is restored by the trained portrait restoration model, and the corresponding high-quality portrait picture is output.
It is to be understood that the training process of the portrait restoration model in the present embodiment may refer to the steps in the above embodiments.
In an embodiment, obtaining the portrait picture to be restored and inputting it into the trained portrait restoration model in step 302 includes: identifying one or more original face regions in the portrait picture to be restored, and inputting the one or more original face regions into the trained portrait restoration model.
Specifically, when the portrait picture to be restored contains not only the face region but also other content, such as the body or the background, the one or more original face regions in the picture can first be identified to form images corresponding to those regions. These images are then input into the trained portrait restoration model, and the position of each original face region in the original image is recorded, for example as the coordinate range of the region.
Step 304 includes: the portrait restoration model obtains a restored face region corresponding to each original face region, and the restored face region is pasted back onto the matched area according to its recorded position in the portrait picture to be restored, to obtain the high-quality portrait picture.
Specifically, the portrait restoration model restores the images corresponding to the one or more input original face regions to obtain one or more corresponding restored face regions; each restored face region is then attached to the matching area of the original image according to the position information recorded earlier, yielding a high-quality portrait picture (a code sketch of this crop-restore-paste flow follows).
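A minimal sketch of this crop-restore-paste bookkeeping. `detect_faces` and `restore_face` are placeholder callables (assumptions; any face detector returning bounding boxes would do); only the record-position and paste-back logic follows the text.

```python
# Sketch of the crop-restore-paste flow for one or more face regions.
import numpy as np

def restore_portrait(image: np.ndarray, detect_faces, restore_face) -> np.ndarray:
    """image: HxWx3 uint8; detect_faces(image) -> iterable of (x, y, w, h);
    restore_face(crop) -> restored crop of the same shape."""
    result = image.copy()
    for (x, y, w, h) in detect_faces(image):           # one or more original face regions
        crop = image[y:y + h, x:x + w]                 # record the coordinate range
        result[y:y + h, x:x + w] = restore_face(crop)  # paste back at that position
    return result
```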
In this embodiment, one or more original face regions in the portrait picture to be restored are identified and input into the portrait restoration model; the face regions are restored to obtain high-definition face region images, which are then pasted back into the portrait picture according to the recorded positions to obtain the high-quality portrait picture. The face regions can be distinguished automatically from the other regions of the image, so the image quality restoration is targeted, intelligent and efficient.
In a specific embodiment, a portrait restoration method is provided, which specifically includes the following process:
1. A portrait picture to be restored is acquired and preprocessed, including operations such as portrait alignment and cropping.
2. The face regions in the preprocessed picture are cropped out by a face detection network to obtain one or more original face regions, and the one or more original face regions are input into the trained portrait restoration model to generate the corresponding restored face regions.
3. Each restored face region is pasted back onto the matched area according to its position in the portrait picture to be restored, to obtain the high-quality portrait picture.
In this embodiment, the face regions in the portrait picture to be restored are restored by the trained portrait restoration model: only the face regions are input into the model to generate the corresponding restored face regions, which are then pasted back onto the matched areas according to their positions, yielding the high-quality portrait picture. This is suitable for portrait pictures containing several faces. Because the trained portrait restoration model enhances the detail information of the key face parts through the local discriminators, the addition of the local generation countermeasure losses helps generate vivid details at the corresponding portrait parts; during training, the network parameters of the global and local discriminators are also adjusted synchronously for adversarial learning, so the restored image has rich and vivid face detail information, is clearer, and is of high quality.
The portrait restoration method of this embodiment was used to restore 358 low-quality portraits; the restoration effect is good, and details such as eyes, mouths and teeth are recovered well. Some results are shown in fig. 12, where the first row shows the old low-quality photos and the second row shows the pictures restored by the algorithm. Comparing the original images with the restored results, the restored images have rich and vivid face detail information and high detail quality.
The image quality evaluation algorithm BRISQUE was used to evaluate and score the 358 images before and after restoration in batch. To assess the restoration effect at different resolutions, the evaluation was run at two resolutions; the results are shown in the table below, where a smaller score indicates better image quality as judged by the algorithm:
Picture size        Raw data set        Post-repair data set
512×512             57.6                48.23
256×256             42.186              36.12
Therefore, the objective quality evaluation also indicates that images repaired by this portrait restoration method are of better quality (a sketch of how such batch scoring could be reproduced follows).
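Batch BRISQUE scoring at a fixed resolution could be reproduced along the following lines. The `piq` package is one publicly available BRISQUE implementation and is purely an assumption here; the text does not name the implementation used, and absolute scores differ between implementations.

```python
# Sketch of batch BRISQUE scoring at a fixed resolution (e.g. 512 or 256).
import torch
import piq
from torchvision.io import read_image
from torchvision.transforms.functional import resize

def mean_brisque(paths, size):
    scores = []
    for p in paths:
        img = read_image(p).float() / 255.0           # (C, H, W) in [0, 1]
        img = resize(img, [size, size]).unsqueeze(0)  # evaluate at the chosen resolution
        scores.append(piq.brisque(img, data_range=1.0).item())
    return sum(scores) / len(scores)                  # lower means better quality
```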
It should be understood that although the various steps in the flow charts of figs. 2-5 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in figs. 2-5 may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages need not be performed sequentially; they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
FIG. 13 is a block diagram of a training apparatus 500 for a portrait restoration model according to an embodiment. As shown in fig. 13, the training apparatus 500 for a portrait restoration model includes: an obtaining module 502, a picture generation module 504, a global discrimination module 506, a local discrimination module 508, and a training module 510, wherein:
An obtaining module 502 is configured to obtain a training data set, where the training data set includes a low-quality picture data set and a corresponding labeled high-quality picture data set.

The picture generation module 504 is configured to input the low-quality picture data set into the portrait restoration model to be trained, so as to obtain an output training generated picture data set.

The global discrimination module 506 is configured to input the training generated picture data set and the corresponding labeled high-quality picture data set into the global discriminator, so as to obtain a global discrimination result.

The local discrimination module 508 is configured to acquire the part training image corresponding to each training generated picture in the training generated picture data set, and to input the part training image and the part label image in the corresponding labeled high-quality picture data set into the local discriminator corresponding to the face part, so as to obtain a local discrimination result.

The training module 510 is configured to train the portrait restoration model, the global discriminator and the local discriminators based on the global discrimination result, the local discrimination result and a preset target loss function until training is completed, so as to obtain a trained portrait restoration model; the trained portrait restoration model is used to perform portrait restoration on low-quality pictures.
The training apparatus 500 of the portrait restoration model in this embodiment obtains a training data set, where the training data set includes a low-quality picture data set and a corresponding labeled high-quality picture data set; inputs the low-quality picture data set into the portrait restoration model to be trained to obtain an output training generated picture data set; inputs the training generated picture data set and the corresponding labeled high-quality picture data set into the global discriminator to obtain a global discrimination result; acquires the part training image corresponding to each training generated picture in the training generated picture data set; inputs the part training image and the part label image in the corresponding labeled high-quality picture data set into the local discriminator corresponding to the face part to obtain a local discrimination result; and trains the portrait restoration model, the global discriminator and the local discriminators based on the global discrimination result, the local discrimination result and a preset target loss function until training is completed, obtaining a trained portrait restoration model for restoring the portrait in a low-quality picture. Because the local discriminators corresponding to the face parts enhance the detail information of the key parts of the repaired face, the restored image has rich and vivid face detail information, the image is clearer, and the restored image quality is high.
In one embodiment, the picture generation module 504 is further configured to extract, through a coding network of the portrait restoration model, a plurality of coding features of different scales corresponding to the low-quality picture data set; extracting a plurality of decoding features with different scales corresponding to the low-quality picture data set through a decoding network of a portrait restoration model; and fusing the coding features and the decoding features with the same scale, and obtaining an output training generation picture data set according to a fusion result.
In this embodiment, the training apparatus 500 extracts coding features and decoding features at different scales and fuses them, which transfers information across the network, generates the training generated picture data set, and improves the generation quality of the training generated pictures (a fusion sketch follows).
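A compact sketch of such same-scale fusion, assuming a U-Net-style encoder-decoder with concatenation as the fusion operation; the channel widths, the depth and the concatenation itself are illustrative assumptions, since the text specifies only that coding and decoding features of the same scale are fused.

```python
# Sketch of same-scale encoder/decoder feature fusion, U-Net style.
import torch
import torch.nn as nn

class FusionGenerator(nn.Module):
    def __init__(self, chs=(64, 128, 256)):
        super().__init__()
        prev = 3
        self.encoders = nn.ModuleList()
        for c in chs:                                 # multi-scale coding features
            self.encoders.append(nn.Sequential(
                nn.Conv2d(prev, c, 3, stride=2, padding=1), nn.ReLU()))
            prev = c
        outs = list(reversed(chs))[1:] + [64]
        self.decoders = nn.ModuleList(
            nn.Sequential(nn.ConvTranspose2d(c_in * 2, c_out, 4, stride=2, padding=1),
                          nn.ReLU())
            for c_in, c_out in zip(reversed(chs), outs))
        self.out = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, x):
        skips = []
        for enc in self.encoders:                     # encoding path
            x = enc(x)
            skips.append(x)                           # keep each scale
        for dec, skip in zip(self.decoders, reversed(skips)):
            x = dec(torch.cat([x, skip], dim=1))      # fuse same-scale features
        return self.out(x)
```

Fusing the shallow, high-resolution encoder features into the decoder carries fine spatial detail forward, which is why this kind of skip connection tends to improve the quality of the generated pictures.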
In one embodiment, the local discrimination module 508 is further configured to input each tagged high-quality picture in the tagged high-quality picture data set into a face analysis network to obtain a segmentation spectrum of each corresponding different class of components, where the different class of components includes an eye component, a face component, and a mouth component; applying the segmentation spectrum corresponding to each label high-quality picture to the corresponding training generation picture to obtain a component training image corresponding to each different type of component of each training generation picture; and respectively inputting the part training images into the local discriminators with matched classes to obtain local discrimination results output by the local discriminators corresponding to the parts with different classes.
In the training apparatus 500, the segmentation spectrums are obtained from the labeled high-quality pictures, so the position of each class of component can be located more accurately; applying each segmentation spectrum to the corresponding training generated picture makes the resulting component training images more accurate, and the local discrimination results for the different classes of components are output by the corresponding local discriminators, which improves the reliability of the local discrimination results.
In one embodiment, the training module 510 is further configured to calculate a global generation countermeasure loss based on the global discrimination result; calculating to obtain local generation countermeasure loss based on the local discrimination result; obtaining target generation countermeasure loss according to the global generation countermeasure loss and the local generation countermeasure loss; calculating to obtain pixel loss based on the training generated picture data set and the corresponding label high-quality picture data set; respectively extracting the features of the training generated picture data set and the corresponding label high-quality picture data set through a pre-training perception network to obtain feature maps, and calculating the distance between the feature maps to obtain perception loss; substituting the target generation countermeasure loss, the pixel loss and the perception loss into a preset target loss function to calculate to obtain a target loss; and adjusting network parameters of the portrait restoration model according to the target loss.
In this embodiment, the target loss is obtained by combining several different types of losses, so that the generated image is similar to the high-quality image at the pixel level and sufficiently realistic, giving higher image quality; the addition of the local losses helps generate vivid details at the corresponding portrait parts.
In one embodiment, the training module 510 is further configured to calculate a global discriminant loss based on the global discriminant result and the corresponding global discriminant label, and adjust the global discriminator according to the global discriminant loss; and calculating to obtain local discrimination losses corresponding to the components of different classes based on the local discrimination results corresponding to the components of different classes and the corresponding local discrimination labels, and adjusting the matched local discriminators according to the local discrimination losses corresponding to the components of different classes.
In this embodiment, when the network parameters of the portrait restoration model are adjusted through the target loss, the global discriminator is adjusted through the global discrimination loss, and the matched local discriminator is adjusted through the local discrimination loss, so that the overall network formed by the portrait restoration model, the global discriminator and the local discriminator achieves more effective training, and the image restoration quality of the portrait restoration model obtained through training is higher.
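One possible shape of a single training iteration consistent with this description, reusing `target_loss`, `global_D` and `local_Ds` from the earlier sketches; the alternating generator/discriminator order and the single shared discriminator optimizer are assumptions.

```python
# Sketch of one training iteration: adjust the portrait restoration model G
# from the target loss, then adjust the global and matched local
# discriminators from their own discrimination losses.
import torch
import torch.nn.functional as F

def d_loss(real_logits, fake_logits):
    """Discrimination loss against labels: real -> 1, generated -> 0."""
    return (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits)) +
            F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))

def train_step(G, global_D, local_Ds, opt_G, opt_D,
               low_q, high_q, part_masks, target_loss):
    fake = G(low_q)

    # 1) adjust the portrait restoration model via the preset target loss
    g_logits = global_D(fake)
    l_logits = [local_Ds[k](fake * part_masks[k]) for k in local_Ds]
    opt_G.zero_grad()
    target_loss(g_logits, l_logits, fake, high_q).backward()
    opt_G.step()

    # 2) adjust the global discriminator and each matched local discriminator
    fake = fake.detach()
    loss_d = d_loss(global_D(high_q), global_D(fake))
    for k in local_Ds:
        loss_d = loss_d + d_loss(local_Ds[k](high_q * part_masks[k]),
                                 local_Ds[k](fake * part_masks[k]))
    opt_D.zero_grad()
    loss_d.backward()
    opt_D.step()
```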
In one embodiment, as shown in fig. 14, there is provided a portrait repair apparatus 600 including:
An input module 602 is configured to obtain a portrait picture to be restored and input it into a trained portrait restoration model. The trained portrait restoration model is obtained by inputting a low-quality picture data set into the portrait restoration model to be trained to generate a training generated picture data set, and then training the portrait restoration model, the global discriminator and the local discriminators according to the global discrimination result, the local discrimination result and the preset target loss function until training is finished; the global discrimination result is obtained by inputting the training generated picture data set and the corresponding labeled high-quality picture data set into the global discriminator, and the local discrimination result is obtained by inputting the part training image corresponding to each training generated picture in the training generated picture data set, together with the part label image in the corresponding labeled high-quality picture data set, into the local discriminator corresponding to the face part.
The repairing module 604 is configured to restore the face region in the portrait picture to be restored through the portrait restoration model, and to output a corresponding high-quality portrait picture.
In this embodiment, the face region in the portrait picture to be restored is restored by the trained portrait restoration model, and the corresponding high-quality portrait picture is output.
In one embodiment, the input module 602 is further configured to identify one or more original face regions in the portrait picture to be restored and to input them into the trained portrait restoration model. The repairing module 604 is further configured to obtain, through the portrait restoration model, a restored face region corresponding to each original face region, and to paste the restored face region onto the matched area according to its position in the portrait picture to be restored, so as to obtain the high-quality portrait picture.
In this embodiment, one or more original face regions in the portrait picture to be restored are identified and input into the portrait restoration model; the face regions are restored to obtain high-definition face region images, which are then pasted back into the portrait picture according to the recorded positions to obtain the high-quality portrait picture. The face regions can be distinguished automatically from the other regions of the image, so the image quality restoration is targeted, intelligent and efficient.
For the specific limitations of the portrait restoration model training device and the portrait restoration device, reference may be made to the above limitations of the portrait restoration model training method and the portrait restoration method, which are not described herein again. The modules in the portrait restoration model training device and the portrait restoration device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
Fig. 15 is a schematic diagram of the internal structure of an electronic device in one embodiment. As shown in fig. 15, the electronic device includes a processor and a memory connected by a system bus. The processor provides computing and control capability and supports the operation of the whole electronic device. The memory may include a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The computer program can be executed by the processor to implement the portrait restoration model training method or the portrait restoration method provided in the above embodiments. The internal memory provides a cached execution environment for the operating system and the computer program in the non-volatile storage medium. The electronic device may be a mobile phone, a server, etc.
The portrait restoration model training device and the portrait restoration device provided in the embodiments of the present application may be implemented in the form of a computer program. The computer program may be run on a terminal or a server. The program modules constituted by the computer program may be stored on the memory of the terminal or the server. When executed by a processor, the computer program implements a portrait restoration model training method or a portrait restoration method described in the embodiments of the present application.
The embodiment of the application also provides a computer readable storage medium. One or more non-transitory computer-readable storage media containing computer-executable instructions that, when executed by one or more processors, cause the processors to perform a portrait restoration model training method or a portrait restoration method as described in embodiments of the present application.
A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of training a portrait restoration model or the method of portrait restoration described in embodiments of the present application.
Any reference to memory, storage, database, or other medium used herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM), which acts as external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (11)

1. A training method of a portrait restoration model is characterized by comprising the following steps:
acquiring a training data set, wherein the training data set comprises a low-quality picture data set and a corresponding label high-quality picture data set;
inputting the low-quality picture data set into a portrait restoration model to be trained to obtain an output training generated picture data set;
inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result;
acquiring a part training image corresponding to each training generated picture in the training generated picture data set;
inputting the part training image and the part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result;
and training the portrait restoration model, the global discriminator and the local discriminator based on the global discrimination result, the local discrimination result and a preset target loss function until the training is finished to obtain a trained portrait restoration model, wherein the trained portrait restoration model is used for restoring the portrait of the low-quality picture.
2. The method of claim 1, wherein inputting the low-quality picture data set into the portrait restoration model to be trained and obtaining an output training generated picture data set comprises:
extracting a plurality of coding features with different scales corresponding to the low-quality image data set through a coding network of the portrait restoration model;
extracting a plurality of decoding features with different scales corresponding to the low-quality picture data set through a decoding network of the portrait restoration model;
and fusing the coding features and the decoding features with the same scale, and obtaining the output training generation picture data set according to a fusion result.
3. The method of claim 1, wherein acquiring a part training image corresponding to each training generated picture in the training generated picture data set comprises:
inputting each tagged high-quality picture in the tagged high-quality picture data set into a face analysis network to obtain a segmentation spectrum of each corresponding component in different categories, wherein the components in different categories comprise an eye component, a face component and a mouth component;
applying the segmentation spectrum corresponding to each label high-quality picture to the corresponding training generation picture to obtain a component training image corresponding to each different type of component of each training generation picture;
inputting the part training image and the part label image in the corresponding label high-quality picture data set into a local discriminator corresponding to the human face part to obtain a local discrimination result, wherein the local discrimination result comprises the following steps:
and respectively inputting the part training images into the local discriminators with matched classes to obtain local discrimination results output by the local discriminators corresponding to the parts with different classes.
4. The method of claim 1, wherein training the portrait restoration model, the global discriminator and the local discriminators based on the global discrimination result, the local discrimination result and a preset target loss function comprises:
calculating to obtain global generation countermeasure loss based on the global discrimination result;
calculating to obtain local generation countermeasure loss based on the local discrimination result;
obtaining target generation countermeasure loss according to the global generation countermeasure loss and the local generation countermeasure loss;
calculating to obtain pixel loss based on the training generated picture data set and the corresponding label high-quality picture data set;
respectively extracting the features of the training generated picture data set and the corresponding label high-quality picture data set through a pre-training perception network to obtain feature maps, and calculating the distance between the feature maps to obtain perception loss;
substituting the target generation countermeasure loss, the pixel loss and the perception loss into a preset target loss function to calculate to obtain a target loss;
and adjusting the network parameters of the portrait restoration model according to the target loss.
5. The method of claim 4, wherein training the portrait restoration model, the global discriminator and the local discriminators based on the global discrimination result, the local discrimination result and a preset target loss function comprises:
calculating to obtain global discrimination loss based on a global discrimination result and a corresponding global discrimination label, and adjusting the global discriminator according to the global discrimination loss;
and calculating to obtain local discrimination losses corresponding to the components of different classes based on the local discrimination results corresponding to the components of different classes and the corresponding local discrimination labels, and adjusting the matched local discriminators according to the local discrimination losses corresponding to the components of different classes.
6. A method of portrait restoration, comprising:
acquiring a portrait picture to be restored, and inputting the portrait picture to be restored into a trained portrait restoration model, wherein the trained portrait restoration model is obtained by inputting a low-quality picture data set into a portrait restoration model to be trained to generate a training generated picture data set, and training the portrait restoration model, a global discriminator and local discriminators according to a global discrimination result, a local discrimination result and a preset target loss function until training is finished; the global discrimination result is obtained by inputting the training generated picture data set and the corresponding labeled high-quality picture data set into the global discriminator, and the local discrimination result is obtained by inputting a part training image corresponding to each training generated picture in the training generated picture data set and a part label image in the corresponding labeled high-quality picture data set into the local discriminator corresponding to the face part;
and the portrait restoration model restores the face area in the portrait picture to be restored and outputs a corresponding portrait picture with high image quality.
7. The method according to claim 6, wherein the obtaining the portrait picture to be restored and inputting the portrait picture to be restored into the trained portrait restoration model comprises:
identifying one or more original face regions in the portrait picture to be restored, and inputting the one or more original face regions into a trained portrait restoration model;
the portrait restoration model restoring the face region in the portrait picture to be restored and outputting the corresponding high-quality portrait picture comprises the following steps:

the portrait restoration model obtains a restored face region corresponding to each original face region;

and the restored face region is pasted onto the matched area according to its position in the portrait picture to be restored, to obtain the high-quality portrait picture.
8. A training device for a portrait restoration model, comprising:
an acquisition module, configured to acquire a training data set, where the training data set includes a low-quality picture data set and a corresponding tagged high-quality picture data set;
the image generation module is used for inputting the low-quality image data set into a portrait restoration model to be trained to obtain an output training generation image data set;
the global discrimination module is used for inputting the training generated picture data set and the corresponding label high-quality picture data set into a global discriminator to obtain a global discrimination result;
a local discrimination module, configured to acquire a part training image corresponding to each training generated picture in the training generated picture data set, and to input the part training image and the part label image in the corresponding label high-quality picture data set into the local discriminator corresponding to the face part, so as to obtain a local discrimination result;
and the training module is used for training the portrait restoration model, the global discriminator and the local discriminator based on the global discrimination result, the local discrimination result and a preset target loss function until the training is finished to obtain a trained portrait restoration model, and the trained portrait restoration model is used for restoring the portrait of the low-quality picture.
9. A portrait restoration device, comprising:
an input module, configured to obtain a portrait picture to be restored and to input the portrait picture to be restored into a trained portrait restoration model, wherein the trained portrait restoration model is obtained by inputting a low-quality picture data set into a portrait restoration model to be trained to generate a training generated picture data set, and training the portrait restoration model, a global discriminator and local discriminators according to a global discrimination result, a local discrimination result and a preset target loss function until training is finished; the global discrimination result is obtained by inputting the training generated picture data set and the corresponding labeled high-quality picture data set into the global discriminator, and the local discrimination result is obtained by inputting a part training image corresponding to each training generated picture in the training generated picture data set and a part label image in the corresponding labeled high-quality picture data set into the local discriminator corresponding to the face part;
and a restoration module, configured to restore the face region in the portrait picture to be restored through the portrait restoration model and to output a corresponding high-quality portrait picture.
10. An electronic device comprising a memory and a processor, the memory having stored therein a computer program that, when executed by the processor, causes the processor to perform the method of any of claims 1-5 or 6-7.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1 to 5 or 6 to 7.
CN202110395414.9A 2021-04-13 2021-04-13 Portrait restoration model training method and device and electronic equipment Withdrawn CN113139915A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110395414.9A CN113139915A (en) 2021-04-13 2021-04-13 Portrait restoration model training method and device and electronic equipment


Publications (1)

Publication Number Publication Date
CN113139915A true CN113139915A (en) 2021-07-20

Family

ID=76812226

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110395414.9A Withdrawn CN113139915A (en) 2021-04-13 2021-04-13 Portrait restoration model training method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113139915A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109191402A (en) * 2018-09-03 2019-01-11 武汉大学 The image repair method and system of neural network are generated based on confrontation
CN110570366A (en) * 2019-08-16 2019-12-13 西安理工大学 Image restoration method based on double-discrimination depth convolution generation type countermeasure network
CN111507914A (en) * 2020-04-10 2020-08-07 北京百度网讯科技有限公司 Training method, repairing method, device, equipment and medium of face repairing model
CN111553864A (en) * 2020-04-30 2020-08-18 深圳市商汤科技有限公司 Image restoration method and device, electronic equipment and storage medium
CN111681182A (en) * 2020-06-04 2020-09-18 Oppo广东移动通信有限公司 Picture restoration method and device, terminal equipment and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114511466A (en) * 2022-02-21 2022-05-17 北京大学深圳研究生院 Blind person face image restoration method based on generation of confrontation network prior
CN114511466B (en) * 2022-02-21 2024-04-26 北京大学深圳研究生院 Blind face image restoration method based on generation of countermeasure network priori
CN115376188A (en) * 2022-08-17 2022-11-22 天翼爱音乐文化科技有限公司 Video call processing method, system, electronic equipment and storage medium
CN115376188B (en) * 2022-08-17 2023-10-24 天翼爱音乐文化科技有限公司 Video call processing method, system, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US20230116801A1 (en) Image authenticity detection method and device, computer device, and storage medium
CN110490212A (en) Molybdenum target image processing arrangement, method and apparatus
WO2021179471A1 (en) Face blur detection method and apparatus, computer device and storage medium
CN111680672B (en) Face living body detection method, system, device, computer equipment and storage medium
US11176724B1 (en) Identity preserving realistic talking face generation using audio speech of a user
CN113139915A (en) Portrait restoration model training method and device and electronic equipment
CN113112416B (en) Semantic-guided face image restoration method
CN114596290A (en) Defect detection method, defect detection device, storage medium, and program product
CN115909172A (en) Depth-forged video detection, segmentation and identification system, terminal and storage medium
CN115731597A (en) Automatic segmentation and restoration management platform and method for mask image of face mask
CN115497139A (en) Method for detecting and identifying face covered by mask and integrating attention mechanism
WO2023279799A1 (en) Object identification method and apparatus, and electronic system
CN112101320A (en) Model training method, image generation method, device, equipment and storage medium
CN113627233A (en) Visual semantic information-based face counterfeiting detection method and device
CN112132766A (en) Image restoration method and device, storage medium and electronic device
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN116994175A (en) Space-time combination detection method, device and equipment for depth fake video
CN111696090A (en) Method for evaluating quality of face image in unconstrained environment
CN115147705A (en) Face copying detection method and device, electronic equipment and storage medium
CN113468954B (en) Face counterfeiting detection method based on local area features under multiple channels
CN114399824A (en) Multi-angle side face correction method and device, computer equipment and medium
CN115708135A (en) Face recognition model processing method, face recognition method and device
CN113688698A (en) Face correction recognition method and system based on artificial intelligence
CN117593216A (en) Training method of image restoration model, image restoration method and related device
CN114549409A (en) Image processing method, image processing device, storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210720