CN112991484A - Intelligent face editing method and device, storage medium and equipment - Google Patents
- Publication number
- CN112991484A (application CN202110466411.XA, filed 2021-04-28)
- Authority
- CN
- China
- Prior art keywords
- image
- geometric
- face
- appearance
- local
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention relates to an intelligent face editing method and device, and to a corresponding storage medium and equipment, applicable to the fields of computer vision and computer graphics. The technical scheme adopted by the invention is as follows: a face geometric feature image and a face appearance feature image are input, part by part, into corresponding trained local decoupling modules, which extract the geometric features and appearance features of each face part; the local decoupling modules generate local images and local intermediate feature maps for the face parts from the geometric and appearance features; and a trained global fusion module fuses the local intermediate feature maps of all face parts to generate a final face image with the specified geometric and appearance features.
Description
Technical Field
The invention relates to an intelligent face editing method and device, and to a corresponding storage medium and equipment. It is applicable to the fields of computer vision and computer graphics.
Background
Face synthesis is one of the important topics in the field of digital image processing, and many related techniques target high-quality face synthesis. Face synthesis techniques based on deep learning mainly fall into two types: synthesizing a new face by Gaussian sampling with a generative adversarial network (GAN); and synthesizing a corresponding face with a conditional generative adversarial network, taking information such as a semantic label map, a sketch or attribute labels as input. Although there are many techniques for synthesizing a realistic face from a sketch, most of them cannot control the appearance of the generated face, or the quality of the synthesized face is poor.
For face editing, some prior art techniques use labeled face-attribute data to decouple the hidden space of a GAN, and edit the attributes of a face by manipulating projection codes in the hidden space. However, these techniques can only edit specific attributes and cannot modify content beyond the attribute labels, so the degree of freedom is low. Some techniques use semantic label maps to edit faces, but because their input lacks geometric information, they cannot edit geometric details of the face such as wrinkles or the flow of the hair. Some techniques use sketches to edit faces, but they are based on image completion and are more limited.
Portenier et al., in "FaceShop: Deep Sketch-based Face Image Editing" (ACM Transactions on Graphics, 2018), propose a sketch-based face editing system that edits a face using a mask, a sketch and color strokes. Jo et al., in "SC-FEGAN: Face Editing Generative Adversarial Network with User's Sketch and Color" (Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019), use a style loss to generate higher-quality, more robust results. However, these prior techniques cannot edit the overall appearance of the face, and cannot generate a realistic, vivid face when a pure sketch is used as input.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the existing problems, an intelligent face editing method, device, storage medium and equipment are provided.
The technical scheme adopted by the invention is as follows: an intelligent face editing method is characterized in that:
inputting the geometric feature image and the appearance feature image of the face into corresponding trained local decoupling modules according to the face parts, and extracting the corresponding geometric features and appearance features of each part of the face;
the local decoupling module generates local images and local intermediate feature images corresponding to all parts of the human face based on the geometric features and the appearance features;
and fusing the local intermediate feature images corresponding to all parts of the human face through the trained global fusion module to generate a final human face image with the geometric features and the appearance features.
The geometric feature image is a sketch of a face or a real face image, and the local decoupling module comprises a sketch encoder E_s, an image encoder E_i, an appearance encoder E_a, and an image synthesis generator G;
The extracting of the geometric features corresponding to each part of the human face from the geometric feature image includes:
training the sketch encoder E_s together with a sketch decoder D_s so that the hidden space at the bottleneck layer is a low-resolution feature map of dimension H × W × C, where H, W and C are the height, width and channel number of the geometric feature map;
training the image encoder E_i to map the corresponding real face image I into the geometric hidden space of the sketch representation, denoted as z_i = E_i(I).
When z_s and z_i are fed into the pre-trained decoder D_s, the algorithm imposes constraints on each layer of D_s, and also adds an L1 loss between the final outputs D_s(z_s) and D_s(z_i).
The geometric feature extraction training of the local decoupling module comprises the following steps:
first, E_s and D_s are trained to learn the geometric hidden space of the sketch using an L1 reconstruction loss function; once E_s is trained, the geometric features of a sketch S are expressed as z_s = E_s(S);
the network E_i is then trained to take the real face image I as input and predict geometric features z_i = E_i(I), so that z_i follows the same distribution as the learned geometric space; the loss function is defined as follows:
L_geo = Σ_{n=0}^{N} || D_s^n(z_i) − D_s^n(z_s) ||_1
where N is the number of layers of the decoder D_s; index 0 corresponds to the input feature map, index N to the output image, and the other indices to intermediate feature maps.
The appearance encoder E_a eliminates the spatial information of the face appearance feature image by global average pooling and extracts appearance features that are independent of the geometric features.
The appearance encoder and the image synthesis generator adopt an exchange training strategy;
the geometric features z_1 of a face geometric image I_1 and the appearance features a_2 of a face appearance feature image I_2 are used to generate an image I_{1→2}, i.e. I_{1→2} = G(z_1, a_2);
the appearance features a_1 of I_1 and the geometric features of I_{1→2} are used to cyclically reconstruct the image, i.e. I_1^cyc = G(E(I_{1→2}), a_1), where E denotes the geometric encoder.
Training a local decoupling module by adopting the following loss functions, comprising:
a. self-reconstruction loss:
L_rec = λ_perc L_perc + λ_fm L_fm + λ_color L_color
wherein L_perc represents the perceptual loss; L_fm represents the feature matching loss of the discriminator; L_color represents the color loss, which converts the image to the CIE-Lab color space and controls hue by computing the chroma distance in the a and b channels (the a and b channels contain the color information in the CIE-Lab color space); the weights λ_perc, λ_fm and λ_color are set empirically;
b. cycle exchange loss:
L_swap = L_geo-swap + L_cyc
i.e. the sum of a geometric-invariance term on the swapped image and a cyclic reconstruction term;
c. adversarial loss:
the distribution of the generated images is constrained by a multi-scale discriminator D to match the distribution of the real images:
L_adv = E_I [log D(I)] + E_{z,a} [log(1 − D(G(z, a)))]
An intelligent face editing device, comprising:
the characteristic extraction unit is used for inputting the human face geometric characteristic image and the human face appearance characteristic image into corresponding trained local decoupling modules according to human face parts and extracting the geometric characteristics and appearance characteristics corresponding to all parts of the human face;
the image generation unit is used for generating local images and local intermediate feature images corresponding to all parts of the human face by the local decoupling module based on the geometric features and the appearance features;
and the image fusion unit is used for fusing the local intermediate feature images corresponding to all parts of the human face through the trained global fusion module to generate a final human face image with the geometric features and the appearance features.
A storage medium having stored thereon a computer program executable by a processor, wherein the computer program, when executed, implements the steps of the intelligent face editing method.
A computer device having a memory and a processor, the memory having stored thereon a computer program executable by the processor, wherein the computer program, when executed, implements the steps of the intelligent face editing method.
The invention has the following beneficial effects: the invention is a sketch-based face synthesis and editing technique; the geometric information carried by a sketch is rich, the geometric details of a face can be controlled, and the technique is more flexible than the prior art. Meanwhile, the decoupling technique can replace the appearance of a face and edit information such as skin color and hair color.
The invention divides the human face into five parts of a left eye, a right eye, a nose, a mouth and a background, combines a local decoupling module and a global fusion module, can respectively edit local geometric and appearance characteristics, and has higher quality of a synthetic result at a local detail position.
The local decoupling module of the invention codes the image and the sketch into the same space, thereby ensuring the decoupling of information. In the training process, the geometric and appearance of different images are combined by using exchange operation to generate an intermediate result, and the result is respectively subjected to geometric and appearance constraints, so that geometric information can be extracted from a human face or a real image and combined with the appearance of other images to generate a new local synthesis result.
In the invention, the local decoupling modules generate intermediate feature maps, and the global fusion module splices these intermediate feature maps at fixed positions; a downsampling network, a residual network and an upsampling network are then applied, and the block-spliced result is fused by optimizing the network with a GAN discriminator and the related loss functions. Splicing the intermediate feature maps synthesized by the local decoupling modules generates a face with high realism.
Drawings
Fig. 1 is a network framework structure diagram (showing a structured local-to-global training strategy, and showing a training strategy of exchange and loop reconstruction of a local decoupling module) of the embodiment.
Fig. 2 is a schematic diagram of the structure and training strategy of the geometric encoder in the embodiment (encoding the sketch geometry and the real image into the same potential space, and extracting geometric information from the real image and the sketch).
Fig. 3 shows the face generation results of the geometry and appearance exchange in the example (the first row provides appearance information and the first column provides geometry information).
Fig. 4 shows the partial editing results using sketches in the embodiment (the input sketch is edited step by step, and the system generates the corresponding face editing result, providing the user with a high degree of freedom and creativity).
Fig. 5 shows the result of the partial appearance editing in the embodiment (geometric features of the fixed image, which is generated by replacing the appearance reference image of the eyes and mouth).
Fig. 6 shows the result of interpolation of geometry and appearance in the embodiment (the images in the upper left corner and the lower right corner are real images, and the rest are interpolation generation results).
Detailed Description
As shown in fig. 1, the present embodiment provides an intelligent face editing method by using an image decoupling technology and adopting a local-to-global method.
The face is divided into 5 parts in this embodiment: left eye, right eye, nose, mouth and background. After image blocking is completed, a local decoupling module is designed to extract and generate decoupled features for the image of each part; after the generation result of each part is obtained, the block results are spliced and fused by the global fusion module to obtain a globally consistent face image.
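The blocking step above can be sketched as fixed-position crops on an aligned face image. The crop boxes below are illustrative assumptions for a 512 × 512 aligned face; the patent does not give the actual coordinates.

```python
import numpy as np

# Illustrative fixed crop boxes (top, left, height, width) on a 512x512
# aligned face. These coordinates are assumptions for demonstration only.
PART_BOXES = {
    "left_eye":  (180, 108, 128, 128),
    "right_eye": (180, 276, 128, 128),
    "nose":      (232, 182, 160, 148),
    "mouth":     (332, 169, 120, 174),
}

def split_face(image):
    """Split an aligned face into four part crops plus the background.

    The local parts are cropped at fixed positions; the background
    branch keeps the whole frame, matching the local-to-global pipeline.
    """
    parts = {}
    for name, (top, left, h, w) in PART_BOXES.items():
        parts[name] = image[top:top + h, left:left + w].copy()
    parts["background"] = image.copy()  # background sees the full frame
    return parts

face = np.zeros((512, 512, 3), dtype=np.float32)
parts = split_face(face)
```

Each of the five crops is then handled by its own local decoupling module.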
The network structure of this embodiment comprises 5 local decoupling modules that decouple geometric and appearance information, and 1 global fusion module that fuses the local features and generates high-quality, globally consistent results. During network training, an exchange strategy is used and cycle-consistent feature constraints are designed, ensuring the robustness and generalization of the network framework.
The intelligent face editing method in the embodiment specifically comprises the following steps:
1) local decoupling: inputting a human face geometric feature image (the image comprises the geometric features of the human face) and a human face appearance feature image (the image comprises the appearance features of the human face) into a corresponding trained local decoupling module according to the human face part, and extracting the geometric features and the appearance features corresponding to all parts of the human face, wherein the geometric feature image is a human face sketch or a human face real image.
The geometric features mainly comprise two aspects: 1. shape information, such as the shape of the facial features, the face shape and the length of the hair; 2. geometric details, i.e. the fine geometric structure of the face, such as wrinkles and the flow of the hair.
The appearance features mainly comprise three kinds of content: 1. color information, such as the hair color, skin color and lip color of the face; 2. material information, i.e. the texture of the hair and skin, such as the smoothness of the skin; 3. illumination information, i.e. the influence of lighting conditions on the brightness of the face, such as the intensity of the light and the variation of shadows. In some cases these factors influence one another; for example, a change in illumination may affect the appearance of the skin color, so the appearance features do not draw a sharp division between them.
For each local block, the local decoupling module extracts its geometric and appearance information and then fuses them to generate a local feature map. Accordingly, the local decoupling module comprises a geometric encoder and an appearance encoder, which acquire the geometric and appearance features respectively.
1a) Geometric encoder: a sketch is a monochromatic outline of a real image, from which geometric information can be extracted directly, so an autoencoder network extracts the geometric information straight from the input sketch. Extracting geometric features directly from real images is harder. When a real face part image is used as input, an intuitive method is to convert the real image into a sketch with a pre-trained image-to-sketch network and then feed the generated sketch to the sketch geometric encoder.
To simplify the network, this embodiment proposes a unified method for extracting geometric information from both the sketch and the real image, achieved by training two autoencoders: a sketch encoder E_s for sketches and an image encoder E_i for images. In this embodiment, the hidden distribution of the image space is aligned with the hidden distribution of the sketch space, so that only geometric information is encoded.
The network composed of the sketch encoder E_s and a sketch decoder D_s is first trained to generate the intermediate features of the sketch, as shown in fig. 2. To preserve the necessary spatial information, the hidden space at the bottleneck layer is not a flat vector but a low-resolution feature map of dimension H × W × C, where H, W and C are the height, width and number of channels of the geometric feature map. The input and output of the network are sketches, which can be edge maps extracted from images or hand-drawn sketches. For hand-drawn sketches, particularly incomplete sketches produced during drawing, sketch manifold projection is used in preprocessing to improve the robustness of the system.
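The spatial bottleneck described above can be illustrated by its shape flow alone. The sketch below stands in for the convolutional encoder with plain average pooling and tiles a fake channel dimension; the stage count and channel width are assumptions, and only the point that the latent is an H × W × C map rather than a flat vector is taken from the text.

```python
import numpy as np

def avg_pool2x(x):
    """2x2 average pooling over an (H, W, C) array — a stand-in for
    one stride-2 convolutional stage of the real encoder."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def encode_to_spatial_latent(sketch, num_stages=4, channels=64):
    """Downsample a 1-channel sketch to a low-resolution H x W x C latent.

    A learned encoder would produce real feature channels; here the
    single channel is simply tiled so that only the shape flow is shown.
    """
    x = sketch
    for _ in range(num_stages):
        x = avg_pool2x(x)
    return np.repeat(x, channels, axis=2)  # fake C feature channels

sketch = np.zeros((512, 512, 1), dtype=np.float32)
z = encode_to_spatial_latent(sketch)
# z keeps spatial structure: a 32 x 32 x 64 map, not a flattened vector
```

Keeping the latent spatial is what lets the decoder constraints later compare feature maps layer by layer.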
Let I denote a real image and S its corresponding sketch. The pre-trained E_s extracts the geometric features of S, given by z_s = E_s(S).
The encoder E_i then needs to be trained to map the corresponding image I into the geometric hidden space of the sketch representation, denoted as z_i = E_i(I).
To ensure that z_s and z_i follow the same distribution, when z_s and z_i are fed into the pre-trained decoder D_s, the algorithm imposes constraints on each layer of D_s, and also adds an L1 loss between the final outputs D_s(z_s) and D_s(z_i).
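The per-layer constraint plus the final-output L1 term can be written as one sum over corresponding decoder activations. A minimal sketch with placeholder feature lists (the decoder itself is not reimplemented here):

```python
import numpy as np

def geometry_alignment_loss(feats_from_image, feats_from_sketch):
    """L1 distance summed over corresponding decoder activations.

    feats_from_image[n] is the decoder's n-th feature map when fed the
    image latent z_i; feats_from_sketch[n] is the same for the sketch
    latent z_s. Index 0 is the input feature map and the last index is
    the decoded output, so the final-output L1 term is part of the sum.
    """
    assert len(feats_from_image) == len(feats_from_sketch)
    return sum(
        np.abs(fi - fs).mean()
        for fi, fs in zip(feats_from_image, feats_from_sketch)
    )
```

Minimizing this drives the image latent to behave, at every decoder layer, like a sketch latent.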
1b) Appearance encoder: appearance is another important attribute of facial images. The mapping between the geometric sketch of the face and the real face image is one-to-many and by specifying the appearance, this ambiguity can be resolved.
This embodiment uses an appearance encoder E_a to extract appearance features. E_a applies global average pooling (i.e., averaging over all spatial positions of each feature channel) to remove spatial information and extract appearance features that are independent of the geometric features.
Since the appearance features are extracted for local regions, deleting the spatial information does not cause a significant loss of useful information. The interpolation experiment on face appearance and geometric features, shown in fig. 6, demonstrates that E_a can learn a continuous face appearance space.
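Global average pooling, as used by the appearance encoder, reduces each feature channel to a single number, discarding where things are while keeping what they look like:

```python
import numpy as np

def global_average_pool(feature_map):
    """Average each channel of an (H, W, C) feature map over all
    spatial positions, returning a (C,) appearance code that carries
    no spatial information."""
    return feature_map.mean(axis=(0, 1))

feat = np.random.rand(32, 32, 256)
appearance = global_average_pool(feat)   # shape (256,)
shifted = np.roll(feat, 5, axis=1)       # move content spatially
# The code is invariant to where features sit in the map:
assert np.allclose(appearance, global_average_pool(shifted))
```

This spatial invariance is exactly why the pooled code cannot leak geometric (positional) information.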
2) Image generation: and the local decoupling module generates local images and local intermediate feature images corresponding to all parts of the human face based on the geometric features and the appearance features.
The local decoupling module further comprises an image synthesis generator G, which combines the geometric and appearance features taken from one image or from two images (one providing the geometric features, the other the appearance features) to obtain the converted local image and the intermediate feature map.
Image synthesis generator: independent geometric and appearance features are input to generate a reconstruction or a geometry-appearance swapped result. To control the appearance of the generated face image, this embodiment employs adaptive instance normalization (AdaIN) in the face image synthesis generation network.
The image synthesis generator in this embodiment comprises 4 residual blocks and 4 upsampling layers: first, the appearance features are injected into each residual block; then, feature maps are obtained through 4 upsampling operations, yielding a final feature map with the same resolution as the input image and 64 channels; finally, an image consistent with the input geometric and appearance features is generated by a convolutional layer.
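Adaptive instance normalization, which injects the appearance into each residual block, renormalizes every channel of the content feature map to a scale and shift derived from the appearance code. A minimal per-channel sketch; in the real generator the scale/shift come from a learned projection of the appearance features, whereas here they are supplied directly (an assumption for illustration):

```python
import numpy as np

def adain(content, scale, shift, eps=1e-5):
    """Adaptive instance normalization on an (H, W, C) feature map.

    Each channel is whitened to zero mean / unit variance, then
    re-styled with the per-channel scale and shift that, in the full
    model, would be predicted from the appearance features.
    """
    mean = content.mean(axis=(0, 1), keepdims=True)
    std = content.std(axis=(0, 1), keepdims=True)
    normalized = (content - mean) / (std + eps)
    return normalized * scale + shift

x = np.random.rand(8, 8, 4)
y = adain(x, scale=np.full(4, 2.0), shift=np.full(4, 0.5))
```

After AdaIN, every channel's statistics match the injected style, which is how the appearance code controls color and texture without touching spatial structure.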
3) And (3) global fusion: and fusing the local intermediate feature images corresponding to all parts of the human face through the trained global fusion module to generate a final human face image with the geometric features and the appearance features.
To convert the local feature maps into a complete and natural facial image, one feasible approach is to directly stitch the local image blocks generated by the local decoupling modules. However, this intuitive approach tends to produce artifacts at the boundaries of the local partitions.
In this embodiment, the local image blocks are therefore not combined directly; instead, the intermediate feature maps generated by the local decoupling modules are fed into the image generation network for combination, so that the network integrates more information flow and generates a high-quality image.
The global fusion module in this example comprises three units: an encoder, residual blocks and a decoder. Starting from the feature map of the given background part, the corresponding blocks in the background feature map are replaced by the feature maps generated for the other parts, in the order mouth, nose, left eye and right eye, which reduces the influence of overlap between blocks. The fused feature map is then fed into the global fusion module to generate a brand-new face with the specified input appearance and geometric features.
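The feature-map replacement above can be sketched as pasting each part's feature map into the background feature map at its fixed location, in the stated order. The paste coordinates below are illustrative assumptions; only the ordering (mouth, nose, left eye, right eye — later pastes win where regions overlap) is taken from the text.

```python
import numpy as np

# Illustrative paste positions (top, left) in a 64x64 background feature
# map; the method's actual coordinates are fixed per part but not given.
PASTE_AT = {
    "mouth": (41, 21),
    "nose": (29, 22),
    "left_eye": (22, 13),
    "right_eye": (22, 34),
}

def fuse_feature_maps(background, part_feats):
    """Overwrite regions of the background feature map with the part
    feature maps, pasting in the order mouth -> nose -> left eye ->
    right eye so that later parts take precedence where they overlap."""
    fused = background.copy()
    for name in ("mouth", "nose", "left_eye", "right_eye"):
        top, left = PASTE_AT[name]
        fm = part_feats[name]
        h, w = fm.shape[:2]
        fused[top:top + h, left:left + w] = fm
    return fused

bg = np.zeros((64, 64, 8), dtype=np.float32)
parts = {n: np.full((16, 16, 8), i + 1, dtype=np.float32)
         for i, n in enumerate(("mouth", "nose", "left_eye", "right_eye"))}
fused = fuse_feature_maps(bg, parts)
```

The fused map, not the pasted pixels, is what the fusion network then decodes, which is why the seams can be smoothed away.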
In this example, the whole network framework is trained step by step, the local decoupling module is trained first, then the parameters of the local decoupling module are fixed, and the global fusion module is trained.
A-1) Training data set: this embodiment requires a large-scale paired sketch-image data set to train the network. Meanwhile, the sketches in the sketch-image pairs must be of high quality, similar to hand-drawn sketches. Conventional edge extraction techniques, such as HED and Canny, typically fail to produce an ideal edge map. Therefore, in this embodiment, the edge image is obtained using the photocopy filter in Photoshop and then simplified to obtain the sketch; the training data set is constructed from CelebA-HQ, with the resolution of both images and sketches set to 512 × 512.
A-2) decoupling training, wherein the training process of a local decoupling module comprises three steps:
first, trainAndthe geometric hidden space of the sketch is learned using the L1 reconstruction loss function. Once the cover is closedAfter training, the geometric characteristics can be expressed as。
Then, the network E_i is trained to take a real image I as input and predict geometric features z_i = E_i(I), so that z_i follows the same distribution as the learned geometric space. The loss function is defined as follows:
L_geo = Σ_{n=0}^{N} || D_s^n(z_i) − D_s^n(z_s) ||_1
where N = 7 is the number of layers of the decoder D_s; index 0 corresponds to the input feature map, index N to the output image, and the other indices to intermediate feature maps. In this embodiment, the weights of E_s and D_s are fixed while the parameters of E_i are optimized.
The geometric features z_s of a sketch or z_i of a real image are fed in at random during training. In the following, z_s and z_i are both written as z, without distinguishing their origin.
In this embodiment, the appearance encoder and the image synthesis generator adopt an exchange training strategy, and a cycle consistency loss term is used to decouple the appearance and geometric structure of the real face image; a multi-scale discriminator and an adversarial loss are also employed to ensure the realism of the generated images.
The exchange training strategy in this example is as follows: given two images I_1 and I_2 in the training set (I_1 is a real image or a sketch, serving as the face geometric feature image; I_2 is a real image, serving as the face appearance feature image), as shown in fig. 1, geometric features z_1 and z_2 are extracted from I_1 and I_2 with the pre-trained E_s or E_i, and appearance features a_1 and a_2 are extracted with the appearance encoder E_a.
By exchanging the geometric features of I_2 for the geometric features of I_1, the geometric features z_1 of I_1 and the appearance features a_2 of I_2 are used to generate an image I_{1→2}, i.e. I_{1→2} = G(z_1, a_2).
Using the appearance features a_1 of I_1 and the geometric features of I_{1→2}, the image is cyclically reconstructed, i.e. I_1^cyc = G(E(I_{1→2}), a_1), where E denotes the geometric encoder.
The present embodiment also includes a self-reconstruction loss: when the geometric and appearance features are taken from the same image (e.g. both from I_1) as input, the image can be reconstructed from its own geometric and appearance features; the reconstruction can be expressed as I_1^rec = G(z_1, a_1).
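The swap and cycle bookkeeping above can be checked on a toy decoupling in which "geometry" is the zero-mean part of an image and "appearance" is its per-channel mean — a deliberately simplistic assumption, not the patent's networks. Under it the cycle reconstruction is exact, which is precisely the property the cycle-consistency constraint asks of the learned networks:

```python
import numpy as np

# Toy stand-ins for the geometric encoder E, appearance encoder E_a and
# generator G: geometry = zero-mean residual, appearance = channel mean.
def extract_geometry(img):
    return img - img.mean(axis=(0, 1), keepdims=True)

def extract_appearance(img):
    return img.mean(axis=(0, 1), keepdims=True)

def generate(geometry, appearance):
    return geometry + appearance

rng = np.random.default_rng(0)
I1 = rng.random((16, 16, 3))   # provides geometry
I2 = rng.random((16, 16, 3))   # provides appearance

# Swap: geometry of I1 combined with appearance of I2.
I_swap = generate(extract_geometry(I1), extract_appearance(I2))
# Cycle: geometry of the swapped image plus the appearance of I1
# should reconstruct I1.
I_cyc = generate(extract_geometry(I_swap), extract_appearance(I1))
# Self-reconstruction: both features taken from the same image.
I_rec = generate(extract_geometry(I1), extract_appearance(I1))
```

In training, the same identities are enforced only approximately, through the cycle and self-reconstruction loss terms.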
In this embodiment, the following loss function is used to train the local decoupling module:
self-weight building loss:
When the geometric and appearance features come from the same image, i.e. I_1 = I_2, the self-reconstruction consistency of the algorithm requires that the image I can be reconstructed through the network framework.
The self-reconstruction loss function contains three terms: 1) a perceptual loss L_perc, which measures the visual similarity between the generated image and the input image through a pre-trained VGG-19 model; 2) a feature matching loss L_fm of the discriminator, which stabilizes the training process; 3) a color loss L_color, which converts the image to the CIE-Lab color space and controls hue by computing the chroma distance in the a and b channels. The self-reconstruction loss can be expressed as:
L_rec = λ_perc L_perc + λ_fm L_fm + λ_color L_color
where the a and b channels contain the color information in the CIE-Lab color space, and the weights λ_perc, λ_fm and λ_color are set empirically.
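The color term can be sketched as an L1 chroma distance over the a and b channels. The images are assumed to be already converted to CIE-Lab layout [L, a, b] (the RGB-to-Lab conversion itself is omitted), and the loss deliberately ignores the L (lightness) channel so that it constrains hue rather than brightness:

```python
import numpy as np

def color_loss(lab_generated, lab_target):
    """Mean L1 distance over the a and b (chroma) channels of two
    (H, W, 3) images laid out as [L, a, b] in CIE-Lab."""
    return np.abs(lab_generated[..., 1:] - lab_target[..., 1:]).mean()

gen = np.zeros((4, 4, 3))
tgt = np.zeros((4, 4, 3))
tgt[..., 0] = 50.0                    # lightness differs ...
assert color_loss(gen, tgt) == 0.0    # ... but the chroma loss ignores it
tgt[..., 1] = 2.0                     # shift the a channel
```

Separating chroma from lightness this way lets the perceptual term govern structure and shading while the color term governs hue.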
Cycle exchange loss:
To fully decouple the geometric and appearance features, the present embodiment uses a swapping method to generate a face image from the geometric and appearance features of different images, i.e. I_1 ≠ I_2. The cycle exchange loss contains a geometric term and a cyclic reconstruction term.
After the appearance of I_1 is replaced by that of I_2 to obtain I_{1→2}, the geometry of I_1 should be kept. Thus, the algorithm introduces a geometry loss that constrains the geometry of the generated image to be invariant by comparing it with the input image:
L_geo-swap = L_geo(I_{1→2}, I_1)
The network uses a cyclic consistency loss term to ensure that the swapped image I_{1→2} keeps the same appearance as I_2. Using the geometry of I_{1→2} and the appearance of I_1, the generated image I_1^cyc should cyclically reconstruct the image I_1. This embodiment uses the reconstruction loss formula above as the constraint to achieve cycle consistency:
L_cyc = L_rec(I_1^cyc, I_1)
The cycle exchange loss is:
L_swap = L_geo-swap + L_cyc
Adversarial loss:
In this embodiment, a multi-scale discriminator D is used to constrain the distribution of the generated images to match the distribution of the real images:
L_adv = E_I [log D(I)] + E_{z,a} [log(1 − D(G(z, a)))]
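A multi-scale discriminator runs the same discriminator architecture on an image pyramid, judging realism at several resolutions. A sketch with a placeholder per-scale score function — the pyramid depth and the dummy scorer are assumptions for illustration, not the patent's architecture:

```python
import numpy as np

def downsample2x(img):
    """2x2 average pooling, producing the next pyramid level."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def multiscale_scores(img, score_fn, num_scales=3):
    """Apply a discriminator score function at each pyramid scale,
    so coarse scales judge global layout and fine scales judge detail."""
    scores = []
    for _ in range(num_scales):
        scores.append(score_fn(img))
        img = downsample2x(img)
    return scores

def dummy_score(img):
    # Placeholder for a real patch discriminator: mean intensity as
    # a stand-in "realness" score.
    return float(img.mean())

image = np.ones((64, 64, 3))
scores = multiscale_scores(image, dummy_score)
```

The adversarial loss is then accumulated over all per-scale scores rather than a single output.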
The optimization objective L_total of the local decoupling module is the sum of the above 3 terms; minimizing L_total optimizes the following 3 networks: the appearance encoder E_a, the generator G and the discriminator D (the discriminator being trained adversarially). L_total can be expressed as:
L_total = L_rec + L_swap + L_adv
A-3) Global fusion training: after the local decoupling modules are trained, the global fusion module needs to be trained to fuse the feature maps generated by the local decoupling modules and produce the final result. Similar to the previous stage, the adversarial loss, feature matching loss and perceptual loss are used as the loss functions of the global fusion module; a swapping strategy is not used here because no decoupling operation is involved.
The embodiment also provides an intelligent face editing device comprising a feature extraction unit, an image generation unit, and an image fusion unit. The feature extraction unit inputs the geometric feature image and the appearance feature image of the face into the corresponding trained local decoupling modules according to face part, and extracts the geometric features and appearance features corresponding to each part of the face; the image generation unit has the local decoupling modules generate the local images and local intermediate feature images corresponding to each part of the face based on the geometric and appearance features; the image fusion unit fuses the local intermediate feature images corresponding to all parts of the face through the trained global fusion module to generate the final face image with the desired geometric and appearance features.
The present embodiment also provides a storage medium having stored thereon a computer program executable by a processor, the computer program, when executed, implementing the steps of the intelligent face editing method in the present embodiment.
The present embodiment also provides a computer device having a memory and a processor, the memory storing thereon a computer program executable by the processor, the computer program, when executed, implementing the steps of the intelligent face editing method in the present embodiment.
Claims (10)
1. An intelligent face editing method is characterized in that:
inputting the geometric feature image and the appearance feature image of the face into corresponding trained local decoupling modules according to the face parts, and extracting the corresponding geometric features and appearance features of each part of the face;
the local decoupling module generates local images and local intermediate feature images corresponding to all parts of the human face based on the geometric features and the appearance features;
and fusing the local intermediate feature images corresponding to all parts of the human face through the trained global fusion module to generate a final human face image with the geometric features and the appearance features.
2. The intelligent face editing method according to claim 1, characterized in that: the geometric feature image is a face sketch or a real face image, and the local decoupling module comprises a sketch encoder, an image encoder, an appearance encoder, and an image synthesis generator;
The extracting of the geometric features corresponding to each part of the human face from the geometric feature image includes:
training the sketch encoder so that the hidden space at its bottleneck layer is a low-resolution geometric feature map, whose dimensions are the height, width, and channel number of the geometric feature map;
4. The intelligent face editing method according to claim 3, wherein the training of the local decoupling module comprises:
first, the sketch encoder and its corresponding decoder are trained to learn the geometric hidden space of the sketch using the L1 reconstruction loss function; once the sketch encoder is trained, the geometric features are expressed by its output;
the image encoder is then trained, taking a real face image as input and predicting geometric features that follow the same distribution as the learned geometric hidden space;
5. The intelligent face editing method according to claim 2, 3 or 4, characterized in that: the appearance encoder eliminates the spatial information of the face appearance feature image by global average pooling and extracts appearance features that are independent of the geometric features.
6. The intelligent face editing method of claim 5, wherein the appearance encoder and the image synthesis generator employ an exchange training strategy;
using the geometric features of a face geometric image and the appearance features of a face appearance feature image to generate an image;
7. The intelligent face editing method of claim 6, wherein the local decoupling module is trained using the following loss functions:
a. self-reconstruction loss:
wherein the first term is the perceptual loss; the second term is the feature-matching loss of the discriminator; the third term is the color loss, which converts the image to the CIE-Lab color space and controls hue by computing the chrominance distance in the a and b channels; the a and b channels contain the color information in the CIE-Lab color space; the weights of the three terms are set empirically;
b. cycle-swap loss:
c. adversarial loss:
the distribution of the generated image is limited using a multi-scale discriminator D to match the distribution of the real image:
8. An intelligent face editing device, comprising:
the characteristic extraction unit is used for inputting the human face geometric characteristic image and the human face appearance characteristic image into corresponding trained local decoupling modules according to human face parts and extracting the geometric characteristics and appearance characteristics corresponding to all parts of the human face;
the image generation unit is used for generating local images and local intermediate feature images corresponding to all parts of the human face by the local decoupling module based on the geometric features and the appearance features;
and the image fusion unit is used for fusing the local intermediate feature images corresponding to all parts of the human face through the trained global fusion module to generate a final human face image with the geometric features and the appearance features.
9. A storage medium having stored thereon a computer program executable by a processor, characterized in that: the computer program, when executed, implements the steps of the intelligent face editing method of any one of claims 1 to 7.
10. A computer device having a memory and a processor, the memory having stored thereon a computer program executable by the processor, characterized in that: the computer program, when executed, implements the steps of the intelligent face editing method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110466411.XA CN112991484B (en) | 2021-04-28 | 2021-04-28 | Intelligent face editing method and device, storage medium and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112991484A true CN112991484A (en) | 2021-06-18 |
CN112991484B CN112991484B (en) | 2021-09-03 |
Family
ID=76340521
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110466411.XA Active CN112991484B (en) | 2021-04-28 | 2021-04-28 | Intelligent face editing method and device, storage medium and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112991484B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113470182A (en) * | 2021-09-03 | 2021-10-01 | 中科计算技术创新研究院 | Face geometric feature editing method and deep face remodeling editing method |
CN114845067A (en) * | 2022-07-04 | 2022-08-02 | 中科计算技术创新研究院 | Hidden space decoupling-based depth video propagation method for face editing |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111160138A (en) * | 2019-12-11 | 2020-05-15 | 杭州电子科技大学 | Fast face exchange method based on convolutional neural network |
CN111915693A (en) * | 2020-05-22 | 2020-11-10 | 中国科学院计算技术研究所 | Sketch-based face image generation method and system |
CN112188234A (en) * | 2019-07-03 | 2021-01-05 | 广州虎牙科技有限公司 | Image processing and live broadcasting method and related device |
CN112241708A (en) * | 2020-10-19 | 2021-01-19 | 戴姆勒股份公司 | Method and apparatus for generating new person image from original person image |
CN112258387A (en) * | 2020-10-30 | 2021-01-22 | 北京航空航天大学 | Image conversion system and method for generating cartoon portrait based on face photo |
CN112668401A (en) * | 2020-12-09 | 2021-04-16 | 中国科学院信息工程研究所 | Face privacy protection method and device based on feature decoupling |
CN112734890A (en) * | 2020-12-22 | 2021-04-30 | 上海影谱科技有限公司 | Human face replacement method and device based on three-dimensional reconstruction |
CN112837210A (en) * | 2021-01-28 | 2021-05-25 | 南京大学 | Multi-form-style face cartoon automatic generation method based on feature image blocks |
Non-Patent Citations (1)
Title |
---|
SHUYU CHEN,ETC: "DeepFaceDrawing: Deep Generation of Face Images from Sketches", 《ACM TRANSACTIONS ON GRAPHICS》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information |

Address after: 12 / F, building 4, 108 Xiangyuan Road, Gongshu District, Hangzhou City, Zhejiang Province 310015 Applicant after: Zhongke Computing Technology Innovation Research Institute Address before: 12 / F, building 4, 108 Xiangyuan Road, Gongshu District, Hangzhou City, Zhejiang Province 310015 Applicant before: Institute of digital economy industry, Institute of computing technology, Chinese Academy of Sciences

GR01 | Patent grant | ||