CN108288072A - Facial expression synthesis method based on a generative adversarial network - Google Patents
Facial expression synthesis method based on a generative adversarial network
- Publication number
- CN108288072A CN108288072A CN201810078963.1A CN201810078963A CN108288072A CN 108288072 A CN108288072 A CN 108288072A CN 201810078963 A CN201810078963 A CN 201810078963A CN 108288072 A CN108288072 A CN 108288072A
- Authority
- CN
- China
- Prior art keywords
- expression
- facial
- face
- facial expression
- loss
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G06V10/7553 — Deformable models or variational models based on shape, e.g. active shape models [ASM] (under G06V10/75, organisation of the matching processes)
- G06V10/757 — Matching configurations of points or features (under G06V10/74, image or video pattern matching)
- G06V40/171 — Local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships (under G06V40/16, human faces)
Abstract
Proposed in the present invention is a facial expression synthesis method based on a generative adversarial network. Its main contents include geometry-guided facial expression synthesis and facial geometric operations. The process is as follows: first, given a heat map of the target facial expression and a neutral frontal face, a new face image is synthesized accordingly; then all loss functions are weighted and summed to obtain the total loss function; next, facial expression editing is guided by the geometric positions of a set of fiducial points, and the facial expression transfer result is obtained with the expression synthesis model; finally, facial expression interpolation is performed by linearly adjusting the values of the shape parameters. Using a geometry-guided generative adversarial network, the invention can generate photorealistic images of different expressions from a single image, exercise fine-grained control over the synthesized image, and easily perform facial expression transfer and interpolation, realizing facial expression transfer and cross-expression recognition.
Description
Technical Field
The invention relates to the field of facial expression synthesis, and in particular to a facial expression synthesis method based on a generative adversarial network.
Background
The human face plays a very important role in conveying information during human communication, expressing emotional and mental states. In recent years, automatic processing of facial expressions by computers has become a hot research topic in computer vision, computer graphics, pattern recognition, and related fields, with wide application prospects in video conferencing, film production, intelligent human-computer interfaces, and more. Facial expression processing includes facial expression recognition and facial expression synthesis. Facial expression synthesis makes devices more convenient to use: if it enables a computer to generate fine, vivid facial expression animations, it can further increase the enjoyment of human-computer interaction and create a better interactive atmosphere. It can also be applied to character animation in films, games, and advertisements, where combining expression synthesis with data-driven or parameter-driven methods can greatly reduce production costs and improve efficiency. Reconstructing and synthesizing the face of a criminal suspect can provide key clues for case investigation and pursuit; expression synthesis can also make traditional computer-aided instruction more vivid and interesting, further improving students' enthusiasm for learning. However, conventional methods use variational autoencoders which, although able to generate high-resolution realistic images, are computationally very complex, and images generated by deep generative models often lack detail, are blurred, or have low resolution.
The invention provides a facial expression synthesis method based on a generative adversarial network, which comprises: first, given a heat map of a target facial expression and a neutral frontal face, correspondingly synthesizing a new face image; then performing a weighted summation of all loss functions to obtain a total loss function; then adopting the geometric positions of a set of fiducial points to guide facial expression editing and obtaining a facial expression transfer result with the expression synthesis model; and finally performing facial expression interpolation by linearly adjusting the values of the shape parameters. The invention uses a geometry-guided generative adversarial network to generate vivid images with different expressions from a single image, performs fine-grained control over the synthesized image, and can also easily carry out facial expression transfer and interpolation, realizing facial expression transfer and cross-expression recognition.
Disclosure of Invention
The invention aims to provide a facial expression synthesis method based on a generative adversarial network, in order to solve problems such as image blurring and low resolution.
In order to solve the above problems, the present invention provides a facial expression synthesis method based on a generative adversarial network, which mainly comprises:
geometrically guided facial expression synthesis;
and (II) geometric operation of the face.
In the geometry-guided facial expression synthesis, as in an Active Appearance Model (AAM), the face geometry is defined by a set of fiducial points; a heat map is used to encode the locations of the facial fiducials and provides a per-pixel likelihood for each fiducial location; given a heat map of the target facial expression and a frontal face without expression (hereinafter, a neutral face), a new face image (an expressive face) is synthesized accordingly;
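The patent does not disclose how the heat map is rendered from the fiducial points; a common construction (an assumption here) is one Gaussian channel per fiducial, so each channel peaks at its point's location. A minimal numpy sketch:

```python
import numpy as np

def fiducial_heatmap(points, size=64, sigma=1.5):
    """Render K fiducial points as a K-channel Gaussian heat map.

    Each channel encodes a per-pixel likelihood for one fiducial's
    location, in the spirit of the geometry heat map H_E.
    points: (K, 2) array of (x, y) coordinates in pixel units.
    """
    ys, xs = np.mgrid[0:size, 0:size]
    maps = np.empty((len(points), size, size))
    for k, (px, py) in enumerate(points):
        d2 = (xs - px) ** 2 + (ys - py) ** 2
        maps[k] = np.exp(-d2 / (2.0 * sigma ** 2))  # peak 1.0 at the point
    return maps

hm = fiducial_heatmap(np.array([[10.0, 20.0], [40.0, 40.0]]))
```

The Gaussian width `sigma` trades localization sharpness against gradient smoothness; both the kernel and its width are illustrative choices, not values from the patent.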
given a pair of generators G_E: (I_N, H_E) → I_E and G_N: (I_E, H_E) → I_N, where I_N is a neutral face, I_E is an expressive face, and H_E is the heat map corresponding to I_E; the two discriminators associated with these two generators are D_E and D_N, which distinguish real triples (I, H, I′) from generated triples (I, H, G(I, H)); I and I′ are the neutral and expressive face images, or vice versa, depending on the generator;
H_E serves as an auxiliary condition in both face-editing modes, namely expression control during synthesis and expression removal; during expression synthesis, H_E specifies the target expression, so that G_E can convert the neutral face I_N into the desired expression; during expression removal, H_E guides the recovery of I_N from I_E;
the losses of the geometry-guided facial expression synthesis comprise an adversarial loss, a pixel loss, a cycle-consistency loss, and an identity-retention loss; the weighted sum of these four loss functions is the total loss function.
Further, for the adversarial loss and pixel loss: since the proposed face-editing model generates results conditioned on the input face image and heat map, a generative adversarial network (GAN) is applied in this conditional setting; the adversarial losses of the discriminator and the generator are given respectively as follows:
L_D-adv = −E_(I,H,I′)~P(I,H,I′) [log D(I, H, I′)] − E_(I,H)~P(I,H) [log(1 − D(I, H, G(I, H)))]   (1)
L_G-adv = −E_(I,H)~P(I,H) [log D(I, H, G(I, H))]   (2)
The generator is not only tasked with deceiving the discriminator, but also with synthesizing an image as close as possible to the ground truth; the per-pixel loss L_pixel forces the transformed face image to have a small distance to the ground-truth data in the original pixel space; L_pixel takes the following form:
L_pixel = E_(I,H,I′)~P(I,H,I′) ‖I′ − G(I, H)‖₁   (3)
The L1 distance is used because it discourages blurred output; (I, H, I′) is (I_N, H_E, I_E) or (I_E, H_E, I_N), depending on the generator.
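The per-pixel term of equation (3) is just a mean absolute difference over corresponding pixels; a one-function numpy sketch (a Monte-Carlo estimate over a batch, not the patent's implementation):

```python
import numpy as np

def l1_pixel_loss(target, generated):
    """Estimate of equation (3): mean absolute per-pixel difference
    between the ground-truth face I' and the generated face G(I, H)."""
    return float(np.mean(np.abs(target - generated)))

# Toy check: two images differing by a constant 0.5 everywhere.
loss = l1_pixel_loss(np.full((8, 8), 1.0), np.full((8, 8), 0.5))
```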
Further, for the cycle-consistency loss: the generators G_E and G_N construct a complete mapping cycle between the neutral face and the expressive face; if a face image is converted from a neutral expression to an angry expression and then converted back to a neutral expression, the same face image should ideally be obtained; therefore, an extra cycle-consistency loss L_cyc is introduced to ensure the consistency of the source image with the reconstructed image, e.g. I_N with G_N(G_E(I_N, H_E), H_E) and I_E with G_E(G_N(I_E, H_E), H_E); L_cyc is calculated as follows:
L_cyc = E_(I,H)~P(I,H) ‖I − G′(G(I, H), H)‖₁   (4)
where G′ is the generator opposite to G; if G is used to convert the neutral expression into the expression specified by the facial-geometry heat map H, then G′ is used to restore the neutral expression with the help of H.
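Equation (4) can be exercised with toy stand-ins for the two generators. Here "adding the expression" is modelled as simply adding the heat map, so G′ is an exact inverse and the cycle loss vanishes; the generators are illustrative placeholders, not the patent's networks:

```python
import numpy as np

def cycle_consistency_loss(I, H, G, G_prime):
    """Equation (4): mean |I - G'(G(I, H), H)| over a batch.

    G maps a face to the expression encoded by H; G' is the
    opposite generator that restores the original face.
    """
    reconstructed = G_prime(G(I, H), H)
    return float(np.mean(np.abs(I - reconstructed)))

# Toy invertible "generators": add / subtract the heat map.
G = lambda I, H: I + H
G_prime = lambda I, H: I - H

rng = np.random.default_rng(0)
I = rng.random((4, 8, 8))   # batch of toy images
H = rng.random((4, 8, 8))   # toy heat maps
loss = cycle_consistency_loss(I, H, G, G_prime)
```

In training, a nonzero L_cyc penalizes any information the round trip fails to restore.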
Further, for the identity-retention loss: facial expression editing should preserve facial identity features after expression synthesis and removal; thus, an identity-retention term is employed to enforce identity consistency:
L_identity = E_(I,H)~P(I,H) ‖F(I) − F(G(I, H))‖₁   (5)
where F is a feature extractor for face recognition; a lightweight convolutional neural network (Light CNN) model is adopted as the feature-extraction network, comprising 9 convolutional layers, 4 max-pooling layers, and 1 fully connected layer; the Light CNN is pre-trained as a classifier that can distinguish a large number of identities, so that it captures the features most salient for face recognition; with this loss, the face identity can thus be retained through the face-editing process.
Further, for the total loss function: the final complete objective of the generators G_N and G_E is a weighted sum of all the losses defined above; L_G-adv removes the distribution difference between real and generated samples, L_pixel ensures pixel-level correctness, L_cyc ensures the cycle consistency of the reconstructed image with the source image, and L_identity preserves identity through the mapping process; the total loss function is therefore:
L_G = L_G-adv + α₁ L_pixel + α₂ L_cyc + α₃ L_identity   (6)
where α₁, α₂, and α₃ are the loss weight coefficients.
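The total objective of equation (6) is a plain weighted sum of the four scalar loss terms. A minimal sketch; the default weight values are illustrative placeholders, since the patent does not disclose them:

```python
def total_generator_loss(l_adv, l_pixel, l_cyc, l_identity,
                         alpha1=10.0, alpha2=10.0, alpha3=1.0):
    """Equation (6): L_G = L_G-adv + a1*L_pixel + a2*L_cyc + a3*L_identity.

    The alpha defaults are assumptions for illustration only.
    """
    return l_adv + alpha1 * l_pixel + alpha2 * l_cyc + alpha3 * l_identity
```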
The facial geometric operations comprise facial expression editing, facial expression transfer, and facial expression synthesis and interpolation.
Further, facial expression editing is guided by the geometric positions of a set of fiducial points; since the human face has unique physiological structural characteristics, the correlations between fiducial positions are strong; changes in the face geometry should therefore be constrained to avoid implausible configurations; taking prior knowledge of the face-shape distribution into account, a parameterized shape model is built as a geometry generator, and a basic shape model is learned from annotated training images;
first, each face is normalized to the same scale according to the positions of the two eyes and rotated to the horizontal; Principal Component Analysis (PCA) is then applied to obtain a basic shape model of the K fiducial locations:
s(p) = s₀ + S p   (7)
where s₀ ∈ R^(2K×1), S ∈ R^(2K×N), and p ∈ R^(N×1); the basic shape s₀ is the mean shape over all training images, and the columns of S are the N eigenvectors corresponding to the N largest eigenvalues; different face geometries can be obtained by varying the value of the shape parameter p;
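The basic shape model of equation (7) is standard PCA on stacked fiducial coordinates. A self-contained numpy sketch under that reading (random data stands in for the annotated training shapes):

```python
import numpy as np

def fit_shape_model(shapes, n_components):
    """Learn the basic shape model of equation (7).

    shapes: (M, 2K) array; each row stacks the K (x, y) fiducials of
    one normalized training face.
    Returns the mean shape s0 (length 2K) and S (2K x N), whose
    columns are the eigenvectors of the N largest eigenvalues.
    """
    s0 = shapes.mean(axis=0)
    centered = shapes - s0
    cov = centered.T @ centered / len(shapes)
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return s0, eigvecs[:, order]

def shape_from_params(s0, S, p):
    """Equation (7): s(p) = s0 + S p."""
    return s0 + S @ p

rng = np.random.default_rng(0)
train = rng.normal(size=(50, 10))                 # 50 faces, K = 5 fiducials
s0, S = fit_shape_model(train, n_components=3)
neutral = shape_from_params(s0, S, np.zeros(3))   # p = 0 gives the mean shape
```

Because `eigh` returns orthonormal eigenvectors of the symmetric covariance, the columns of S form an orthonormal basis of the dominant shape variations.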
however, facial geometry is related not only to facial expression but also, to a large degree, to facial identity; facial geometry varies from person to person even under the same expression; for example, the distance between the eyes and the length of the nose depend largely on the identity of the face rather than the expression; based on these individual differences, an individual-specific shape model based on equation (7) is proposed, derived by replacing the mean shape s₀ with the neutral shape s₀^(i) of a specific individual; the individual-specific shape model is given by:
s^(i)(p) = s₀^(i) + S p   (8)
where s₀^(i) represents identity-related variation and p represents variation caused by facial expression.
Further, for facial expression transfer: given two expressive faces I_A and I_B, facial landmarks s_A and s_B are detected; the expression-removal model is first used to recover the neutral faces:
I_A^N = G_N(I_A, H_A),  I_B^N = G_N(I_B, H_B)   (9)
where I_A^N and I_B^N denote the neutral faces of I_A and I_B respectively; the neutral shapes s_A^N and s_B^N can then be obtained by facial landmark detection; the shape parameters are then derived by solving the following least-squares regression problem:
p_B = argmin_p ‖s_B − s_B^N − S p‖₂²   (10)
applying these shape parameters yields the transferred positions of the fiducial points:
ŝ_A = s_A^N + S p_B   (11)
the heat map is rendered from the transferred shape and concatenated with the corresponding neutral face as the input for expression synthesis; finally, the facial expression transfer result is obtained with the expression synthesis model:
I_A^B = G_E(I_A^N, Ĥ_A)   (12)
where the expression synthesis model is as represented above.
Further, for facial expression synthesis and interpolation: first, a neutral face image and shape parameters are prepared for the target expression; the neutral face is obtained with the proposed expression-removal model; shape parameters for a particular expression may be learned from the annotated training dataset with the basic shape model (as in equation (7)); once the values of the shape parameters are associated with certain semantic attributes (such as fear or surprise), they can be used to synthesize a facial expression of the desired semantic type; furthermore, facial expression interpolation may be performed by linearly adjusting the values of the shape parameters.
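Linearly adjusting the shape parameters amounts to scaling p between the neutral shape (t = 0) and the full target expression (t = 1) inside equation (7). A minimal sketch with illustrative toy values:

```python
import numpy as np

def interpolate_expression(s0, S, p_target, t):
    """Facial expression interpolation by linearly scaling the shape
    parameters: equation (7) evaluated at p = t * p_target."""
    return s0 + S @ (t * p_target)

# Toy shape model: 3 fiducials (2K = 6), 2 shape components.
s0 = np.arange(6, dtype=float)
S = np.eye(6)[:, :2]
p = np.array([2.0, 4.0])
halfway = interpolate_expression(s0, S, p, 0.5)   # intermediate expression
frames = [interpolate_expression(s0, S, p, t) for t in np.linspace(0, 1, 5)]
```

Rendering a heat map from each intermediate shape and feeding it to the synthesis generator would yield a smooth expression transition.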
Drawings
Fig. 1 is a system block diagram of a facial expression synthesis method based on generation of a confrontation network according to the present invention.
Fig. 2 is a flow chart of a facial expression synthesis method based on generation of a confrontation network according to the present invention.
Fig. 3 is an example of facial expression synthesis based on geometric guidance of a facial expression synthesis method of generating a confrontational network according to the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application can be combined with each other without conflict, and the present invention is further described in detail with reference to the drawings and specific embodiments.
Fig. 1 is a system block diagram of the facial expression synthesis method based on a generative adversarial network according to the present invention. It mainly comprises geometry-guided facial expression synthesis and facial geometric operations.
The losses of the geometry-guided facial expression synthesis comprise an adversarial loss, a pixel loss, a cycle-consistency loss, and an identity-retention loss; the weighted sum of these four loss functions is the total loss function.
For the adversarial loss and pixel loss: since the proposed face-editing model generates results conditioned on the input face image and heat map, a generative adversarial network (GAN) is applied in this conditional setting; the adversarial losses of the discriminator and the generator are given respectively as follows:
L_D-adv = −E_(I,H,I′)~P(I,H,I′) [log D(I, H, I′)] − E_(I,H)~P(I,H) [log(1 − D(I, H, G(I, H)))]   (1)
L_G-adv = −E_(I,H)~P(I,H) [log D(I, H, G(I, H))]   (2)
The generator is not only tasked with deceiving the discriminator, but also with synthesizing an image as close as possible to the ground truth; the per-pixel loss L_pixel forces the transformed face image to have a small distance to the ground-truth data in the original pixel space; L_pixel takes the following form:
L_pixel = E_(I,H,I′)~P(I,H,I′) ‖I′ − G(I, H)‖₁   (3)
The L1 distance is used because it discourages blurred output; (I, H, I′) is (I_N, H_E, I_E) or (I_E, H_E, I_N), depending on the generator.
For the cycle-consistency loss: the generators G_E and G_N construct a complete mapping cycle between the neutral face and the expressive face; if a face image is converted from a neutral expression to an angry expression and then converted back to a neutral expression, the same face image should ideally be obtained; therefore, an extra cycle-consistency loss L_cyc is introduced to ensure the consistency of the source image with the reconstructed image, e.g. I_N with G_N(G_E(I_N, H_E), H_E) and I_E with G_E(G_N(I_E, H_E), H_E); L_cyc is calculated as follows:
L_cyc = E_(I,H)~P(I,H) ‖I − G′(G(I, H), H)‖₁   (4)
where G′ is the generator opposite to G; if G is used to convert the neutral expression into the expression specified by the facial-geometry heat map H, then G′ is used to restore the neutral expression with the help of H.
For the identity-retention loss: facial expression editing should preserve facial identity features after expression synthesis and removal; thus, an identity-retention term is employed to enforce identity consistency:
L_identity = E_(I,H)~P(I,H) ‖F(I) − F(G(I, H))‖₁   (5)
where F is a feature extractor for face recognition; a lightweight convolutional neural network (Light CNN) model is adopted as the feature-extraction network, comprising 9 convolutional layers, 4 max-pooling layers, and 1 fully connected layer; the Light CNN is pre-trained as a classifier that can distinguish a large number of identities, so that it captures the features most salient for face recognition; with this loss, the face identity can thus be retained through the face-editing process.
For the total loss function: the final complete objective of the generators G_N and G_E is a weighted sum of all the losses defined above; L_G-adv removes the distribution difference between real and generated samples, L_pixel ensures pixel-level correctness, L_cyc ensures the cycle consistency of the reconstructed image with the source image, and L_identity preserves identity through the mapping process; the total loss function is therefore:
L_G = L_G-adv + α₁ L_pixel + α₂ L_cyc + α₃ L_identity   (6)
where α₁, α₂, and α₃ are the loss weight coefficients.
The facial geometric operations comprise facial expression editing, facial expression transfer, and facial expression synthesis and interpolation.
Facial expression editing is guided by the geometric positions of a set of fiducial points; since the human face has unique physiological structural characteristics, the correlations between fiducial positions are strong; changes in the face geometry should therefore be constrained to avoid implausible configurations; taking prior knowledge of the face-shape distribution into account, a parameterized shape model is built as a geometry generator, and a basic shape model is learned from annotated training images;
first, each face is normalized to the same scale according to the positions of the two eyes and rotated to the horizontal; Principal Component Analysis (PCA) is then applied to obtain a basic shape model of the K fiducial locations:
s(p) = s₀ + S p   (7)
where s₀ ∈ R^(2K×1), S ∈ R^(2K×N), and p ∈ R^(N×1); the basic shape s₀ is the mean shape over all training images, and the columns of S are the N eigenvectors corresponding to the N largest eigenvalues; different face geometries can be obtained by varying the value of the shape parameter p;
however, facial geometry is related not only to facial expression but also, to a large degree, to facial identity; facial geometry varies from person to person even under the same expression; for example, the distance between the eyes and the length of the nose depend largely on the identity of the face rather than the expression; based on these individual differences, an individual-specific shape model based on equation (7) is proposed, derived by replacing the mean shape s₀ with the neutral shape s₀^(i) of a specific individual; the individual-specific shape model is given by:
s^(i)(p) = s₀^(i) + S p   (8)
where s₀^(i) represents identity-related variation and p represents variation caused by facial expression.
For facial expression transfer: given two expressive faces I_A and I_B, facial landmarks s_A and s_B are detected; the expression-removal model is first used to recover the neutral faces:
I_A^N = G_N(I_A, H_A),  I_B^N = G_N(I_B, H_B)   (9)
where I_A^N and I_B^N denote the neutral faces of I_A and I_B respectively; the neutral shapes s_A^N and s_B^N can then be obtained by facial landmark detection; the shape parameters are then derived by solving the following least-squares regression problem:
p_B = argmin_p ‖s_B − s_B^N − S p‖₂²   (10)
applying these shape parameters yields the transferred positions of the fiducial points:
ŝ_A = s_A^N + S p_B   (11)
the heat map is rendered from the transferred shape and concatenated with the corresponding neutral face as the input for expression synthesis; finally, the facial expression transfer result is obtained with the expression synthesis model:
I_A^B = G_E(I_A^N, Ĥ_A)   (12)
where the expression synthesis model is as represented above.
For facial expression synthesis and interpolation: first, a neutral face image and shape parameters are prepared for the target expression; the neutral face is obtained with the proposed expression-removal model; shape parameters for a particular expression may be learned from the annotated training dataset with the basic shape model (as in equation (7)); once the values of the shape parameters are associated with certain semantic attributes (such as fear or surprise), they can be used to synthesize a facial expression of the desired semantic type; furthermore, facial expression interpolation may be performed by linearly adjusting the values of the shape parameters.
Fig. 2 is a flow chart of the facial expression synthesis method based on a generative adversarial network according to the present invention. The method first gives a heat map of the target facial expression and a neutral frontal face and correspondingly synthesizes a new face image, then performs a weighted summation of all loss functions to obtain a total loss function, guides facial expression editing with the geometric positions of a set of fiducial points, obtains a facial expression transfer result with the expression synthesis model, and finally performs facial expression interpolation by linearly adjusting the values of the shape parameters.
Fig. 3 is an example of geometry-guided facial expression synthesis of the facial expression synthesis method based on a generative adversarial network according to the present invention. As in an Active Appearance Model (AAM), the face geometry is defined by a set of fiducial points; a heat map is used to encode the locations of the facial fiducials and provides a per-pixel likelihood for each fiducial location; given a heat map of the target facial expression and a frontal face without expression (hereinafter, a neutral face), a new face image (an expressive face) is synthesized accordingly;
given a pair of generators G_E: (I_N, H_E) → I_E and G_N: (I_E, H_E) → I_N, where I_N is a neutral face, I_E is an expressive face, and H_E is the heat map corresponding to I_E; the two discriminators associated with these two generators are D_E and D_N, which distinguish real triples (I, H, I′) from generated triples (I, H, G(I, H)); I and I′ are the neutral and expressive face images, or vice versa, depending on the generator;
H_E serves as an auxiliary condition in both face-editing modes, namely expression control during synthesis and expression removal; during expression synthesis, H_E specifies the target expression, so that G_E can convert the neutral face I_N into the desired expression; during expression removal, H_E guides the recovery of I_N from I_E;
the losses of the geometry-guided facial expression synthesis comprise an adversarial loss, a pixel loss, a cycle-consistency loss, and an identity-retention loss; the weighted sum of these four loss functions is the total loss function.
As shown, the images in the first column are the input faces, and the remaining images are the input heat maps and the corresponding synthesis results.
It will be appreciated by persons skilled in the art that the invention is not limited to details of the foregoing embodiments and that the invention can be embodied in other specific forms without departing from the spirit or scope of the invention. In addition, various modifications and alterations of this invention may be made by those skilled in the art without departing from the spirit and scope of this invention, and such modifications and alterations should also be viewed as being within the scope of this invention. It is therefore intended that the following appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
Claims (10)
1. A facial expression synthesis method based on a generative adversarial network, characterized by mainly comprising: (I) geometry-guided facial expression synthesis; and (II) facial geometric operations.
2. The geometry-guided facial expression synthesis (I) according to claim 1, characterized in that, as in an Active Appearance Model (AAM), the face geometry is defined by a set of fiducial points; a heat map is used to encode the locations of the facial fiducials and provides a per-pixel likelihood for each fiducial location; given a heat map of the target facial expression and a frontal face without expression (hereinafter, a neutral face), a new face image (an expressive face) is synthesized accordingly;
given a pair of generators G_E: (I_N, H_E) → I_E and G_N: (I_E, H_E) → I_N, where I_N is a neutral face, I_E is an expressive face, and H_E is the heat map corresponding to I_E; the two discriminators associated with these two generators are D_E and D_N, which distinguish real triples (I, H, I′) from generated triples (I, H, G(I, H)); I and I′ are the neutral and expressive face images, or vice versa, depending on the generator;
H_E serves as an auxiliary condition in both face-editing modes, namely expression control during synthesis and expression removal; during expression synthesis, H_E specifies the target expression, so that G_E can convert the neutral face I_N into the desired expression; during expression removal, H_E guides the recovery of I_N from I_E;
the losses of the geometry-guided facial expression synthesis comprise an adversarial loss, a pixel loss, a cycle-consistency loss, and an identity-retention loss; the weighted sum of these four loss functions is the total loss function.
3. Antagonism and pixel loss based on claim 2, characterized in that the generative confrontation network (GAN) is applied to condition settings as a result of the proposed face editing model generating conditions on input face images and heat maps; the loss of opposition of the generator and the discriminator are respectively shown as follows:
the generator is tasked not only with deceiving the discriminator, but also with synthesizing an image as close as possible to the ground truth; the per-pixel loss L_pixel forces the transformed face image to stay close to the ground-truth data in the original pixel space; L_pixel takes the form:
L_pixel = E_{I,H,I′~P(I,H,I′)} ‖I′ − G(I, H)‖_1 (3)
the L1 distance is adopted rather than L2, since L1 encourages less blurring in the output; (I, H, I′) is (I_N, H_E, I_E) or (I_E, H_E, I_N), depending on the generator.
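The per-pixel L1 loss of equation (3) reduces to a mean absolute difference. A minimal numpy sketch (the toy face tensors are illustrative, not real data):

```python
import numpy as np

def l1_pixel_loss(target, generated):
    """L_pixel: mean absolute difference in the original pixel space."""
    return np.mean(np.abs(target - generated))

rng = np.random.default_rng(0)
I_true = rng.random((3, 64, 64))       # toy ground-truth expressive face
I_fake = I_true + 0.1                  # generator output, off by 0.1 per pixel
print(round(l1_pixel_loss(I_true, I_fake), 6))   # 0.1
```

A uniform 0.1 offset on every pixel yields exactly an L1 loss of 0.1, which makes the metric easy to sanity-check.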
4. The cycle-consistency loss according to claim 2, characterized in that the generators G_E and G_N constitute a complete mapping cycle between neutral and expressive faces; if a face image is converted from a neutral expression to an angry expression and then back to neutral, ideally the same face image should be obtained; therefore, an additional cycle-consistency loss L_cyc is introduced to enforce consistency between the source image and the reconstructed image, i.e. between I_N and G_N(G_E(I_N, H_E), H_E), and between I_E and G_E(G_N(I_E, H_E), H_E); L_cyc is computed as:
L_cyc = E_{I,H~P(I,H)} ‖I − G′(G(I, H), H)‖_1 (4)
where G′ is the generator opposite to G; if G converts the neutral expression into the expression specified by the facial-geometry heat map H, then G′ restores the neutral expression with the help of H.
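The cycle of equation (4) can be illustrated with toy stand-in generators; the additive pair below is an assumption made purely so the cycle closes exactly, not the claimed networks:

```python
import numpy as np

def cycle_loss(I, G, G_prime, H):
    """L_cyc: L1 distance between a source image and its reconstruction
    after mapping to the other expression domain and back."""
    return np.mean(np.abs(I - G_prime(G(I, H), H)))

# Toy stand-ins: G adds the heat map, G' subtracts it, so the
# round trip is perfect and the loss vanishes.
G = lambda I, H: I + H
G_prime = lambda I, H: I - H

rng = np.random.default_rng(1)
I_neutral = rng.random((64, 64))
H = rng.random((64, 64))
print(cycle_loss(I_neutral, G, G_prime, H) < 1e-12)   # True: perfect cycle
```

Replacing `G_prime` with a generator that ignores H would leave a large residual, which is exactly what the loss penalises during training.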
5. The identity-preservation loss according to claim 2, characterized in that facial expression editing should preserve the facial identity after expression synthesis and removal; thus, an identity-preservation term is employed to enforce identity consistency:
L_identity = E_{I,H~P(I,H)} ‖F(I) − F(G(I, H))‖_1 (5)
where F is a feature extractor for face recognition; a Light CNN model is adopted as the feature-extraction network, comprising 9 convolutional layers, 4 max-pooling layers and 1 fully-connected layer; the Light CNN is pre-trained as a classifier distinguishing a large number of identities, so that it captures the features most salient for face recognition; with this loss, the facial identity can thus be preserved through the face editing process.
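In equation (5), the claim assumes a pretrained face-recognition CNN as F. In the sketch below F is mocked by a fixed random linear projection purely for illustration; only the loss structure matches the claim:

```python
import numpy as np

def identity_loss(I, G_I, F):
    """L_identity: L1 distance between face-recognition features of the
    input face and of the edited face."""
    return np.sum(np.abs(F(I) - F(G_I)))

# Mock feature extractor: a fixed random linear map (stand-in for the
# pretrained recognition network in the claim).
rng = np.random.default_rng(2)
W = rng.random((8, 64 * 64))
F = lambda img: W @ img.ravel()

I = rng.random((64, 64))
print(identity_loss(I, I.copy(), F))   # 0.0: same face, same features
```

Because F is fixed (frozen) during training, gradients flow only into the generator, steering it to keep identity-relevant features unchanged.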
6. The total loss function according to claim 2, characterized in that the final full objective of the generators G_N and G_E is a weighted sum of all the losses defined above: L_G-adv removes the distribution difference between real and generated samples, L_pixel ensures pixel-level correctness, L_cyc enforces cycle consistency between the reconstructed and source images, and L_identity preserves identity through the mapping process; the total loss function is therefore:
L_G = L_G-adv + α_1 L_pixel + α_2 L_cyc + α_3 L_identity (6)
where α_1, α_2, α_3 are the loss weighting coefficients.
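Equation (6) is a plain weighted sum; the sketch below shows the arithmetic, with placeholder α values that are illustrative assumptions, not the coefficients of the claimed method:

```python
def total_generator_loss(l_adv, l_pixel, l_cyc, l_identity,
                         alpha1=10.0, alpha2=10.0, alpha3=1.0):
    """L_G = L_G-adv + a1*L_pixel + a2*L_cyc + a3*L_identity.
    The alpha values here are illustrative placeholders."""
    return l_adv + alpha1 * l_pixel + alpha2 * l_cyc + alpha3 * l_identity

print(total_generator_loss(0.5, 0.1, 0.02, 0.3))   # 2.0 (= 0.5 + 1.0 + 0.2 + 0.3)
```

In practice the α weights are hyper-parameters balancing realism, pixel fidelity, cycle consistency and identity preservation.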
7. The facial geometry operations (II) according to claim 1, characterized in that the facial geometry operations comprise facial expression editing, facial expression transfer, and facial expression synthesis and interpolation.
8. The facial expression editing according to claim 7, characterized in that the geometric locations of a set of fiducial points are used to guide facial expression editing; the human face has a distinctive physiological structure, so the fiducial locations are strongly correlated; changes to the face geometry should therefore be constrained to avoid implausible configurations; taking the prior knowledge of face shape distribution into account, a parameterized shape model is built as a geometry generator, and a basic shape model is learned from annotated training images;
first, the faces are normalized to the same scale according to the positions of the two eyes and rotated to be horizontal; Principal Component Analysis (PCA) is then applied to obtain a basic shape model of the K fiducial locations:
s(p) = s_0 + Sp (7)
where s_0 ∈ R^(2K×1), S ∈ R^(2K×N), p ∈ R^(N×1); the basic shape s_0 is the mean shape over all training images, and the columns of S are the N eigenvectors corresponding to the N largest eigenvalues; different face geometries can be obtained by varying the shape parameter p;
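The basic shape model of equation (7) can be fitted with an SVD-based PCA; a minimal numpy sketch on synthetic landmark data (the data and function name are illustrative):

```python
import numpy as np

def fit_shape_model(shapes, n_components):
    """Fit the basic shape model s(p) = s_0 + S p by PCA.
    `shapes` is (M, 2K): M training faces, K (x, y) fiducials each,
    assumed already scale-normalised and rotated upright."""
    s0 = shapes.mean(axis=0)                       # mean (basic) shape
    _, _, Vt = np.linalg.svd(shapes - s0, full_matrices=False)
    S = Vt[:n_components].T                        # (2K, N), largest modes first
    return s0, S

rng = np.random.default_rng(3)
shapes = rng.random((100, 2 * 68))                 # 100 toy faces, 68 fiducials each
s0, S = fit_shape_model(shapes, n_components=5)

# p = 0 reproduces the mean shape; varying p moves the fiducials.
print(np.allclose(s0 + S @ np.zeros(5), s0))       # True
print(S.shape)                                     # (136, 5)
```

SVD on the centred data returns the right singular vectors ordered by singular value, which correspond to the eigenvectors of the covariance matrix with the largest eigenvalues.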
however, the facial geometry is related not only to the facial expression but also, to a large extent, to the facial identity; the face geometry varies from person to person even under the same expression; for example, the distance between the eyes and the length of the nose depend largely on the facial identity rather than the expression; to account for these individual differences, an individual-specific shape model based on equation (7) is proposed, constructed by replacing the mean shape s_0 with the neutral shape of each individual; the individual-specific shape model is given by:

s(p) = s_N + Sp (8)

where s_N, the neutral shape of the individual, represents the identity-related variation, and p represents the variation caused by the facial expression.
9. The facial expression transfer according to claim 7, characterized in that, given two expressive faces I_A and I_B, their facial landmarks s_A and s_B are detected; the expression-removal model is first used to restore the neutral faces:

I_A^N = G_N(I_A, H_A), I_B^N = G_N(I_B, H_B)

where H_A and H_B are the heat maps encoded from s_A and s_B;
here I_A^N and I_B^N denote the neutral faces of I_A and I_B, respectively; the neutral shapes s_A^N and s_B^N can then be obtained by facial landmark detection on these neutral faces; the shape parameters are then derived by solving the following least-squares regression problem:

p_A = argmin_p ‖s_A − (s_A^N + Sp)‖^2
the shape parameters are then applied to the other identity to obtain the transferred fiducial positions:

s_A→B = s_B^N + S p_A
the heat map is generated from the transferred shape and concatenated with the corresponding neutral face as the input for expression synthesis; finally, the facial expression transfer result is obtained with the expression synthesis model:

I_A→B = G_E(I_B^N, H_A→B)

where H_A→B is the heat map encoded from the transferred shape s_A→B.
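The shape-space part of this transfer (solve for p on face A, re-apply on face B's neutral shape) can be sketched in numpy; the tiny 2-point "face" and 1-D expression basis below are toy assumptions:

```python
import numpy as np

def transfer_expression_shape(s_A, s_N_A, s_N_B, S):
    """Transfer A's expression onto B in shape space:
    1. solve p_A = argmin_p ||s_A - (s_N_A + S p)||^2 by least squares,
    2. apply p_A on top of B's neutral shape."""
    p_A, *_ = np.linalg.lstsq(S, s_A - s_N_A, rcond=None)
    return s_N_B + S @ p_A

# Toy example: a 2-point "face" (2K = 4) with a 1-D expression basis.
S = np.array([[1.0], [0.0], [0.0], [1.0]])
s_N_A = np.zeros(4)                     # A's neutral shape
s_N_B = np.ones(4)                      # B's neutral shape
s_A = s_N_A + S[:, 0] * 2.0             # A showing the expression with p = 2
print(transfer_expression_shape(s_A, s_N_A, s_N_B, S))   # [3. 1. 1. 3.]
```

The transferred shape is B's neutral geometry displaced by A's expression coefficients, which is then rendered as a heat map for the synthesis generator.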
10. The facial expression synthesis and interpolation according to claim 7, characterized in that a neutral face image and shape parameters are prepared for the target expression; the neutral face is obtained with the proposed expression-removal model; the shape parameters for a particular expression can be learned from the annotated training dataset via the basic shape model (equation (7)); once the values of the shape parameters are associated with certain semantic attributes (such as fear or surprise), they can be used to synthesize a facial expression of the desired semantic type; furthermore, facial expression interpolation can be performed by linearly adjusting the values of the shape parameters.
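The linear interpolation of expression intensity amounts to scaling the shape parameters before evaluating the shape model; a small numpy sketch with a hypothetical "surprise" parameter vector (an illustrative assumption):

```python
import numpy as np

def interpolate_expression(s0, S, p_target, t):
    """Scale the shape parameters linearly: t = 0 gives the neutral/mean
    shape, t = 1 the full target expression, intermediate t a blend."""
    return s0 + S @ (t * p_target)

s0 = np.zeros(4)
S = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
p_surprise = np.array([2.0, -1.0])      # hypothetical "surprise" parameters

print(interpolate_expression(s0, S, p_surprise, 0.0))   # [0. 0. 0. 0.]
print(interpolate_expression(s0, S, p_surprise, 0.5))   # [ 1.  -0.5  1.  -0.5]
```

Each interpolated shape would then be encoded as a heat map and fed to the synthesis generator to produce a face at the corresponding expression intensity.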
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810078963.1A CN108288072A (en) | 2018-01-26 | 2018-01-26 | A kind of facial expression synthetic method based on generation confrontation network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108288072A true CN108288072A (en) | 2018-07-17 |
Family
ID=62835796
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109325549A (en) * | 2018-10-25 | 2019-02-12 | 电子科技大学 | A kind of facial image fusion method |
CN109409222A (en) * | 2018-09-20 | 2019-03-01 | 中国地质大学(武汉) | A kind of multi-angle of view facial expression recognizing method based on mobile terminal |
CN109447906A (en) * | 2018-11-08 | 2019-03-08 | 北京印刷学院 | A kind of picture synthetic method based on generation confrontation network |
CN109448083A (en) * | 2018-09-29 | 2019-03-08 | 浙江大学 | A method of human face animation is generated from single image |
CN109472838A (en) * | 2018-10-25 | 2019-03-15 | 广东智媒云图科技股份有限公司 | A kind of sketch generation method and device |
CN109558836A (en) * | 2018-11-28 | 2019-04-02 | 中国科学院深圳先进技术研究院 | A kind of processing method and relevant device of facial image |
CN109670491A (en) * | 2019-02-25 | 2019-04-23 | 百度在线网络技术(北京)有限公司 | Identify method, apparatus, equipment and the storage medium of facial image |
CN109840926A (en) * | 2018-12-29 | 2019-06-04 | 中国电子科技集团公司信息科学研究院 | A kind of image generating method, device and equipment |
CN109871888A (en) * | 2019-01-30 | 2019-06-11 | 中国地质大学(武汉) | A kind of image generating method and system based on capsule network |
CN109903363A (en) * | 2019-01-31 | 2019-06-18 | 天津大学 | Condition generates confrontation Network Three-dimensional human face expression moving cell synthetic method |
CN110084193A (en) * | 2019-04-26 | 2019-08-02 | 深圳市腾讯计算机系统有限公司 | Data processing method, equipment and medium for Facial image synthesis |
CN110148194A (en) * | 2019-05-07 | 2019-08-20 | 北京航空航天大学 | Image rebuilding method and device |
CN110210556A (en) * | 2019-05-29 | 2019-09-06 | 中国科学技术大学 | Pedestrian identifies data creation method again |
CN110276745A (en) * | 2019-05-22 | 2019-09-24 | 南京航空航天大学 | A kind of pathological image detection algorithm based on generation confrontation network |
CN110322548A (en) * | 2019-06-11 | 2019-10-11 | 北京工业大学 | A kind of three-dimensional grid model generation method based on several picture parametrization |
CN110363060A (en) * | 2019-04-04 | 2019-10-22 | 杭州电子科技大学 | The small sample target identification method of confrontation network is generated based on proper subspace |
CN110620884A (en) * | 2019-09-19 | 2019-12-27 | 平安科技(深圳)有限公司 | Expression-driven-based virtual video synthesis method and device and storage medium |
CN110689480A (en) * | 2019-09-27 | 2020-01-14 | 腾讯科技(深圳)有限公司 | Image transformation method and device |
CN110909814A (en) * | 2019-11-29 | 2020-03-24 | 华南理工大学 | Classification method based on feature separation |
WO2020062120A1 (en) * | 2018-09-29 | 2020-04-02 | 浙江大学 | Method for generating facial animation from single image |
CN111310647A (en) * | 2020-02-12 | 2020-06-19 | 北京云住养科技有限公司 | Generation method and device for automatic identification falling model |
CN111563427A (en) * | 2020-04-23 | 2020-08-21 | 中国科学院半导体研究所 | Method, device and equipment for editing attribute of face image |
CN111860041A (en) * | 2019-04-26 | 2020-10-30 | 北京陌陌信息技术有限公司 | Face conversion model training method, device, equipment and medium |
CN112215927A (en) * | 2020-09-18 | 2021-01-12 | 腾讯科技(深圳)有限公司 | Method, device, equipment and medium for synthesizing face video |
CN112364745A (en) * | 2020-11-04 | 2021-02-12 | 北京瑞莱智慧科技有限公司 | Method and device for generating countermeasure sample and electronic equipment |
CN112488984A (en) * | 2019-09-11 | 2021-03-12 | 中信戴卡股份有限公司 | Method and device for acquiring defect picture generation network and defect picture generation method |
CN112767250A (en) * | 2021-01-19 | 2021-05-07 | 南京理工大学 | Video blind super-resolution reconstruction method and system based on self-supervision learning |
CN113033511A (en) * | 2021-05-21 | 2021-06-25 | 中国科学院自动化研究所 | Face anonymization method based on control decoupling identity representation |
CN113643400A (en) * | 2021-08-23 | 2021-11-12 | 哈尔滨工业大学(威海) | Image generation method |
WO2021228183A1 (en) * | 2020-05-13 | 2021-11-18 | Huawei Technologies Co., Ltd. | Facial re-enactment |
CN113706379A (en) * | 2021-07-29 | 2021-11-26 | 山东财经大学 | Interlayer interpolation method and system based on medical image processing |
US11475608B2 (en) | 2019-09-26 | 2022-10-18 | Apple Inc. | Face image generation with pose and expression control |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008144728A1 (en) * | 2007-05-21 | 2008-11-27 | Bracco Imaging S.P.A. | Conjugates which bind a blood protein such as human serum albumin and methods of using the same in diagnostic and therapeutic applications |
CN104123562A (en) * | 2014-07-10 | 2014-10-29 | 华东师范大学 | Human body face expression identification method and device based on binocular vision |
CN107437077A (en) * | 2017-08-04 | 2017-12-05 | 深圳市唯特视科技有限公司 | A kind of method that rotation face based on generation confrontation network represents study |
Non-Patent Citations (1)
Title |
---|
SONG, LINGXIAO ET AL: "Geometry Guided Adversarial Facial Expression Synthesis", HTTPS://ARXIV.ORG/PDF/1712.03474.PDF |
Legal Events

Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20180717