CN108171770B - Facial expression editing method based on generative adversarial network - Google Patents

Facial expression editing method based on generative adversarial network

Info

Publication number
CN108171770B
CN108171770B (application CN201810048098.6A)
Authority
CN
China
Prior art keywords
face
picture
generator
expression
discriminator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810048098.6A
Other languages
Chinese (zh)
Other versions
CN108171770A (en)
Inventor
张刚
韩琥
张杰
山世光
陈熙霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seetatech Beijing Technology Co ltd
Original Assignee
Seetatech Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seetatech Beijing Technology Co ltd filed Critical Seetatech Beijing Technology Co ltd
Priority to CN201810048098.6A priority Critical patent/CN108171770B/en
Publication of CN108171770A publication Critical patent/CN108171770A/en
Application granted granted Critical
Publication of CN108171770B publication Critical patent/CN108171770B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 11/001 Texturing; Colouring; Generation of texture or colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Abstract

The invention discloses a facial expression editing method based on a generative adversarial network, comprising the following steps. In the data preparation stage, face images are manually labeled and cropped. In the model design stage, the model is built from a generator and a discriminator. In the model training stage, labeled real face pictures and pictures produced by the generator are fed to the discriminator, which is trained to distinguish the distributions of real and generated samples and to learn the distributions of facial expression and facial identity information; the face picture to be edited and an expression control vector are fed to the generator, which outputs a face picture controlled by that vector; the trained discriminator is then used to adversarially train the generator. These steps are repeated to complete the construction of the model, and the constructed model is tested on input images. The invention ensures that the generator produces face pictures closer to the real face picture distribution, preserves facial identity information better, and edits the expression of face pictures more effectively.

Description

Facial expression editing method based on generative adversarial network
Technical Field
The invention relates to an editing method, in particular to a facial expression editing method based on a generative adversarial network, and belongs to the technical field of computer vision.
Background
Facial expression editing requires controlling the facial expression in a photo while preserving the identity of the face. The technique has wide application in facial animation, social software, face recognition data set augmentation, and other fields. Current facial expression editing methods are based on three-dimensional deformable face models. A representative method is the facial expression editing method based on a single camera and motion capture data, patent number 201310451508.9, whose main technical means are: generate a three-dimensional face model of a user from a picture of the user, decouple the three-dimensional model to separate identity from expression, then synthesize a new three-dimensional face model by controlling the facial expression components, thereby realizing facial expression editing. The problem and disadvantage of this method is that it is only suitable for editing the expression of a three-dimensional face model, not of a two-dimensional face picture, because when a facial expression changes, not only the face shape but also the facial surface texture changes; expression editing methods based on three-dimensional face models therefore have difficulty modifying the facial texture.
Disclosure of Invention
To overcome the shortcomings of the above technology, the invention provides a facial expression editing method based on a generative adversarial network.
To solve the technical problem, the invention adopts the following technical scheme: a facial expression editing method based on a generative adversarial network, comprising the following overall steps:
Step S1, data preparation phase
a. Manually label each face in the RGB image set with face identity information and facial expression information; the label of each picture is represented as [i, j], where i indicates that the picture belongs to the i-th person and j indicates that the picture shows the j-th expression;
b. Crop the faces in the labeled image set out of the pictures with a face detector and a facial landmark detector, and align the faces;
Step S2, model design phase
a. The model consists of two parts, a generator G and a discriminator D. From an input face picture to be edited and an expression control vector, the generator G produces a face picture controlled by that vector; from the pictures produced by the generator G and the labeled real face pictures, the discriminator D distinguishes the distributions of real and generated samples and learns the distributions of facial expression and facial identity information;
b. The generator G and the discriminator D together form a facial expression editing framework based on a generative adversarial network, on which adversarial training is carried out;
Step S3, model training phase
a. Feed the labeled real face pictures and the pictures produced by the generator G to the discriminator D, and train D to distinguish the distributions of real and generated samples and to learn the distributions of facial expression and facial identity information; pictures produced by the generator G are labeled false [0], and labeled real face pictures are labeled true [1, i, j];
b. Feed the face picture img0[i, j] to be edited and the expression control vector y to the generator G, which outputs a face picture controlled by the vector. Then feed the generated picture, labeled false [0], to the discriminator D with the training target that D judge it true [1] with face identity information i and facial expression information j, which pushes the generator G to produce face pictures that are more realistic, preserve identity information better, and respond more effectively to expression control;
c. Perform step a three times, then step b once, so that the discriminator D is trained more thoroughly; the better the discriminator D is trained, the more it benefits the training of the generator G;
d. Save the model parameters once per epoch, edit facial expressions on the test set, and inspect the pictures output by the generator G; stop model training when the generator G produces face pictures that meet the requirements, and keep the model parameters whose generated face pictures have the best visual effect;
Step S4, model testing phase
a. The input is an image I containing a human face;
b. Feed the image I to a face detector to obtain the face position, crop the image I at that position to obtain a face picture img0, and align the face in img0;
c. Feed the aligned face picture and an expression control vector to the generator G to obtain the expression-edited face picture img1.
The invention uses a distinctive fully convolutional network as the generator and has the discriminator perform real/fake classification, facial expression classification, and facial identity classification simultaneously, which ensures that the generator produces face pictures closer to the real face picture distribution, preserves facial identity information better, and edits the expression of face pictures more effectively.
Drawings
FIG. 1 is a diagram of a model design architecture of the present invention.
Fig. 2 is a schematic overall flowchart of step S4.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
A facial expression editing method based on a generative adversarial network comprises the following overall steps:
Step S1, data preparation phase
a. Manually label each face in the RGB image set with face identity information and facial expression information; the label of each picture is represented as [i, j], where i (0 ≤ i < N) indicates that the picture belongs to the i-th person and j (0 ≤ j < M) indicates that the picture shows the j-th expression; the whole picture set contains N persons and M expressions;
b. Crop the faces in the labeled image set out of the pictures with a face detector and a facial landmark detector, and align the faces;
Step S2, model design phase
a. The model consists of two parts, a generator G and a discriminator D. From an input face picture to be edited and an expression control vector, the generator G produces a face picture controlled by that vector; from the pictures produced by the generator G and the labeled real face pictures, the discriminator D distinguishes the distributions of real and generated samples and learns the distributions of facial expression and facial identity information; the overall framework of the model is shown in FIG. 1;
b. The generator G and the discriminator D together form a facial expression editing framework based on a generative adversarial network, on which adversarial training is carried out; the network structures of the generator G and the discriminator D are shown in Tables 1 and 2, respectively.
Table 1: Network structure of generator G
Generator G
Input: aligned color face picture, 128 × 128 × 3
3 × 3 convolution, batch normalization, exponential linear unit (ELU) activation; 3 × 3 convolution, batch normalization, ELU activation
2 × 2 max pooling
3 × 3 convolution, batch normalization, ELU activation; 3 × 3 convolution, batch normalization, ELU activation
2 × 2 max pooling
3 × 3 convolution, batch normalization, ELU activation; 3 × 3 convolution, batch normalization, ELU activation
2 × 2 nearest-neighbor upsampling
3 × 3 convolution, batch normalization, ELU activation; 3 × 3 convolution, batch normalization, ELU activation
2 × 2 nearest-neighbor upsampling
3 × 3 convolution, batch normalization, ELU activation; 3 × 3 convolution, batch normalization, ELU activation
3 × 3 convolution, hyperbolic tangent (tanh) activation
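For concreteness, the following PyTorch sketch shows one way the structure in Table 1 could be realized. It is an illustration rather than code from the patent: the channel widths, the number of expression classes, and the input-side fusion of the expression control vector (via the fully connected layer and reshaping described under the key points below) are assumptions the patent leaves open.

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # One double row of Table 1: two 3 x 3 convolutions, each followed by
    # batch normalization and ELU activation.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ELU(),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ELU(),
    )

class Generator(nn.Module):
    def __init__(self, num_expr=8, width=64):  # both values are assumptions
        super().__init__()
        # Expression control vector -> fully connected layer -> 128 x 128 map,
        # concatenated with the RGB input as a fourth channel.
        self.expr_fc = nn.Linear(num_expr, 128 * 128)
        self.enc1 = double_conv(4, width)
        self.enc2 = double_conv(width, width * 2)
        self.mid = double_conv(width * 2, width * 4)
        self.dec1 = double_conv(width * 4, width * 2)
        self.dec2 = double_conv(width * 2, width)
        self.pool = nn.MaxPool2d(2)                            # 2 x 2 max pooling
        self.up = nn.Upsample(scale_factor=2, mode="nearest")  # 2 x 2 nearest-neighbor upsampling
        self.out = nn.Conv2d(width, 3, 3, padding=1)           # final 3 x 3 convolution

    def forward(self, img, expr):
        # img: (B, 3, 128, 128) aligned color face picture; expr: (B, num_expr)
        expr_map = self.expr_fc(expr).view(-1, 1, 128, 128)
        x = torch.cat([img, expr_map], dim=1)                  # four-channel tensor
        x = self.pool(self.enc1(x))
        x = self.pool(self.enc2(x))
        x = self.up(self.mid(x))
        x = self.up(self.dec1(x))
        x = self.dec2(x)
        return torch.tanh(self.out(x))                         # output in [-1, 1]
```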
Table 2: Network structure of discriminator D
Discriminator D
Input: aligned color face picture, 128 × 128 × 3
3 × 3 convolution, batch normalization, leaky rectified linear unit (LeakyReLU) activation (slope 0.02)
3 × 3 convolution (stride 2), batch normalization, LeakyReLU activation (slope 0.02)
3 × 3 convolution (stride 2), batch normalization, LeakyReLU activation (slope 0.02)
3 × 3 convolution (stride 2), batch normalization, LeakyReLU activation (slope 0.02)
3 × 3 convolution (stride 2), batch normalization, LeakyReLU activation (slope 0.02)
Global average pooling
Fully connected layer
Real/fake classification; expression classification; identity classification
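A matching sketch of Table 2, again with assumed channel widths and an assumed size for the fully connected layer; the three heads in the last line implement the real/fake, expression, and identity classifications:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, num_ids, num_expr, width=64):  # widths are assumptions
        super().__init__()
        def block(in_ch, out_ch, stride):
            # One row of Table 2: 3 x 3 convolution, batch normalization,
            # LeakyReLU activation with slope 0.02.
            return [nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
                    nn.BatchNorm2d(out_ch), nn.LeakyReLU(0.02)]
        layers = block(3, width, 1)        # first row, stride 1
        ch = width
        for _ in range(4):                 # four stride-2 rows
            layers += block(ch, ch * 2, 2)
            ch *= 2
        self.features = nn.Sequential(*layers)
        self.fc = nn.Linear(ch, 256)                 # fully connected layer
        self.real_fake = nn.Linear(256, 1)           # real/fake classification
        self.expr_head = nn.Linear(256, num_expr)    # expression classification
        self.id_head = nn.Linear(256, num_ids)       # identity classification

    def forward(self, img):
        h = self.features(img).mean(dim=(2, 3))      # global average pooling
        h = self.fc(h)
        return self.real_fake(h), self.expr_head(h), self.id_head(h)
```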
Step S3, model training phase
a. Feed the labeled real face pictures and the pictures produced by the generator G to the discriminator D, and train D to distinguish the distributions of real and generated samples and to learn the distributions of facial expression and facial identity information. Pictures produced by the generator G are labeled false [0], and labeled real face pictures are labeled true [1, i, j]; the face identity label i and the facial expression label j both come from the labeled real face pictures;
b. Feed the face picture img0[i, j] to be edited and the expression control vector y to the generator G, which outputs a face picture controlled by the vector. Then feed the generated picture, labeled false [0], to the discriminator D with the training target that D judge it true [1] with face identity information i and facial expression information j. In this way the generator G learns to produce more realistic face pictures, to preserve identity information well, and to respond more effectively to expression control;
c. Perform step a three times, then step b once (see the sketch following step d), so that the discriminator D is trained more thoroughly; the better the discriminator D is trained, the more it benefits the training of the generator G;
d. Save the model parameters once per epoch (one epoch equals one pass of training over all samples in the picture set), edit facial expressions on the test set, and inspect the pictures output by the generator G. Stop model training when the generator G produces face pictures that meet the requirements, and keep the model parameters whose generated face pictures have the best visual effect.
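The alternation in steps a-d might look like the sketch below. The loss functions (cross entropy throughout) and the random sampling of the target expression vector y are assumptions; the patent itself only fixes the label scheme and the 3:1 ratio.

```python
import torch
import torch.nn.functional as F

def train_epoch(G, D, opt_g, opt_d, loader, num_expr, device="cpu"):
    # loader yields (img, i, j): aligned face pictures with identity and expression labels.
    for step, (img, i, j) in enumerate(loader):
        img, i, j = img.to(device), i.to(device), j.to(device)
        y_idx = torch.randint(num_expr, (img.size(0),), device=device)
        y = F.one_hot(y_idx, num_expr).float()       # expression control vector
        fake = G(img, y)
        if step % 4 != 3:
            # Step a, run three times: real pictures are true [1, i, j],
            # generated pictures are false [0].
            rf_real, expr_real, id_real = D(img)
            rf_fake, _, _ = D(fake.detach())
            loss_d = (F.binary_cross_entropy_with_logits(rf_real, torch.ones_like(rf_real))
                      + F.binary_cross_entropy_with_logits(rf_fake, torch.zeros_like(rf_fake))
                      + F.cross_entropy(expr_real, j)
                      + F.cross_entropy(id_real, i))
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        else:
            # Step b, run once: G wants its output judged true [1], with the
            # identity i of the input picture and the controlled expression y.
            rf, expr, ident = D(fake)
            loss_g = (F.binary_cross_entropy_with_logits(rf, torch.ones_like(rf))
                      + F.cross_entropy(expr, y_idx)
                      + F.cross_entropy(ident, i))
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Step d then amounts to a call such as torch.save(G.state_dict(), path) at the end of each epoch, plus a visual check of G's outputs on the test set.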
Step S4, model testing phase
a. The input is an image I containing a human face;
b. Feed the image I to a face detector to obtain the face position, crop the image I at that position to obtain a face picture img0, and align the face in img0;
c. Feed the aligned face picture and an expression control vector to the generator G to obtain the expression-edited face picture img1; the overall flow of this step is shown in FIG. 2.
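End to end, step S4 could look like the sketch below; detect_face and align_face are hypothetical placeholders for the face detector and the landmark-based alignment, which the patent does not pin down to any particular implementation.

```python
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image

def detect_face(img):
    # Placeholder for the face detector of step b; a real system would return
    # the bounding box found by, e.g., an MTCNN or cascade detector.
    w, h = img.size
    return (0, 0, w, h)

def align_face(face):
    # Placeholder for landmark-based alignment; here we simply resize.
    return face.resize((128, 128))

def edit_expression(G, image_path, target_expr, num_expr, device="cpu"):
    img_i = Image.open(image_path).convert("RGB")        # input image I
    img0 = align_face(img_i.crop(detect_face(img_i)))    # cropped, aligned img0
    x = torch.from_numpy(np.array(img0)).permute(2, 0, 1).float()
    x = (x / 127.5 - 1.0).unsqueeze(0).to(device)        # match the tanh output range
    y = F.one_hot(torch.tensor([target_expr]), num_expr).float().to(device)
    with torch.no_grad():
        img1 = G(x, y)                                   # expression-edited picture img1
    out = (img1.squeeze(0).permute(1, 2, 0).cpu().numpy() + 1.0) * 127.5
    return Image.fromarray(out.astype("uint8"))
```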
Compared with the prior art, the invention has the following key points and advantages:
First, regarding the generator G network: 1) the activation function is the exponential linear unit (ELU), and the upsampling layers use 2 × 2 nearest-neighbor upsampling; 2) the expression control information is added at the input end: through a fully connected layer and a reshaping operation, it is concatenated with the input face picture into a four-channel tensor.
Beneficial effects: the ELU activation increases the nonlinearity of the network and thus its nonlinear fitting capacity; compared with a deconvolution operation, 2 × 2 nearest-neighbor upsampling yields a better picture generation effect; and adding the expression control information at the input end lets the face picture information and the expression control information interact earlier, giving a better control effect.
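A minimal illustration of point 2), assuming 8 expression classes and 128 × 128 inputs (the patent fixes neither):

```python
import torch
import torch.nn as nn

fc = nn.Linear(8, 128 * 128)              # fully connected layer
img = torch.randn(1, 3, 128, 128)         # aligned color face picture
y = torch.zeros(1, 8); y[0, 2] = 1.0      # one-hot expression control vector
expr_map = fc(y).view(1, 1, 128, 128)     # reshaping operation
x = torch.cat([img, expr_map], dim=1)     # four-channel tensor, shape (1, 4, 128, 128)
```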
Second, regarding the discriminator D network: the discriminator D is used not only to distinguish the distributions of real and generated samples but also to learn the distributions of facial expression and facial identity information.
Beneficial effects: the discriminator D performs real/fake classification, facial expression classification, and facial identity classification simultaneously, which ensures that the generator G produces face pictures closer to the real face picture distribution, preserves facial identity information better, and edits the expression of face pictures more effectively.
Third, regarding the adversarial training process: 1) during adversarial training, the discriminator D distinguishes the distributions of real and generated samples and learns the distributions of facial expression and facial identity information; 2) the generator G, on the one hand, tries to fool the discriminator D so as to reduce the difference between generated and real samples, and on the other hand tries to make D judge the picture it generates to show the same person as its input picture, with expression information matching the expression control vector.
Beneficial effects: this ensures that the generator G produces face pictures closer to the real face picture distribution, preserves facial identity information better, and edits the expression of face pictures more effectively.
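In standard GAN notation (our reconstruction under the assumptions above, not formulas from the patent), with $D_{\mathrm{rf}}$, $D_{\mathrm{expr}}$, $D_{\mathrm{id}}$ the three heads of the discriminator, $\mathcal{L}_{\mathrm{cls}}$ a classification loss such as cross entropy, $(i, j)$ the identity and expression labels of a real picture $x$, and $y$ the expression control vector, the two objectives could be written:

$$\mathcal{L}_D = -\,\mathbb{E}_{x}\big[\log D_{\mathrm{rf}}(x)\big] - \mathbb{E}_{x,y}\big[\log\big(1 - D_{\mathrm{rf}}(G(x,y))\big)\big] + \mathcal{L}_{\mathrm{cls}}\big(D_{\mathrm{expr}}(x), j\big) + \mathcal{L}_{\mathrm{cls}}\big(D_{\mathrm{id}}(x), i\big)$$

$$\mathcal{L}_G = -\,\mathbb{E}_{x,y}\big[\log D_{\mathrm{rf}}(G(x,y))\big] + \mathcal{L}_{\mathrm{cls}}\big(D_{\mathrm{expr}}(G(x,y)), y\big) + \mathcal{L}_{\mathrm{cls}}\big(D_{\mathrm{id}}(G(x,y)), i\big)$$

Minimizing $\mathcal{L}_D$ trains the discriminator of point 1); minimizing $\mathcal{L}_G$ realizes both roles of the generator described in point 2).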
The above embodiments do not limit the present invention in any form; the invention is not restricted to the above examples, and variations, modifications, additions, or substitutions made by those skilled in the art within the technical scope of the invention likewise fall within its scope of protection.

Claims (1)

1. A facial expression editing method based on a generative adversarial network, characterized in that the method comprises the following overall steps:
Step S1, data preparation phase
a1. manually labeling each face in the RGB image set with face identity information and facial expression information, the label of each picture being represented as [i, j], where i indicates that the picture belongs to the i-th person and j indicates that the picture shows the j-th expression;
b1. cropping the faces in the labeled image set out of the pictures with a face detector and a facial landmark detector, and aligning the faces;
Step S2, model design phase
a2. the model consisting of two parts, a generator G and a discriminator D, the generator G producing, from an input face picture to be edited and an expression control vector, a face picture controlled by that vector, and the discriminator D distinguishing, from the pictures produced by the generator G and the labeled real face pictures, the distributions of real and generated samples and learning the distributions of facial expression and facial identity information;
b2. composing from the generator G and the discriminator D a facial expression editing framework based on a generative adversarial network, on which adversarial training is carried out;
Step S3, model training phase
a3. feeding the labeled real face pictures and the pictures produced by the generator G to the discriminator D, and training D to distinguish the distributions of real and generated samples and to learn the distributions of facial expression and facial identity information, pictures produced by the generator G being labeled false [0] and labeled real face pictures being labeled true [1, i, j];
b3. feeding a face picture img0[i, j] to be edited and an expression control vector y to the generator G, which outputs a face picture controlled by the vector, then feeding the generated picture, labeled false [0], to the discriminator D with the training target that D judge it true [1] with face identity information i and facial expression information j, so that the generator G produces face pictures that are more realistic, preserve identity information better, and respond more effectively to expression control;
c3. performing step a3 three times and then step b3 once, thereby training the discriminator D and the generator G;
d3. saving the model parameters once per epoch, editing facial expressions on the test set, and inspecting the pictures output by the generator G; stopping model training when the generator G produces face pictures that meet the requirements, and keeping the model parameters with the best face picture generation effect;
Step S4, model testing phase
a4. the input image being an image I containing a human face;
b4. feeding the image I to a face detector to obtain the face position, cropping the image I at that position to obtain a face picture img0, and aligning the face in img0;
c4. feeding the aligned face picture and an expression control vector to the generator G to obtain the expression-edited face picture img1.
CN201810048098.6A 2018-01-18 2018-01-18 Facial expression editing method based on generative adversarial network Active CN108171770B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810048098.6A CN108171770B (en) 2018-01-18 2018-01-18 Facial expression editing method based on generative adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810048098.6A CN108171770B (en) 2018-01-18 2018-01-18 Facial expression editing method based on generative adversarial network

Publications (2)

Publication Number Publication Date
CN108171770A (en) 2018-06-15
CN108171770B (en) 2021-04-06

Family

ID=62514820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810048098.6A Active CN108171770B (en) 2018-01-18 2018-01-18 Facial expression editing method based on generative adversarial network

Country Status (1)

Country Link
CN (1) CN108171770B (en)



Also Published As

Publication number Publication date
CN108171770A (en) 2018-06-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant