CN108171770B - Facial expression editing method based on generative adversarial network - Google Patents
- Publication number
- CN108171770B CN108171770B CN201810048098.6A CN201810048098A CN108171770B CN 108171770 B CN108171770 B CN 108171770B CN 201810048098 A CN201810048098 A CN 201810048098A CN 108171770 B CN108171770 B CN 108171770B
- Authority
- CN
- China
- Prior art keywords
- face
- picture
- generator
- expression
- discriminator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/001—Texturing; Colouring; Generation of texture or colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a facial expression editing method based on a generative adversarial network, comprising the following steps. In the data preparation stage, face images are manually labeled and cropped. In the model design stage, the model is formed from a generator and a discriminator. In the model training stage, real labeled face pictures and pictures produced by the generator are fed to the discriminator, which is trained to distinguish the distributions of real and generated samples while learning the distributions of facial expression and facial identity information; the face picture to be edited and an expression control vector are fed to the generator, which outputs a face picture controlled by that vector; the generator is then trained against the trained discriminator. These steps are repeated until the model is built, and the constructed model is then tested on input images. The invention enables the generator to produce face pictures closer to the real face picture distribution, preserves facial identity information better, and edits facial expression more effectively.
Description
Technical Field
The invention relates to an editing method, in particular to a facial expression editing method based on a generative adversarial network, and belongs to the technical field of computer vision.
Background
Facial expression editing requires controlling the expression of a face in a photograph while preserving the person's identity. The technique has wide application in facial animation, social software, and data-set augmentation for face recognition. Current facial expression editing methods are based on three-dimensional deformable face models. A representative method is the facial expression editing method based on a single camera and motion-capture data (patent number 201310451508.9), whose main technical means are: generate a three-dimensional face model of a user from a picture of the user, decouple the model to separate identity from expression, then synthesize a new three-dimensional face model by controlling the expression components, thereby realizing facial expression editing. The drawback of this method is that it is only suited to editing the expression of a three-dimensional face model, not of a two-dimensional face picture: when a facial expression changes, not only the face shape but also the surface texture of the face changes, and a method based on a three-dimensional face model has difficulty modifying facial texture.
Disclosure of Invention
To remedy these shortcomings, the invention provides a facial expression editing method based on a generative adversarial network.
To solve the technical problem, the invention adopts the following technical scheme. A facial expression editing method based on a generative adversarial network comprises the following overall steps:
step S1, data preparation phase
a. Manually label each face in an RGB image set with facial identity information and facial expression information; the annotation of each picture is represented as [i, j], where i indicates that the picture belongs to the i-th person and j indicates that the picture shows the j-th expression;
b. Crop the annotated faces out of the pictures with a face detector and a facial landmark detector, and align the faces;
step S2, model design phase
a. The model consists of two parts, a generator G and a discriminator D. The generator G takes a face picture to be edited and an expression control vector as input and produces a face picture controlled by that vector. The discriminator D takes pictures produced by the generator G and real labeled face pictures, distinguishes the distributions of real and generated samples, and learns the distributions of facial expression and facial identity information;
b. The generator G and the discriminator D together form a facial expression editing framework based on a generative adversarial network, on which adversarial training is carried out;
step S3, model training phase
a. Feed real labeled face pictures and pictures produced by the generator G to the discriminator D, and train D to distinguish the distributions of real and generated samples and to learn the distributions of facial expression and facial identity information; a picture produced by the generator G is labeled fake [0], and a real labeled face picture is labeled real [1, i, j];
b. Feed the face picture img0[i, j] to be edited and the expression control vector y to the generator G, which outputs a face picture controlled by the vector. Then feed this generated picture, labeled fake [0], to the discriminator D so that D is pushed to judge it real [1] with facial identity i and facial expression j, which drives the generator G to produce face pictures that are more realistic, preserve identity better, and follow the expression control more effectively;
c. Repeat step a three times and then step b once, so that the discriminator D is trained more thoroughly; the better trained D is, the more it benefits the training of the generator G;
d. Save the model parameters once per epoch, edit facial expressions on the test set, and inspect the pictures output by the generator G; stop training when G produces face pictures that meet the requirements, and keep the model parameters that give the best visual effect;
step S4, model testing phase
a. The input is an image I containing a human face;
b. Feed the image I to a face detector to obtain the face position, crop I at that position to obtain the face picture img0, and align the face in img0;
c. Feed the aligned face picture and the expression control vector to the generator G to obtain the expression-edited face picture img1.
The invention uses a distinctive fully convolutional network as the generator and makes the discriminator perform real/fake classification, facial expression classification, and facial identity classification simultaneously, ensuring that the generator produces face pictures closer to the real face picture distribution, preserves facial identity information better, and edits facial expression more effectively.
Drawings
FIG. 1 is a diagram of a model design architecture of the present invention.
Fig. 2 is a schematic overall flowchart of step S4.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
A facial expression editing method based on a generative adversarial network comprises the following overall steps:
step S1, data preparation phase
a. Manually label each face in an RGB image set with facial identity information and facial expression information; the annotation of each picture is represented as [i, j], where i indicates that the picture belongs to the i-th person (0 ≤ i < N) and j indicates that the picture shows the j-th expression (0 ≤ j < M); the whole picture set contains N persons and M expressions;
b. Crop the annotated faces out of the pictures with a face detector and a facial landmark detector, and align the faces;
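The labeling scheme above can be sketched in a few lines. This is an illustrative example, not code from the patent; the dataset sizes N and M and the one-hot encoding of the expression index are assumptions.

```python
# Illustrative sketch of the [i, j] annotation described above.
# N people and M expressions are assumed sizes, not values from the patent.
N, M = 10, 7

def make_label(i: int, j: int) -> list:
    """Return the [i, j] annotation for one face picture."""
    assert 0 <= i < N and 0 <= j < M, "indices must lie in the stated ranges"
    return [i, j]

def one_hot(index: int, length: int) -> list:
    """One-hot encoding, a common way to feed an index to a network."""
    v = [0.0] * length
    v[index] = 1.0
    return v

label = make_label(3, 5)   # picture of person 3 showing expression 5
y = one_hot(label[1], M)   # a possible expression control vector y
```

A real pipeline would attach such a label to every cropped, aligned face picture before training.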
step S2, model design phase
a. The model consists of two parts, a generator G and a discriminator D. The generator G takes a face picture to be edited and an expression control vector as input and produces a face picture controlled by that vector. The discriminator D takes pictures produced by the generator G and real labeled face pictures, distinguishes the distributions of real and generated samples, and learns the distributions of facial expression and facial identity information; the overall framework of the model is shown in FIG. 1;
b. The generator G and the discriminator D together form a facial expression editing framework based on a generative adversarial network, on which adversarial training is carried out; the network structures of the generator G and the discriminator D are shown in Tables 1 and 2, respectively.
Table 1. Network structure of the generator G
Generator G |
Aligned color face picture, 128 × 128 × 3 |
3 × 3 convolution, batch normalization, exponential linear unit (ELU) activation; 3 × 3 convolution, batch normalization, ELU activation |
2 × 2 max pooling |
3 × 3 convolution, batch normalization, ELU activation; 3 × 3 convolution, batch normalization, ELU activation |
2 × 2 max pooling |
3 × 3 convolution, batch normalization, ELU activation; 3 × 3 convolution, batch normalization, ELU activation |
2 × 2 nearest-neighbor upsampling |
3 × 3 convolution, batch normalization, ELU activation; 3 × 3 convolution, batch normalization, ELU activation |
2 × 2 nearest-neighbor upsampling |
3 × 3 convolution, batch normalization, ELU activation; 3 × 3 convolution, batch normalization, ELU activation |
3 × 3 convolution, hyperbolic tangent (tanh) activation |
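As a sanity check on Table 1, the spatial dimensions can be walked through the encoder-decoder: two 2 × 2 poolings halve 128 down to 32, and two 2 × 2 upsamplings restore 128. The sketch below assumes the 3 × 3 convolutions are zero-padded so they preserve spatial size, which the table does not state.

```python
# Shape walk-through of the generator in Table 1 (padding is an assumption).
def conv3x3(h, w):      # padded 3x3 conv: spatial size unchanged
    return h, w

def pool2x2(h, w):      # 2x2 max pooling halves each dimension
    return h // 2, w // 2

def upsample2x2(h, w):  # 2x2 nearest-neighbor upsampling doubles each dimension
    return h * 2, w * 2

shape = (128, 128)                   # aligned input picture, 128 x 128 x 3
for stage in (conv3x3, pool2x2,      # encoder: two conv+pool stages
              conv3x3, pool2x2,
              conv3x3,               # bottleneck at 32 x 32
              upsample2x2, conv3x3,  # decoder: two upsample+conv stages
              upsample2x2, conv3x3,
              conv3x3):              # final conv + tanh output
    shape = stage(*shape)

print(shape)  # (128, 128): the output matches the input resolution
```

This is consistent with the generator being fully convolutional: the output picture has the same 128 × 128 resolution as the input.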
Table 2. Network structure of the discriminator D
Discriminator D |
Aligned color face picture, 128 × 128 × 3 |
3 × 3 convolution, batch normalization, leaky rectified linear unit (LeakyReLU) activation (slope 0.02) |
3 × 3 convolution (stride 2), batch normalization, LeakyReLU activation (slope 0.02) |
3 × 3 convolution (stride 2), batch normalization, LeakyReLU activation (slope 0.02) |
3 × 3 convolution (stride 2), batch normalization, LeakyReLU activation (slope 0.02) |
3 × 3 convolution (stride 2), batch normalization, LeakyReLU activation (slope 0.02) |
Global average pooling |
Fully connected layer |
Real/fake classification; expression classification; identity classification |
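Similarly for Table 2, the four stride-2 convolutions shrink the 128 × 128 input to 8 × 8 before global average pooling. The sketch below assumes padding 1 for the strided 3 × 3 convolutions (not stated in the table), and the head sizes N = 10 and M = 7 are placeholder dataset sizes.

```python
# Spatial-size walk-through of the discriminator in Table 2.
def strided_conv(size):
    """3x3 convolution with stride 2, padding 1 (padding is an assumption)."""
    return (size + 2 * 1 - 3) // 2 + 1

sizes = [128]                 # aligned input picture
for _ in range(4):            # the four stride-2 convolutions
    sizes.append(strided_conv(sizes[-1]))
# sizes == [128, 64, 32, 16, 8]; global average pooling then reduces the
# 8x8 maps to one value per channel before the fully connected layer.

N, M = 10, 7                  # assumed numbers of identities and expressions
heads = {"real_fake": 1,      # real/fake classification
         "expression": M,     # expression classification
         "identity": N}       # identity classification
```

The three output heads correspond to the last row of Table 2: one fully connected layer feeding real/fake, expression, and identity classifiers.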
Step S3, model training phase
a. Feed real labeled face pictures and pictures produced by the generator G to the discriminator D, and train D to distinguish the distributions of real and generated samples and to learn the distributions of facial expression and facial identity information. A picture produced by the generator G is labeled fake [0], and a real labeled face picture is labeled real [1, i, j]; the identity label i and the expression label j both come from the real labeled face pictures;
b. Feed the face picture img0[i, j] to be edited and the expression control vector y to the generator G, which outputs a face picture controlled by the vector. Feed this generated picture, labeled fake [0], to the discriminator D so that D is pushed to judge it real [1] with facial identity i and facial expression j. In this way the generator G learns to produce more realistic face pictures, to preserve identity information well, and to follow the expression control more effectively;
c. Repeat step a three times and then step b once, so that the discriminator D is trained more thoroughly; the better trained D is, the more it benefits the training of the generator G;
d. Save the model parameters once per epoch (1 epoch equals one pass of training over all samples in the picture set) and edit facial expressions on the test set, inspecting the pictures output by the generator G. Stop training when G produces face pictures that meet the requirements, and keep the model parameters that give the best visual effect.
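The 3:1 alternation of steps a and b can be sketched as a loop with stand-in update functions; the real gradient steps on D and G are omitted, so only the schedule itself is being shown.

```python
# Sketch of the 3:1 training schedule: three discriminator updates for
# every generator update. The update functions are stand-ins.
d_steps, g_steps = 0, 0

def train_discriminator():
    """Stand-in for one D step on real [1, i, j] and fake [0] batches."""
    global d_steps
    d_steps += 1

def train_generator():
    """Stand-in for one G step that tries to make D output [1, i, j]."""
    global g_steps
    g_steps += 1

batches_per_epoch = 12            # assumed, for illustration
for batch in range(batches_per_epoch):
    train_discriminator()
    if (batch + 1) % 3 == 0:      # one G step after every three D steps
        train_generator()

print(d_steps, g_steps)           # 12 discriminator steps, 4 generator steps
```

In a real run, model parameters would also be checkpointed once per epoch, as step d describes.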
Step S4, model testing phase
a. The input is an image I containing a human face;
b. Feed the image I to a face detector to obtain the face position, crop I at that position to obtain the face picture img0, and align the face in img0;
c. Feed the aligned face picture and the expression control vector to the generator G to obtain the expression-edited face picture img1; the overall flow of this step is shown in FIG. 2.
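The test pipeline of step S4 can be sketched end to end with stubbed components. `detect_face`, `align`, and `generator_G` below are hypothetical stand-ins for a real detector, landmark-based aligner, and the trained generator; only the numpy cropping is concrete.

```python
# Sketch of S4: detect -> crop -> align -> generate (components stubbed).
import numpy as np

def detect_face(image):
    """Stand-in detector: returns a fixed (top, left, height, width) box."""
    return 16, 16, 128, 128

def crop(image, box):
    t, l, h, w = box
    return image[t:t + h, l:l + w]          # numpy slicing does the crop

def align(face):
    """Stand-in for landmark-based alignment; returns the face unchanged."""
    return face

def generator_G(face, y):
    """Stand-in for the trained generator; shape-preserving."""
    return face.copy()

I = np.zeros((160, 160, 3), dtype=np.float32)  # input image I with a face
img0 = align(crop(I, detect_face(I)))          # aligned 128 x 128 x 3 crop
y = np.eye(7)[2]                               # expression control vector
img1 = generator_G(img0, y)                    # expression-edited picture
```

Swapping in a real face detector and the trained G turns this skeleton into the actual test procedure.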
Compared with the prior art, the invention has the following key points and advantages:
First, regarding the generator G network: 1) the activation function is the exponential linear unit, and the upsampling layers use 2 × 2 nearest-neighbor upsampling; 2) the expression control information is added at the input end: it passes through a fully connected layer and a reshaping operation and is concatenated with the input face picture to form a four-channel tensor.
Beneficial effects: the exponential linear unit increases the nonlinearity of the network, giving it stronger nonlinear fitting capacity; compared with a deconvolution operation, 2 × 2 nearest-neighbor upsampling produces better-looking pictures; and adding the expression control information at the input end lets the face picture information and the expression control information interact earlier, yielding better control.
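The input-side conditioning in point 2) can be sketched as follows. The layer weights are random and the expression count M = 7 is an assumption, so only the tensor shapes are being illustrated, not the trained mapping.

```python
# Sketch of the four-channel conditioning: expression vector -> fully
# connected layer -> reshape to a 128 x 128 plane -> concatenate with RGB.
import numpy as np

rng = np.random.default_rng(0)
M = 7                                    # assumed number of expressions
W = rng.standard_normal((M, 128 * 128))  # fully connected layer weights

y = np.eye(M)[4]                         # expression control vector (one-hot)
plane = (y @ W).reshape(128, 128, 1)     # reshape to a single channel
img = rng.standard_normal((128, 128, 3)) # aligned face picture

x = np.concatenate([img, plane], axis=-1)  # four-channel generator input
print(x.shape)  # (128, 128, 4)
```

The generator's first convolution then sees the picture and the control signal jointly, which is the "earlier interaction" the text credits for better control.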
Second, regarding the discriminator D network: the discriminator D not only distinguishes the distributions of real and generated samples but also learns the distributions of facial expression and facial identity information.
Beneficial effects: the discriminator D performs real/fake classification, facial expression classification, and facial identity classification simultaneously, ensuring that the generator G produces face pictures closer to the real face picture distribution, preserves facial identity information better, and edits facial expression more effectively.
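One plausible reading of this three-way classification is a summed loss over the three heads. The sketch below uses random stand-in logits and assumes equal weighting of the terms, which the patent does not specify.

```python
# Sketch of a combined discriminator objective: real/fake + expression +
# identity terms, with numpy cross entropies and stand-in logits.
import numpy as np

def softmax_ce(logits, target):
    """Cross entropy of a softmax over `logits` against class `target`."""
    z = logits - logits.max()                  # numerically stable softmax
    log_p = z - np.log(np.exp(z).sum())
    return -log_p[target]

def binary_ce(logit, target):
    """Binary cross entropy; target is 1 for real, 0 for fake."""
    p = 1.0 / (1.0 + np.exp(-logit))
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

rng = np.random.default_rng(0)
rf_logit = rng.standard_normal()       # real/fake head output
expr_logits = rng.standard_normal(7)   # expression head (M = 7 assumed)
id_logits = rng.standard_normal(10)    # identity head (N = 10 assumed)

# For a real picture labelled [1, i, j], all three terms apply:
loss = (binary_ce(rf_logit, 1)
        + softmax_ce(expr_logits, 3)   # expression label j = 3
        + softmax_ce(id_logits, 6))    # identity label i = 6
```

The generator's loss would reuse the same expression and identity terms on its own outputs, which is how the three-way discriminator steers G.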
Third, the adversarial training process: 1) during adversarial training, the discriminator D distinguishes the distributions of real and generated samples and learns the distributions of facial expression and facial identity information; the generator G, on one hand, tries to deceive the discriminator D so as to reduce the difference between generated and real samples; on the other hand, it tries to make D judge the picture produced by G as showing the same person as G's input picture, with an expression matching the expression control vector.
Beneficial effects: the generator G produces face pictures closer to the real face picture distribution, preserves facial identity information better, and edits facial expression more effectively.
The above embodiments do not limit the present invention, and the invention is not restricted to the above examples; variations, modifications, additions, or substitutions made by those skilled in the art within the technical scope of the invention also fall within its scope of protection.
Claims (1)
1. A facial expression editing method based on a generative adversarial network, characterized in that the method comprises the following overall steps:
step S1, data preparation phase
a1, manually label each face in an RGB image set with facial identity information and facial expression information; the annotation of each picture is represented as [i, j], where i indicates that the picture belongs to the i-th person and j indicates that the picture shows the j-th expression;
b1, crop the annotated faces out of the pictures with a face detector and a facial landmark detector, and align the faces;
step S2, model design phase
a2, the model consists of two parts, a generator G and a discriminator D; the generator G takes a face picture to be edited and an expression control vector as input and produces a face picture controlled by that vector; the discriminator D takes pictures produced by the generator G and real labeled face pictures, distinguishes the distributions of real and generated samples, and learns the distributions of facial expression and facial identity information;
b2, the generator G and the discriminator D together form a facial expression editing framework based on a generative adversarial network, on which adversarial training is carried out;
step S3, model training phase
a3, feed real labeled face pictures and pictures produced by the generator G to the discriminator D, and train D to distinguish the distributions of real and generated samples and to learn the distributions of facial expression and facial identity information; a picture produced by the generator G is labeled fake [0], and a real labeled face picture is labeled real [1, i, j];
b3, feed the face picture img0[i, j] to be edited and the expression control vector y to the generator G, which outputs a face picture controlled by the vector; then feed this generated picture, labeled fake [0], to the discriminator D so that D is pushed to judge it real [1] with facial identity i and facial expression j, which drives the generator G to produce face pictures that are more realistic, preserve identity better, and follow the expression control more effectively;
c3, repeat step a3 three times and then step b3 once, training the discriminator D and the generator G;
d3, save the model parameters once per epoch, edit facial expressions on the test set, and inspect the pictures output by the generator G; stop training when G produces face pictures that meet the requirements; meanwhile, keep the model parameters used to generate the current face pictures;
step S4, model testing phase
a4, the input is an image I containing a human face;
b4, feed the image I to a face detector to obtain the face position, crop I at that position to obtain the face picture img0, and align the face in img0;
c4, feed the aligned face picture and the expression control vector to the generator G to obtain the expression-edited face picture img1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810048098.6A CN108171770B (en) | 2018-01-18 | 2018-01-18 | Facial expression editing method based on generative adversarial network
Publications (2)
Publication Number | Publication Date |
---|---|
CN108171770A CN108171770A (en) | 2018-06-15 |
CN108171770B (en) | 2021-04-06
Family
ID=62514820
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108171770B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||