CN111783658A - Two-stage expression animation generation method based on dual generative adversarial networks - Google Patents

Two-stage expression animation generation method based on dual generative adversarial networks

Info

Publication number
CN111783658A
Authority
CN
China
Prior art keywords: expression, stage, image, target, network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010621885.2A
Other languages
Chinese (zh)
Other versions
CN111783658B (en)
Inventor
郭迎春
王静洁
刘依
朱叶
郝小可
于洋
师硕
阎刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei University of Technology
Original Assignee
Hebei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2020-07-01
Filing date: 2020-07-01
Publication date: 2020-10-16
Application filed by Hebei University of Technology
Priority to CN202010621885.2A
Publication of CN111783658A
Application granted
Publication of CN111783658B
Active legal status (Current)
Anticipated expiration of legal status

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/174 - Facial expression recognition
    • G06V 40/176 - Dynamic expression
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00 - Animation
    • G06T 13/20 - 3D [Three Dimensional] animation
    • G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 - Feature extraction; Face representation
    • G06V 40/171 - Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a two-stage expression animation generation method based on dual generative adversarial networks. In the first stage, an expression migration network, FaceGAN, extracts the expression features in a target expression profile and migrates them to the source face, generating a first-stage prediction image. In the second stage, a detail generation network, FineGAN, supplements and enriches the details of the eye and mouth regions, which contribute most to expression change, in the first-stage prediction image, generating a fine-grained second-stage prediction image; the predicted frames are then synthesized into a facial video animation. Both the expression migration network FaceGAN and the detail generation network FineGAN are implemented as generative adversarial networks. The method generates the expression animation with adversarial networks in two stages: the expression is converted in the first stage and the image details are optimized in the second stage. Designated regions of the image are extracted through mask vectors for targeted optimization, and local discriminators are used in combination so that the important parts are generated with better quality.

Description

Two-stage expression animation generation method based on dual generative adversarial networks
Technical Field
The technical solution of the invention relates to image data processing in computer vision, in particular to a two-stage expression animation generation method based on dual generative adversarial networks.
Background
Facial expression synthesis refers to transferring an expression from a target expression reference face to a source face: the identity information of the newly synthesized source face image remains unchanged, while its expression is kept consistent with the target expression reference face. The technology is gradually being applied in fields such as film and television production, virtual reality and criminal investigation. Facial expression synthesis has important research value in both academia and industry, and robustly synthesizing natural and vivid facial expressions has become a challenging hot research topic.
Existing facial expression synthesis methods fall into two categories: traditional graphics methods and image generation methods based on deep learning. The first category, traditional graphics methods, generally parameterizes the source face image with a parametric model and designs a model to perform the expression conversion and generate a new image, or warps the face image using feature correspondences and optical flow maps to assemble face patches from existing expression data; however, designing such models is tedious and complicated, incurs very expensive computation, and generalizes poorly.
The second category is expression synthesis based on deep learning. These methods first extract facial features with a deep neural network, mapping the image from a high-dimensional space to a feature vector; the source expression features are then changed by adding an expression label, and the deep neural network synthesizes the target face image, mapping it back to the high-dimensional space. The appearance of GAN networks then brought new promise for clear image synthesis, and they attracted great attention as soon as they were proposed. In the field of image synthesis, a large number of research methods such as GAN variants have been introduced to generate images. For example, a conditional generative adversarial network (CGAN) can generate an image under specific supervision information; in the field of facial expression generation, an expression label can serve as the conditional supervision information to generate face images with different expressions. At present, GAN-based methods still have some shortcomings: when generating expression animation, unreasonable artifacts, blurred generated images, or low resolution may occur.
Facial expression generation is image-to-image conversion, whereas the invention aims to generate facial animation, which is image-to-video conversion and, compared with the facial expression generation task, adds the challenge of the time dimension. Xing et al. use a gender-preserving network in GP-GAN for synthesizing faces from landmarks so that the network can learn more gender information, but this method is still deficient in preserving face identity information, which may cause the generated face to have identity characteristics different from the target face. CN108288072A discloses a facial expression synthesis method based on a generative adversarial network that does not consider fine-grained generation of the face image and omits the extraction of detail features from the source face image, so the generated results are blurry and of low resolution. CN110084121A discloses a method for facial expression migration with a cycle-consistent generative adversarial network based on spectral normalization; the method supervises the training process with expression one-hot vectors, and the discreteness of the one-hot vector limits the learning ability of the network, so the network can only learn the expression of a target emotion, such as happiness, sadness or surprise, but cannot learn the degree of the emotion, and the method is deficient in the continuous generation of emotion. CN105069830A discloses an expression animation generation method and device that can only generate expression animations from six specified templates; since human expressions are very rich and complex, the method has poor extensibility and cannot generate an arbitrary specified expression animation according to user requirements. CN107944358A discloses a face generation method based on a deep convolutional adversarial network model that cannot guarantee the invariance of face identity information during expression generation, so the generated face may be inconsistent with the target face.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in the first stage, an expression migration network is used to extract the features of the target expression and migrate them to the source face to generate a first-stage prediction image; this first-stage expression migration network is named FaceGAN (Face Generative Adversarial Network). In the second stage, a detail generation network is used to enrich facial details in the first-stage prediction image, generate a fine-grained second-stage prediction image and synthesize the video animation; this second-stage detail generation network is named FineGAN (Fine Generative Adversarial Network). The method of the invention overcomes problems of the prior art such as blurred or low-resolution generated images and unreasonable artifacts in the generated results.
The technical solution adopted by the invention to solve the technical problem is as follows: in the first stage, driven by the target expression profile, the expression migration network FaceGAN captures the expression features in the target expression profile and migrates them to the source face to generate a first-stage prediction image; in the second stage, the detail generation network FineGAN acts as a supplement, enriching the details of the eye and mouth regions, which contribute relatively strongly to expression change, in the first-stage prediction image, generating a fine-grained second-stage prediction image and synthesizing the facial animation. The specific steps are as follows:
firstly, acquiring a facial expression profile of each frame of image in a data set:
collecting a facial expression video sequence data set, extracting the face in each frame image of a video sequence with the Dlib machine learning library, and at the same time obtaining a number of feature points for each face; the feature points are then connected in sequence with line segments to obtain the expression profile of each frame of the video sequence, recorded as e = (e_1, e_2, ..., e_i, ..., e_n), where e represents the set of all expression profiles in a video sequence, i.e., the expression profile sequence, n represents the number of video frames, and e_i represents the expression profile of the i-th frame of the video sequence;
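For illustration, the following is a minimal Python sketch of this first step. It assumes OpenCV, dlib and dlib's public 68-point model file "shape_predictor_68_face_landmarks.dat" are available; the exact order in which the patent connects the feature points is not specified, so the standard facial-part groups are connected here.

```python
# A minimal sketch of building one expression profile e_i from one video frame.
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

# Index ranges of the standard 68-point scheme (jaw, brows, nose, eyes, mouth).
PART_RANGES = [(0, 17), (17, 22), (22, 27), (27, 31), (31, 36),
               (36, 42), (42, 48), (48, 60), (60, 68)]

def expression_contour(frame_bgr):
    """Return a black image with the facial expression contour drawn as line segments."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)
    contour = np.zeros_like(gray)
    if len(faces) == 0:
        return contour
    shape = predictor(gray, faces[0])
    pts = np.array([[shape.part(i).x, shape.part(i).y] for i in range(68)], dtype=np.int32)
    for start, end in PART_RANGES:
        closed = start in (36, 42, 48, 60)   # eyes and mouth are closed curves
        cv2.polylines(contour, [pts[start:end]], closed, color=255, thickness=1)
    return contour

# e = (e_1, ..., e_n): one contour map per frame of a video sequence, e.g.
# e = [expression_contour(frame) for frame in video_frames]
```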
the first stage is to build an expression migration network faceGAN, and the method comprises the following steps:
secondly, extracting the identity characteristics of the source face and the expression characteristics of the target expression profile graph, and preliminarily generating a first-stage prediction graph:
The expression migration network FaceGAN comprises a generator G_1 and a discriminator D_1, where the generator G_1 consists of three sub-networks: two encoders, Enc_id and Enc_exp, and a decoder Dec_1.
First, a neutral, expressionless image I_N of the source face and a target expression profile sequence e are input. The identity encoder Enc_id extracts the identity feature vector f_id of the neutral expressionless source face image I_N, while the expression encoder Enc_exp extracts the set of expression feature vectors f_exp of the target expression profile sequence e, where f_exp = (f_exp_1, f_exp_2, ..., f_exp_i, ..., f_exp_n). In formulas:
f_id = Enc_id(I_N) (1),
f_exp_i = Enc_exp(e_i) (2),
The identity feature vector f_id and the expression feature vector f_exp_i of the i-th frame are concatenated to obtain a feature vector f, f = f_id + f_exp_i. The feature vector f is fed to the decoder Dec_1 and decoded to generate the first-stage prediction image I_pre-target, I_pre-target = Dec_1(f). Finally, I_pre-target is input to the discriminator D_1, which judges whether the image is real or fake;
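As an illustration of the data flow just described (two encoders, feature combination, decoder, first-stage prediction), the following is a simplified PyTorch sketch. The layer counts, channel widths and normalization choices are placeholders, the CBAM modules and skip connections described later are omitted, and the combination f = f_id + f_exp_i is interpreted here as channel-wise concatenation; none of these details are fixed by this passage.

```python
# Simplified sketch of the FaceGAN generator G_1 (illustrative only).
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
                         nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True))

def deconv_block(c_in, c_out):
    return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 4, stride=2, padding=1),
                         nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True))

class FaceGANGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc_id = nn.Sequential(conv_block(3, 64), conv_block(64, 128), conv_block(128, 256))
        self.enc_exp = nn.Sequential(conv_block(1, 64), conv_block(64, 128), conv_block(128, 256))
        self.dec = nn.Sequential(deconv_block(512, 256), deconv_block(256, 128),
                                 deconv_block(128, 64), nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())

    def forward(self, I_N, e_i):
        f_id = self.enc_id(I_N)              # identity features of the neutral source face
        f_exp = self.enc_exp(e_i)            # expression features of the target contour map
        f = torch.cat([f_id, f_exp], dim=1)  # feature combination (assumed concatenation)
        return self.dec(f)                   # first-stage prediction I_pre-target

# Example shapes: I_N is (B, 3, H, W), e_i is (B, 1, H, W).
# I_pre_target = FaceGANGenerator()(I_N, e_i)
```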
and thirdly, taking the prediction image of the first stage as input, and adopting the concept of cycleGAN to reconstruct a neutral image of the source face:
The first-stage prediction image I_pre-target and the expression profile e_N corresponding to the neutral expressionless image I_N from the second step are fed into FaceGAN again as input: the identity encoder Enc_id extracts the identity feature vector of the image I_pre-target, the expression encoder Enc_exp extracts the expression feature vector of the expression profile e_N, the processing of the second step is repeated, and the decoder produces the reconstructed image I_recon of I_N. The generation of the reconstructed image I_recon is expressed as:
I_recon = Dec_1(Enc_id(I_pre-target) + Enc_exp(e_N)) (3);
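A short sketch of this reconstruction pass follows, reusing the hypothetical FaceGANGenerator from the previous sketch; the L1 distance used to compare the reconstruction with the original neutral image is an assumption.

```python
# Sketch of the CycleGAN-style reconstruction pass of equation (3).
import torch
import torch.nn.functional as F

def cycle_reconstruction(G1, I_N, e_i, e_N):
    """G1 is any generator with signature G1(image, contour) -> image (e.g. FaceGANGenerator)."""
    I_pre_target = G1(I_N, e_i)            # first-stage prediction with the target expression
    I_recon = G1(I_pre_target, e_N)        # feed back with the neutral contour e_N
    recon_loss = F.l1_loss(I_recon, I_N)   # distance to the original neutral image (assumed L1)
    return I_pre_target, I_recon, recon_loss
```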
fourthly, calculating a loss function in the faceGAN of the first-stage expression migration network:
The loss function of the generator G_1 in the first-stage expression migration network FaceGAN is given by:
[Equations (4)-(9) appear only as images in the original publication: equation (4) is the total generator loss L_G1, the weighted sum of the terms below; (5) the generator adversarial loss; (6) the SSIM similarity loss; (7) the pixel loss; (8) the perceptual loss; (9) the reconstruction loss.]
where I_real is the target ground-truth image. Equation (5) is the adversarial loss of the generator, and D_1(·) denotes the probability, output by the discriminator D_1, that its input is real. The SSIM (structural similarity) function in equation (6) measures the similarity between two images. Equation (7) is the pixel loss, where the MAE (mean absolute error) function measures the difference between the ground-truth and predicted values. Equation (8) is the perceptual loss: the perceptual features of an image are extracted with VGG-19, the output of the last convolutional layer of the VGG-19 network is taken as the perceptual feature of the image, and the perceptual loss is computed between the generated image and the real image. Equation (9) is the reconstruction loss, which computes the distance between the neutral expressionless source face image I_N and its reconstructed image I_recon;
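Since equations (4)-(9) are only available as images, the following PyTorch sketch shows one plausible form of the generator loss terms described above; the exact formulations and the weight values are assumptions, and pytorch_msssim and torchvision's VGG-19 are used as stand-ins for the SSIM and perceptual features.

```python
# Hedged sketch of the FaceGAN generator loss terms (forms and weights are assumptions).
import torch
import torch.nn.functional as F
from pytorch_msssim import ssim
from torchvision.models import vgg19

# Feature extractor up to the activation of the last VGG-19 convolution.
vgg_features = vgg19(weights="IMAGENET1K_V1").features[:36].eval()

def perceptual(x, y):
    return F.l1_loss(vgg_features(x), vgg_features(y))

def generator_loss(D1_fake, I_pred, I_real, I_recon, I_N,
                   w_ssim=1.0, w_pix=10.0, w_perc=1.0, w_rec=10.0):   # weights are assumptions
    # D1_fake is assumed to be the probability output of D_1 for the generated image.
    l_adv  = F.binary_cross_entropy(D1_fake, torch.ones_like(D1_fake))  # eq. (5): fool D_1
    l_ssim = 1.0 - ssim(I_pred, I_real, data_range=1.0)                 # eq. (6): SSIM similarity
    l_pix  = F.l1_loss(I_pred, I_real)                                  # eq. (7): MAE pixel loss
    l_perc = perceptual(I_pred, I_real)                                 # eq. (8): VGG-19 perceptual
    l_rec  = F.l1_loss(I_recon, I_N)                                    # eq. (9): reconstruction
    return l_adv + w_ssim * l_ssim + w_pix * l_pix + w_perc * l_perc + w_rec * l_rec
```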
The loss function of the discriminator D_1 in the first-stage expression migration network FaceGAN is given by:
[Equations (10)-(12) appear only as images in the original publication: equation (10) is the total discriminator loss L_D1; (11) the adversarial loss; (12) the adversarial loss of the reconstructed image.]
Equation (11) is the adversarial loss and equation (12) is the adversarial loss of the reconstructed image, where λ_1 and λ_2 are the weight parameters of the similarity loss L_SSIM and the perceptual loss L_perceptual in the FaceGAN generator G_1, and λ_3 is the weight parameter of the adversarial loss of the reconstructed image in the FaceGAN discriminator loss;
and building a detail generation network FineGAN of the second stage, wherein the steps from the fifth step to the seventh step are as follows:
fifthly, local mask vectors adaptive to individuals are generated:
The plurality of feature points obtained for each face in the first step are used to extract the eye region I_eye and the mouth region I_mouth, and the eye mask vector M_eye and the mouth mask vector M_mouth are set up respectively. Taking the eyes as an example, the eye mask vector M_eye is constructed by setting the pixel values of the eye region in the image to 1 and the pixel values of all other regions to 0; the mouth mask vector M_mouth is constructed in the same way as the eye mask vector M_eye;
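A minimal sketch of constructing M_eye and M_mouth from the facial feature points follows, assuming the standard 68-point landmark indexing; the patent does not specify the exact region shape, so a convex-hull fill with OpenCV is used here.

```python
# Sketch of the eye and mouth mask vectors (assumes 68-point landmark indexing).
import numpy as np
import cv2

def part_mask(landmarks, idx_range, image_shape):
    """Binary mask: 1 inside the region spanned by the given landmark indices, 0 elsewhere."""
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    pts = np.asarray(landmarks[idx_range[0]:idx_range[1]], dtype=np.int32)
    cv2.fillConvexPoly(mask, cv2.convexHull(pts), 1)
    return mask.astype(np.float32)

def eye_and_mouth_masks(landmarks, image_shape):
    left_eye  = part_mask(landmarks, (36, 42), image_shape)
    right_eye = part_mask(landmarks, (42, 48), image_shape)
    M_eye   = np.clip(left_eye + right_eye, 0.0, 1.0)        # eye mask vector
    M_mouth = part_mask(landmarks, (48, 68), image_shape)    # mouth mask vector
    return M_eye, M_mouth

# I_eye   = image * M_eye[..., None]     # extract the eye region via Hadamard product
# I_mouth = image * M_mouth[..., None]   # extract the mouth region
```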
and sixthly, inputting the prediction graph of the first stage into a network of a second stage to carry out detail optimization:
The detail generation network FineGAN comprises a generator G_2 and a discriminator D_2, where D_2 consists of a global discriminator D_global and two local discriminators D_eye and D_mouth.
The first-stage prediction image I_pre-target and the neutral expressionless image I_N from the second step are input to the generator G_2, which generates a second-stage prediction image I_target containing more facial detail. The second-stage prediction image I_target is then fed to the three discriminators simultaneously: the global discriminator D_global performs a global judgment on I_target so that it is as close as possible to the target real image I_real, while the eye local discriminator D_eye and the mouth local discriminator D_mouth further emphasize the optimization of the eye and mouth regions of I_target, making the second-stage prediction image I_target more lifelike. The second-stage prediction image I_target is expressed as:
I_target = G_2(I_pre-target, I_N) (13);
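The wiring of this second stage can be sketched as follows. The generator and discriminator definitions are placeholders (a trivial generator and PatchGAN-style discriminators); only the connections follow the text: G_2 takes (I_pre-target, I_N), and one global plus two mask-based local discriminators score the result.

```python
# Sketch of the FineGAN data flow (network definitions are illustrative placeholders).
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    def __init__(self, c_in=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(c_in, 64, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 4, 1, 1), nn.Sigmoid())
    def forward(self, x):
        return self.net(x)

G2 = nn.Sequential(nn.Conv2d(6, 64, 3, 1, 1), nn.ReLU(True),
                   nn.Conv2d(64, 3, 3, 1, 1), nn.Tanh())
D_global, D_eye, D_mouth = PatchDiscriminator(), PatchDiscriminator(), PatchDiscriminator()

I_pre_target = torch.randn(1, 3, 128, 128)   # first-stage prediction
I_N          = torch.randn(1, 3, 128, 128)   # neutral source face
M_eye        = torch.zeros(1, 1, 128, 128)   # eye mask vector (see previous sketch)
M_mouth      = torch.zeros(1, 1, 128, 128)   # mouth mask vector

I_target = G2(torch.cat([I_pre_target, I_N], dim=1))   # equation (13)
score_global = D_global(I_target)                      # whole-image realism
score_eye    = D_eye(I_target * M_eye)                 # Hadamard-masked eye region
score_mouth  = D_mouth(I_target * M_mouth)             # Hadamard-masked mouth region
```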
and seventhly, calculating the loss functions in the second-stage FineGAN:
The loss function of the generator G_2 is given by:
[Equations (14)-(19) appear only as images in the original publication: equation (14) is the total generator loss L_G2; (15) the adversarial loss; (16) the pixel loss; (17) and (18) the local pixel losses of the eye and mouth regions; (19) the local perceptual loss.]
Equation (15) is the adversarial loss, including a global adversarial loss and local adversarial losses, and the operator ⊙ is the Hadamard product. Equation (16) is the pixel loss. Equations (17) and (18) are the local pixel losses, computing the L1 norm of the pixel difference between a local region of the generated image and the corresponding local region of the real image. Equation (19) is the local perceptual loss. The total loss function of the generator G_2 is the weighted sum of these loss functions;
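A small sketch of the mask-based local pixel loss of equations (17) and (18) follows: the pixel difference is restricted to a region through a Hadamard product with the mask and its L1 norm is taken; the normalization by the mask area is an assumption.

```python
# Sketch of the mask-based local pixel loss (normalization is an assumption).
import torch

def local_pixel_loss(I_target, I_real, mask):
    # The mask acts as a Hadamard product that keeps only the region of interest.
    diff = (I_target - I_real) * mask
    return diff.abs().sum() / mask.sum().clamp(min=1.0)

# L_eye   = local_pixel_loss(I_target, I_real, M_eye)
# L_mouth = local_pixel_loss(I_target, I_real, M_mouth)
```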
The loss function of the discriminator D_2 is given by:
[Equations (20)-(23) appear only as images in the original publication: equation (20) is the total discriminator loss L_D2; (21) the adversarial loss of the global discriminator; (22) and (23) the adversarial losses of the two local discriminators.]
Equation (21) is the adversarial loss of the global discriminator, and equations (22) and (23) are the adversarial losses of the local discriminators, where λ_4 and λ_5 are the weight parameters of the local adversarial losses in the FineGAN generator G_2, λ_6 and λ_7 are the weight parameters of the eye pixel loss L_eye and the mouth pixel loss L_mouth in the FineGAN generator G_2, λ_8 is the weight parameter of the local perceptual loss in the FineGAN generator G_2, and λ_9 is the weight parameter of the global adversarial loss in the FineGAN discriminator D_2;
and eighth step, synthesizing a video:
Each frame is generated independently; after the n frame images (I_target_1, I_target_2, ..., I_target_i, ..., I_target_n) have all been generated, the sequence of video frames is synthesized into the final facial animation;
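A minimal sketch of this final synthesis step with OpenCV follows; the codec, frame rate and file name are arbitrary choices not specified by the patent.

```python
# Sketch of writing the generated frames (I_target_1, ..., I_target_n) into a video file.
import cv2
import numpy as np

def frames_to_video(frames, path="face_animation.avi", fps=25):
    h, w = frames[0].shape[:2]
    writer = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*"XVID"), fps, (w, h))
    for frame in frames:
        writer.write(frame.astype(np.uint8))   # each frame: H x W x 3, BGR, uint8
    writer.release()
```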
This completes the two-stage expression animation generation based on dual generative adversarial networks: the expression in the face image is converted and the image details are optimized.
In particular, the identity encoder Enc_id comprises 4 convolutional blocks, with a CBAM attention module added to the first 3 blocks; the expression encoder Enc_exp comprises 3 convolutional blocks, with a CBAM attention module added to the last block; the decoder Dec_1 comprises 4 deconvolutional blocks, with a CBAM attention module added to the first 3 blocks. Skip connections link the encoder and decoder of the network: specifically, the output of layer 1 of the identity encoder Enc_id is connected to the input of the last layer of the decoder Dec_1, the output of layer 2 of Enc_id is connected to the input of the second-to-last layer of Dec_1, and the output of layer 3 of Enc_id is connected to the input of the third-to-last layer of Dec_1. The CBAM attention modules let the network focus on learning the important regions of the image, and the skip connections combine the high-level and low-level layers of the network so that it can learn low-level detail information such as facial texture.
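The following PyTorch sketch illustrates this skip-connection wiring; the CBAM stand-in implements channel attention only, channel widths are placeholders, and an extra stride-2 convolution is added to the expression branch purely to align spatial sizes for the concatenation, which the patent does not specify.

```python
# Illustrative sketch of the encoder-decoder skip connections described above.
import torch
import torch.nn as nn

class SimpleCBAM(nn.Module):
    """Channel-attention-only stand-in for the CBAM module."""
    def __init__(self, c):
        super().__init__()
        self.fc = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(c, c // 8, 1), nn.ReLU(True),
                                nn.Conv2d(c // 8, c, 1), nn.Sigmoid())
    def forward(self, x):
        return x * self.fc(x)

def down(c_in, c_out, cbam=True):
    layers = [nn.Conv2d(c_in, c_out, 3, 2, 1), nn.ReLU(True)]
    if cbam:
        layers.append(SimpleCBAM(c_out))
    return nn.Sequential(*layers)

def up(c_in, c_out, cbam=True):
    layers = [nn.ConvTranspose2d(c_in, c_out, 4, 2, 1), nn.ReLU(True)]
    if cbam:
        layers.append(SimpleCBAM(c_out))
    return nn.Sequential(*layers)

class FaceGANGeneratorSkip(nn.Module):
    def __init__(self):
        super().__init__()
        # identity encoder: 4 blocks, CBAM in the first 3
        self.id1, self.id2, self.id3 = down(3, 64), down(64, 128), down(128, 256)
        self.id4 = down(256, 256, cbam=False)
        # expression encoder: 3 blocks, CBAM in the last one
        # (the final stride-2 conv only aligns spatial sizes; it is an assumption)
        self.exp = nn.Sequential(down(1, 64, cbam=False), down(64, 128, cbam=False),
                                 down(128, 256, cbam=True), nn.Conv2d(256, 256, 3, 2, 1))
        # decoder: 4 blocks, CBAM in the first 3; inputs widened for the skip concatenations
        self.up1 = up(512, 256)                 # takes cat(f_id, f_exp)
        self.up2 = up(256 + 256, 128)           # + skip from id3
        self.up3 = up(128 + 128, 64)            # + skip from id2
        self.up4 = nn.Sequential(nn.ConvTranspose2d(64 + 64, 3, 4, 2, 1), nn.Tanh())  # + skip from id1

    def forward(self, I_N, e_i):
        s1 = self.id1(I_N); s2 = self.id2(s1); s3 = self.id3(s2); f_id = self.id4(s3)
        f_exp = self.exp(e_i)
        x = self.up1(torch.cat([f_id, f_exp], 1))
        x = self.up2(torch.cat([x, s3], 1))     # Enc_id layer-3 output -> third-to-last decoder input
        x = self.up3(torch.cat([x, s2], 1))     # Enc_id layer-2 output -> second-to-last decoder input
        return self.up4(torch.cat([x, s1], 1))  # Enc_id layer-1 output -> last decoder input
```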
In this two-stage expression animation generation method based on dual generative adversarial networks, GAN is the English abbreviation of the generative adversarial network model (Generative Adversarial Networks), a well-known algorithm in this technical field, and the Dlib library is a public library.
The beneficial effects of the invention, compared with the prior art, are as follows.
The significant improvements of the invention are:
(1) Compared with CN108288072A, the detail generation network of the method ensures fine-grained generation of the facial animation and optimizes the two important regions of the mouth and eyes, so the generated result is more vivid and natural.
(2) Compared with CN110084121A, the method uses the expression profile to supervise the learning process of the FaceGAN network, so that the network can learn the continuous progression of an expression, learn the degree of the emotion, and generate smooth facial animation.
(3) Compared with CN105069830A, the method uses the target expression profile to guide the network in learning the target expression; it is not restricted to a fixed set of expression types and can generate an expression animation of any emotion required by the user.
(4) Compared with CN107944358A, the method trains the model with the ring network structure of CycleGAN and adds skip connections in FaceGAN, ensuring that the identity information of the generated face is consistent with the source face.
(5) By using a global discriminator, local discriminators and local loss functions (equations (17) and (18)), the method ensures both the realism of the whole generated image and the refined generation of the two important regions, the eyes and the mouth.
(6) By adding attention modules to FaceGAN and a second-stage detail generation network, the method guarantees the generation of local image details and fine-grained expressions.
The prominent substantive features of the invention are:
1) The method generates the expression animation with adversarial networks in two stages: the expression is converted in the first stage and the image details are optimized in the second stage. A mask-based local loss function is proposed: the designated region of the image is extracted through a mask vector and given targeted optimization, and, combined with the local discriminators, the important parts are generated with better quality.
2) In this application, each frame of the video sequence is generated from the neutral image rather than from the preceding frame in a recursive manner, which avoids the propagation of errors from earlier frames to later frames and the resulting progressive degradation of later frames; in addition, this input mode requires the network to learn the larger change from the neutral expression to the other expressions, which increases the difficulty of model training. After the first-stage network generates the prediction image, the prediction image is fed into the network again and the source input image is reconstructed following the ring network idea of CycleGAN, which forces the network to retain the identity features without increasing the number of model parameters. The loss functions of the model include the adversarial loss, the SSIM similarity loss, the pixel loss, the VGG perceptual loss and the reconstruction loss. The second-stage network of this application includes a generator, a global discriminator and two local discriminators, with mask-based local discriminators and local loss functions added.
3) In FaceGAN, the method follows the idea of CycleGAN and feeds the image after expression conversion back into the network to reconstruct the source face image, forcing the network to keep the identity features of the face and change only the expression; meanwhile, in FaceGAN, a skip-connection structure fuses the high-level and low-level features of the network so that more face identity information can be learned from the low-level features; thus the identity information of the face is preserved while the expression is converted.
4) The invention proposes a detail optimization network, FineGAN, which focuses on the generation of image details and places emphasis on the important eye and mouth regions; appropriate weights are set to balance the pixel loss and the adversarial loss, and a perceptual loss is added to remove artifacts, so the generated image contains no unreasonable artifacts and the network produces high-quality, vivid images that are rich in detail and consistent with human vision.
5) The method has a relatively small number of network parameters and low space and time complexity; a single unified network can learn the migration of any expression type as well as the continuous change of emotion intensity, and the method has good prospects for use.
Drawings
The invention is further illustrated with reference to the following figures and examples.
FIG. 1 is a schematic block flow diagram of the method of the present invention.
In fig. 2, the odd rows are schematic diagrams of the facial feature points of the method of the present invention, and the even rows are facial expression contour diagrams.
Fig. 3 is a mask diagram of the present invention, wherein the first row is a face region image extracted after preprocessing the original data set, the second and fourth rows are visualizations of an eye mask vector and a mouth mask vector, respectively, and the third and fifth rows are partial region images extracted after applying the eye mask vector and the mouth mask vector to the source image.
FIG. 4 shows the results of 3 experiments of the invention, where the odd rows are the inputs to the method of the invention, consisting of a neutral image of the source face and a sequence of target expression profile images, and the even rows are the experimental results, i.e., the output video frame sequences of the expression animation.
Detailed Description
The embodiment shown in FIG. 1 shows that the two-stage expression animation generation method based on dual generative adversarial networks of the invention proceeds as follows:
acquiring the facial expression profile of each frame image in the data set → extracting the identity features of the source face and the expression features of the target expression profile, preliminarily generating the first-stage prediction image → taking the first-stage prediction image as input and reconstructing the neutral image of the source face following the idea of CycleGAN → calculating the loss functions in the first-stage FaceGAN → generating local mask vectors adapted to the individual → inputting the first-stage prediction image into the second-stage network for detail optimization → calculating the loss functions in the second-stage FineGAN → synthesizing the video.
Example 1
The two-stage expression animation generation method based on the double generation countermeasure network of the embodiment specifically comprises the following steps:
firstly, acquiring a facial expression profile of each frame of image in a data set:
collecting a facial expression video sequence data set, extracting the face in each frame image of a video sequence with the Dlib machine learning library, and at the same time obtaining 68 feature points for each face (in the expression migration field, 68 feature points outline the face contour and the eye, mouth and nose contours; 5 or 81 feature points can also be used), as shown in the odd rows of FIG. 2; the feature points are then connected in sequence with line segments to obtain the expression profile of each frame of the video sequence, as shown in the even rows of FIG. 2, recorded as e = (e_1, e_2, ..., e_i, ..., e_n), where e represents the set of all facial expression profiles in a video sequence, n represents the number of video frames, and e_i represents the facial expression profile of the i-th frame of the video sequence;
the first stage is to build an expression migration network faceGAN, and the method comprises the following steps:
secondly, extracting the identity characteristics of the source face and the expression characteristics of the target expression profile graph, and preliminarily generating a first-stage prediction graph:
FaceGAN includes a generator G_1 and a discriminator D_1, where the generator G_1 consists of three sub-networks: two encoders, Enc_id and Enc_exp, and a decoder Dec_1.
First, a neutral expressionless image I_N of the source face and a target expression profile sequence e are input. In this embodiment the input is the neutral face of user S010, the target expression profile sequence covers the progression from a neutral, expressionless face to a smile, and the expression profile extracted from the neutral expressionless image I_N is denoted e_N; the specific inputs are shown in the first row of FIG. 4. The identity encoder Enc_id then extracts the identity feature vector f_id of user S010, while the expression encoder Enc_exp extracts the set of expression feature vectors f_exp of the target expression profile sequence, where f_exp = (f_exp_1, f_exp_2, ..., f_exp_i, ..., f_exp_n). In formulas:
f_id = Enc_id(I_N) (1),
f_exp_i = Enc_exp(e_i) (2),
The identity feature vector f_id and the expression feature vector f_exp_i of the i-th frame are concatenated to obtain a feature vector f, f = f_id + f_exp_i. The feature vector f is fed to the decoder Dec_1 and decoded to generate the first-stage prediction image I_pre-target, I_pre-target = Dec_1(f). Finally, I_pre-target is input to the discriminator D_1, which judges whether the image is real or fake;
and thirdly, taking the prediction image of the first stage as input, and adopting the concept of cycleGAN to reconstruct a neutral image of the source face:
The first-stage prediction image I_pre-target and the expression profile e_N extracted from the neutral expressionless image I_N in the second step are fed into FaceGAN again as input, the second step is repeated, and the reconstructed image I_recon of user S010's neutral expression is generated. The generation of I_recon is expressed as:
I_recon = Dec_1(Enc_id(I_pre-target) + Enc_exp(e_N)) (3);
fourthly, calculating a loss function in the FaceGAN in the first stage:
The loss function of the generator G_1 in the first-stage FaceGAN described above is given by:
[Equations (4)-(9) appear only as images in the original publication: equation (4) is the total generator loss L_G1, the weighted sum of the terms below; (5) the generator adversarial loss; (6) the SSIM similarity loss; (7) the pixel loss; (8) the perceptual loss; (9) the reconstruction loss.]
where I_real is the target ground-truth value (the ground truth, i.e., the source face image bearing the target expression, which is the real image corresponding to the model's final prediction), here the real image of user S010 smiling. Equation (5) is the adversarial loss of the generator. The SSIM (structural similarity) function in equation (6) measures the similarity between two images. Equation (7) is the pixel loss, where the MAE (mean absolute error) function measures the difference between the ground-truth and predicted values. Equation (8) is the perceptual loss, with the perceptual features of the images extracted by VGG-19. Equation (9) is the reconstruction loss, which computes the distance between the neutral expressionless image I_N and the reconstructed image I_recon. The loss function of the generator G_1 is the weighted sum of the loss functions of the individual parts;
The loss function of the discriminator D_1 in the first-stage FaceGAN is given by:
[Equations (10)-(12) appear only as images in the original publication: equation (10) is the total discriminator loss L_D1; (11) the adversarial loss; (12) the adversarial loss of the reconstructed image.]
Equation (11) is the adversarial loss, and equation (12) is the adversarial loss of the reconstructed image;
the identity encoder EncidThe system comprises 4 layers of convolution blocks, and a CBAM attention module is added into the first 3 layers of convolution blocks; expression encoder EncexpIncluding 3 layers of convolution blocks, adding CBAM attention module and decoder Dec in last layer of convolution block1The system comprises 4 layers of deconvolution blocks, a CBAM attention module is added in the first 3 layers of convolution blocks, and the high layer and the low layer of the network are connected by using a jump connectionThe specific way is to use an identity encoder EncidLayer 1 output and decoder Dec1The input of the last 1 layer is connected with an identity encoder EncidLayer 2 output and decoder Dec1The input of the 2 nd last layer is connected with an identity encoder EncidLayer 3 output and decoder Dec1The inputs of the last 3 layers are connected and the convolution kernel size in this patent is 3 × 3.
And building a detail generation network FineGAN of the second stage, wherein the steps from the fifth step to the seventh step are as follows:
fifthly, local mask vectors adaptive to individuals are generated:
The 68 feature points obtained for each face in the first step are used to extract the eye region I_eye and the mouth region I_mouth. First, the eye mask vector M_eye and the mouth mask vector M_mouth are set up respectively, as shown in the second and fourth rows of FIG. 3. Taking the eyes as an example, M_eye is constructed by setting the pixel values of the eye region in the image to 1 and the pixel values of all other regions to 0; the mouth mask vector M_mouth is constructed in the same way as M_eye;
and sixthly, inputting the prediction graph of the first stage into a network of a second stage to carry out detail optimization:
FineGAN includes a generator G_2 and a discriminator D_2, where D_2 consists of a global discriminator D_global and two local discriminators D_eye and D_mouth.
The first-stage prediction image I_pre-target and the neutral expressionless image I_N from the second step are input to the generator G_2, which generates a second-stage prediction image I_target containing more facial detail of user S010. I_target is then fed to the three discriminators simultaneously: the global discriminator D_global makes a global judgment on the generated I_target so that it is as close as possible to the real image I_real of user S010 smiling, while the eye local discriminator D_eye and the mouth local discriminator D_mouth further emphasize the optimization of the eye and mouth regions of I_target, making the generated image I_target more realistic. In formulas:
I_target = G_2(I_pre-target, I_N) (13);
and seventhly, calculating the loss functions in the second-stage FineGAN:
The loss function of the generator G_2 is given by:
[Equations (14)-(19) appear only as images in the original publication: equation (14) is the total generator loss L_G2; (15) the adversarial loss; (16) the pixel loss; (17) and (18) the local pixel losses of the eye and mouth regions; (19) the local perceptual loss.]
Equation (15) is the adversarial loss, including a global adversarial loss and local adversarial losses, and the operator ⊙ is the Hadamard product. Equation (16) is the pixel loss. Equations (17) and (18) are the local pixel losses, computing the L1 norm of the pixel difference between a local region of the generated image and the corresponding local region of the real image. Equation (19) is the local perceptual loss. The total loss function of the generator is the weighted sum of these loss functions;
The loss function of the discriminator D_2 is given by:
[Equations (20)-(23) appear only as images in the original publication: equation (20) is the total discriminator loss L_D2; (21) the adversarial loss of the global discriminator; (22) and (23) the adversarial losses of the two local discriminators.]
Equation (21) is the adversarial loss of the global discriminator, and equations (22) and (23) are the adversarial losses of the local discriminators;
and eighth step, synthesizing a video:
Each frame is generated independently; after the n frame images (I_target_1, I_target_2, ..., I_target_i, ..., I_target_n) have all been generated, i.e., the gradual expression change of user S010 from no expression to a smile has been produced, the sequence of video frames is synthesized into the facial animation of user S010, as shown in the second row of FIG. 4;
This completes the two-stage expression animation generation based on dual generative adversarial networks: the expression in the face image is converted and the image details are optimized.
In this embodiment, the weight parameter settings used in the above steps are shown in Table 1, and good results are obtained on the whole sample database.
Table 1. Weight parameter settings for each loss in this example
[The table appears only as an image in the original publication; its values are not recoverable from the text.]
In this two-stage expression animation generation method based on dual generative adversarial networks, GAN is the English abbreviation of the generative adversarial network model (Generative Adversarial Networks), a well-known algorithm in this technical field.
FIG. 4 shows the results of 3 embodiments of the invention. The second row is the generated video frame sequence of user S010 going from a neutral expression to a smile, the fourth row is the generated sequence of user S022 going from a neutral expression to a surprised open mouth, and the sixth row is the generated sequence of user S032 going from a neutral expression to a downward-left mouth movement. FIG. 4 shows that the method of the invention can complete the expression migration while retaining the face identity information, and can generate a continuously and gradually changing video frame sequence, synthesizing an animation video with the specified identity and the specified expression.
Matters not described in detail in this specification belong to the prior art.

Claims (6)

1. A two-stage expression animation generation method based on dual generative adversarial networks, characterized in that, in the first stage, an expression migration network FaceGAN is used to extract the expression features in a target expression profile and migrate them to a source face, generating a first-stage prediction image; in the second stage, a detail generation network FineGAN is used to supplement and enrich the details of the eye and mouth regions, which contribute most to expression change, in the first-stage prediction image, generating a fine-grained second-stage prediction image that is synthesized into a facial video animation; both the expression migration network FaceGAN and the detail generation network FineGAN are implemented as generative adversarial networks.
2. The generation method of claim 1, wherein the expression migration network FaceGAN comprises a generator G_1 and a discriminator D_1, the generator G_1 consisting of three sub-networks: an identity encoder Enc_id, an expression encoder Enc_exp and a decoder Dec_1;
and the detail generation network FineGAN comprises a generator G_2 and a discriminator D_2, where D_2 consists of a global discriminator D_global, an eye local discriminator D_eye and a mouth local discriminator D_mouth.
3. The generation method of claim 1, characterized in that the method comprises the following specific steps:
firstly, acquiring a facial expression profile of each frame of image in a data set:
collecting a facial expression video sequence data set, extracting the face in each frame image of a video sequence with the Dlib machine learning library, and at the same time obtaining a number of feature points for each face; the feature points are then connected in sequence with line segments to obtain the expression profile of each frame of the video sequence, recorded as e = (e_1, e_2, ..., e_i, ..., e_n), where e represents the set of all expression profiles in a video sequence, i.e., the expression profile sequence, n represents the number of video frames, and e_i represents the expression profile of the i-th frame of the video sequence;
the first stage is to build an expression migration network faceGAN, and the method comprises the following steps:
secondly, extracting the identity characteristics of the source face and the expression characteristics of the target expression profile graph, and preliminarily generating a first-stage prediction graph:
The expression migration network FaceGAN comprises a generator G_1 and a discriminator D_1, where the generator G_1 consists of three sub-networks: two encoders, Enc_id and Enc_exp, and a decoder Dec_1.
First, a neutral, expressionless image I_N of the source face and a target expression profile sequence e are input. The identity encoder Enc_id extracts the identity feature vector f_id of the neutral expressionless source face image I_N, while the expression encoder Enc_exp extracts the set of expression feature vectors f_exp of the target expression profile sequence e, where f_exp = (f_exp_1, f_exp_2, ..., f_exp_i, ..., f_exp_n). In formulas:
f_id = Enc_id(I_N) (1),
f_exp_i = Enc_exp(e_i) (2),
The identity feature vector f_id and the expression feature vector f_exp_i of the i-th frame are concatenated to obtain a feature vector f, f = f_id + f_exp_i. The feature vector f is fed to the decoder Dec_1 and decoded to generate the first-stage prediction image I_pre-target, I_pre-target = Dec_1(f). Finally, I_pre-target is input to the discriminator D_1, which judges whether the image is real or fake;
and thirdly, taking the prediction image of the first stage as input, and adopting the concept of cycleGAN to reconstruct a neutral image of the source face:
The first-stage prediction image I_pre-target and the expression profile e_N corresponding to the neutral expressionless image I_N from the second step are fed into FaceGAN again as input: the identity encoder Enc_id extracts the identity feature vector of the image I_pre-target, the expression encoder Enc_exp extracts the expression feature vector of the expression profile e_N, the processing of the second step is repeated, and the decoder produces the reconstructed image I_recon of I_N. The generation of the reconstructed image I_recon is expressed as:
I_recon = Dec_1(Enc_id(I_pre-target) + Enc_exp(e_N)) (3);
fourthly, calculating a loss function in the faceGAN of the first-stage expression migration network:
The loss function of the generator G_1 in the first-stage expression migration network FaceGAN is given by:
[Equations (4)-(9) appear only as images in the original publication: equation (4) is the total generator loss L_G1, the weighted sum of the terms below; (5) the generator adversarial loss; (6) the SSIM similarity loss; (7) the pixel loss; (8) the perceptual loss; (9) the reconstruction loss.]
where I_real is the target ground-truth image. Equation (5) is the adversarial loss of the generator, and D_1(·) denotes the probability, output by the discriminator D_1, that its input is real. The SSIM (structural similarity) function in equation (6) measures the similarity between two images. Equation (7) is the pixel loss, where the MAE (mean absolute error) function measures the difference between the ground-truth and predicted values. Equation (8) is the perceptual loss: the perceptual features of an image are extracted with VGG-19, the output of the last convolutional layer of the VGG-19 network is taken as the perceptual feature of the image, and the perceptual loss is computed between the generated image and the real image. Equation (9) is the reconstruction loss, which computes the distance between the neutral expressionless source face image I_N and its reconstructed image I_recon;
The loss function of the discriminator D_1 in the first-stage expression migration network FaceGAN is given by:
[Equations (10)-(12) appear only as images in the original publication: equation (10) is the total discriminator loss L_D1; (11) the adversarial loss; (12) the adversarial loss of the reconstructed image.]
Equation (11) is the adversarial loss and equation (12) is the adversarial loss of the reconstructed image, where λ_1 and λ_2 are the weight parameters of the similarity loss L_SSIM and the perceptual loss L_perceptual in the FaceGAN generator G_1, and λ_3 is the weight parameter of the adversarial loss of the reconstructed image in the FaceGAN discriminator loss;
and building a detail generation network FineGAN of the second stage, wherein the steps from the fifth step to the seventh step are as follows:
fifthly, local mask vectors adaptive to individuals are generated:
The plurality of feature points obtained for each face in the first step are used to extract the eye region I_eye and the mouth region I_mouth, and the eye mask vector M_eye and the mouth mask vector M_mouth are set up respectively. Taking the eyes as an example, the eye mask vector M_eye is constructed by setting the pixel values of the eye region in the image to 1 and the pixel values of all other regions to 0; the mouth mask vector M_mouth is constructed in the same way as the eye mask vector M_eye;
and sixthly, inputting the prediction graph of the first stage into a network of a second stage to carry out detail optimization:
The detail generation network FineGAN comprises a generator G_2 and a discriminator D_2, where D_2 consists of a global discriminator D_global and two local discriminators D_eye and D_mouth.
The first-stage prediction image I_pre-target and the neutral expressionless image I_N from the second step are input to the generator G_2, which generates a second-stage prediction image I_target containing more facial detail. The second-stage prediction image I_target is then fed to the three discriminators simultaneously: the global discriminator D_global performs a global judgment on I_target so that it is as close as possible to the target real image I_real, while the eye local discriminator D_eye and the mouth local discriminator D_mouth further emphasize the optimization of the eye and mouth regions of I_target, making the second-stage prediction image I_target more lifelike. The second-stage prediction image I_target is expressed as:
I_target = G_2(I_pre-target, I_N) (13);
and seventhly, calculating the loss functions in the second-stage FineGAN:
The loss function of the generator G_2 is given by:
[Equations (14)-(19) appear only as images in the original publication: equation (14) is the total generator loss L_G2; (15) the adversarial loss; (16) the pixel loss; (17) and (18) the local pixel losses of the eye and mouth regions; (19) the local perceptual loss.]
Equation (15) is the adversarial loss, including a global adversarial loss and local adversarial losses, and the operator ⊙ is the Hadamard product. Equation (16) is the pixel loss. Equations (17) and (18) are the local pixel losses, computing the L1 norm of the pixel difference between a local region of the generated image and the corresponding local region of the real image. Equation (19) is the local perceptual loss. The total loss function of the generator G_2 is the weighted sum of these loss functions;
The loss function of the discriminator D_2 is given by:
[Equations (20)-(23) appear only as images in the original publication: equation (20) is the total discriminator loss L_D2; (21) the adversarial loss of the global discriminator; (22) and (23) the adversarial losses of the two local discriminators.]
Equation (21) is the adversarial loss of the global discriminator, and equations (22) and (23) are the adversarial losses of the local discriminators, where λ_4 and λ_5 are the weight parameters of the local adversarial losses in the FineGAN generator G_2, λ_6 and λ_7 are the weight parameters of the eye pixel loss L_eye and the mouth pixel loss L_mouth in the FineGAN generator G_2, λ_8 is the weight parameter of the local perceptual loss in the FineGAN generator G_2, and λ_9 is the weight parameter of the global adversarial loss in the FineGAN discriminator D_2;
and eighth step, synthesizing a video:
Each frame is generated independently; after the n frame images (I_target_1, I_target_2, ..., I_target_i, ..., I_target_n) have all been generated, the sequence of video frames is synthesized into the final facial animation;
This completes the two-stage expression animation generation based on dual generative adversarial networks: the expression in the face image is converted and the image details are optimized.
4. The generation method according to claim 2 or 3, wherein the identity encoder Enc_id comprises 4 convolutional blocks, with a CBAM attention module added to the first 3 blocks; the expression encoder Enc_exp comprises 3 convolutional blocks, with a CBAM attention module added to the last block; the decoder Dec_1 comprises 4 deconvolutional blocks, with a CBAM attention module added to the first 3 blocks; and skip connections combine the high-level and low-level layers of the network, specifically, the output of layer 1 of the identity encoder Enc_id is connected to the input of the last layer of the decoder Dec_1, the output of layer 2 of Enc_id is connected to the input of the second-to-last layer of Dec_1, and the output of layer 3 of Enc_id is connected to the input of the third-to-last layer of Dec_1.
5. A method according to claim 3, characterized in that the weight parameter for each loss is set as:
[The table of weight parameter values appears only as an image in the original publication and is not recoverable from the text.]
6. The generation method according to claim 2, wherein the number of feature points obtained for each face in the first step is 68, and the 68 feature points constitute the face contour and the eye, mouth and nose contours.
CN202010621885.2A 2020-07-01 2020-07-01 Two-stage expression animation generation method based on dual generative adversarial networks Active CN111783658B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010621885.2A CN111783658B (en) 2020-07-01 2020-07-01 Two-stage expression animation generation method based on dual generative adversarial networks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010621885.2A CN111783658B (en) 2020-07-01 2020-07-01 Two-stage expression animation generation method based on dual generative adversarial networks

Publications (2)

Publication Number Publication Date
CN111783658A true CN111783658A (en) 2020-10-16
CN111783658B (en) 2023-08-25

Family

ID=72761358

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010621885.2A Active CN111783658B (en) 2020-07-01 2020-07-01 Two-stage expression animation generation method based on dual generative adversarial networks

Country Status (1)

Country Link
CN (1) CN111783658B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033288A (en) * 2021-01-29 2021-06-25 浙江大学 Method for generating front face picture based on side face picture for generating confrontation network
CN113326934A (en) * 2021-05-31 2021-08-31 上海哔哩哔哩科技有限公司 Neural network training method, and method and device for generating images and videos
CN113343761A (en) * 2021-05-06 2021-09-03 武汉理工大学 Real-time facial expression migration method based on generation confrontation
CN115100329A (en) * 2022-06-27 2022-09-23 太原理工大学 Multi-mode driving-based emotion controllable facial animation generation method
CN115311261A (en) * 2022-10-08 2022-11-08 石家庄铁道大学 Method and system for detecting abnormality of cotter pin of suspension device of high-speed railway contact network
US20230154088A1 (en) * 2021-11-17 2023-05-18 Adobe Inc. Disentangling latent representations for image reenactment
US11875601B2 (en) 2020-12-24 2024-01-16 Beijing Baidu Netcom Science and Technology Co., Ltd Meme generation method, electronic device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002304638A (en) * 2001-04-03 2002-10-18 Atr Ningen Joho Tsushin Kenkyusho:Kk Device and method for generating expression animation
WO2019228317A1 (en) * 2018-05-28 2019-12-05 华为技术有限公司 Face recognition method and device, and computer readable medium
CN110689480A (en) * 2019-09-27 2020-01-14 腾讯科技(深圳)有限公司 Image transformation method and device
CN111243066A (en) * 2020-01-09 2020-06-05 浙江大学 Facial expression migration method based on self-supervision learning and confrontation generation mechanism

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002304638A (en) * 2001-04-03 2002-10-18 Atr Ningen Joho Tsushin Kenkyusho:Kk Device and method for generating expression animation
WO2019228317A1 (en) * 2018-05-28 2019-12-05 华为技术有限公司 Face recognition method and device, and computer readable medium
CN110689480A (en) * 2019-09-27 2020-01-14 腾讯科技(深圳)有限公司 Image transformation method and device
CN111243066A (en) * 2020-01-09 2020-06-05 浙江大学 Facial expression migration method based on self-supervision learning and confrontation generation mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈军波; 刘蓉; 刘明; 冯杨: "Facial expression transfer model based on conditional generative adversarial network" (基于条件生成式对抗网络的面部表情迁移模型), Computer Engineering (计算机工程), no. 04 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11875601B2 (en) 2020-12-24 2024-01-16 Beijing Baidu Netcom Science and Technology Co., Ltd Meme generation method, electronic device and storage medium
CN113033288A (en) * 2021-01-29 2021-06-25 浙江大学 Method for generating front face picture based on side face picture for generating confrontation network
CN113033288B (en) * 2021-01-29 2022-06-24 浙江大学 Method for generating front face picture based on side face picture for generating confrontation network
CN113343761A (en) * 2021-05-06 2021-09-03 武汉理工大学 Real-time facial expression migration method based on generation confrontation
CN113326934A (en) * 2021-05-31 2021-08-31 上海哔哩哔哩科技有限公司 Neural network training method, and method and device for generating images and videos
CN113326934B (en) * 2021-05-31 2024-03-29 上海哔哩哔哩科技有限公司 Training method of neural network, method and device for generating images and videos
US20230154088A1 (en) * 2021-11-17 2023-05-18 Adobe Inc. Disentangling latent representations for image reenactment
US11900519B2 (en) * 2021-11-17 2024-02-13 Adobe Inc. Disentangling latent representations for image reenactment
CN115100329A (en) * 2022-06-27 2022-09-23 太原理工大学 Multi-mode driving-based emotion controllable facial animation generation method
CN115311261A (en) * 2022-10-08 2022-11-08 石家庄铁道大学 Method and system for detecting abnormality of cotter pin of suspension device of high-speed railway contact network

Also Published As

Publication number Publication date
CN111783658B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
CN111783658A (en) Two-stage expression animation generation method based on dual generative adversarial networks
US11367239B2 (en) Textured neural avatars
CN112149459B (en) Video saliency object detection model and system based on cross attention mechanism
CN111275518A (en) Video virtual fitting method and device based on mixed optical flow
CN111798369A (en) Face aging image synthesis method for generating confrontation network based on circulation condition
CN113807265B (en) Diversified human face image synthesis method and system
JP2022513858A (en) Data processing methods, data processing equipment, computer programs, and computer equipment for facial image generation
CN111612687B (en) Automatic makeup method for face image
CN114581992A (en) Human face expression synthesis method and system based on pre-training StyleGAN
Zhou et al. Generative adversarial network for text-to-face synthesis and manipulation with pretrained bert model
CN116071494A (en) High-fidelity three-dimensional face reconstruction and generation method based on implicit nerve function
CA3180427A1 (en) Synthesizing sequences of 3d geometries for movement-based performance
CN114549387A (en) Face image highlight removal method based on pseudo label
Sun et al. A unified framework for biphasic facial age translation with noisy-semantic guided generative adversarial networks
CN111767842B (en) Micro-expression type discrimination method based on transfer learning and self-encoder data enhancement
Hou et al. Lifelong age transformation with a deep generative prior
Wang et al. Fine-grained image style transfer with visual transformers
CN114565624A (en) Image processing method for liver focus segmentation based on multi-phase stereo primitive generator
CN114627293A (en) Image matting method based on multi-task learning
CN113343761A (en) Real-time facial expression migration method based on generation confrontation
Quan et al. Facial Animation Using CycleGAN
CN116542292B (en) Training method, device, equipment and storage medium of image generation model
CN117036893B (en) Image fusion method based on local cross-stage and rapid downsampling
Sreekala et al. Human Imitation in Images and Videos using GANs.
Bo et al. Style Transfer Analysis Based on Generative Adversarial Networks

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant