CN113870399B - Expression driving method and device, electronic equipment and storage medium - Google Patents
- Publication number: CN113870399B (application CN202111117185.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- facial
- expression
- sample
- dimensional
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
Abstract
The disclosure provides an expression driving method and apparatus, an electronic device, and a storage medium, relates to the field of artificial intelligence, in particular to computer vision and deep learning, and can be applied to scenes such as face image processing and face recognition. The specific implementation scheme is as follows: a source image with an expression and a target image without an expression are respectively input into a three-dimensional expression model to obtain a plurality of first facial attributes and a plurality of second facial attributes; at least part of the first facial attributes are used to replace the corresponding attributes among the second facial attributes; three-dimensional face reconstruction and rendering are performed according to the replaced second facial attributes; and the rendered three-dimensional face image is expression-driven through an expression driving model. In this way, the facial expression and the facial pose in the source image and the target image are decoupled, so that the facial expression and facial pose of the target image can be controlled independently, better supporting more diverse expression driving.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to computer vision and deep learning, can be applied to scenes such as face image processing and face recognition, and specifically concerns an expression driving method and apparatus, an electronic device, and a storage medium.
Background
Facial expression driving is one of the important computer vision technologies. Its task is to use a picture of a facial expression to drive the facial expression of a target picture, so that the expressions of the two pictures match as closely as possible. Facial expression driving is widely used in entertainment applications.
Disclosure of Invention
The disclosure provides a method and a device for expression driving, an electronic device and a storage medium.
According to an aspect of the present disclosure, there is provided an expression driving method including: acquiring a source image with an expression and a target image without an expression; inputting the source image and the target image into a three-dimensional expression model respectively to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image; replacing the corresponding face attributes in the second face attributes with at least part of the face attributes in the first face attributes to obtain a plurality of second face attributes after replacement processing; according to the plurality of second face attributes after replacement processing, performing three-dimensional face reconstruction and rendering on the face in the target image to obtain a rendered three-dimensional face image; and inputting the rendered three-dimensional face image into an expression driving model so as to perform expression driving on the face in the target image.
According to another aspect of the present disclosure, there is provided an expression driving apparatus including: a first acquisition module, configured to acquire a source image with an expression and a target image without an expression; a second acquisition module, configured to input the source image and the target image into a three-dimensional expression model respectively, so as to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image; a replacing module, configured to replace the corresponding face attributes in the plurality of second face attributes with at least part of the plurality of first face attributes, so as to obtain a plurality of second face attributes after replacement processing; a processing module, configured to perform three-dimensional face reconstruction and rendering on the face in the target image according to the plurality of replaced second face attributes to obtain a rendered three-dimensional face image; and a driving module, configured to input the rendered three-dimensional face image into an expression driving model so as to perform expression driving on the face in the target image.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of the first aspect of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of an embodiment of the first aspect of the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 4 is a schematic diagram according to a fourth embodiment of the present disclosure;
FIG. 5 is a flowchart of an expression driving method according to an embodiment of the disclosure;
FIG. 6 is a schematic diagram according to a fifth embodiment of the present disclosure;
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Facial expression driving is one of the important computer vision technologies. Its task is to use a picture of a facial expression to drive the facial expression of a target picture, so that the expressions of the two pictures match as closely as possible. Facial expression driving is widely used in entertainment applications.
In the related art, 2D facial key points of a driving image are detected, and the expression encoded by those 2D key points is transferred to generate a face picture driven by the corresponding expression.
However, expression driving based on 2D facial key points cannot decouple the expression from the facial pose. When the pose of the driving picture differs greatly from that of the target image, the pose of the generated picture follows the driving image, the original pose of the target image cannot be maintained, and more diverse expression driving cannot be satisfied.
In order to solve the above problems, the present disclosure provides an expression driving method, an expression driving apparatus, an electronic device, and a storage medium.
Fig. 1 is a schematic diagram according to a first embodiment of the present disclosure. It should be noted that the expression driving method of the embodiment of the present disclosure may be applied to the expression driving apparatus of the embodiment of the present disclosure, and the apparatus may be configured in an electronic device. The electronic device may be a mobile terminal, for example, a mobile phone, a tablet computer, a personal digital assistant, and other hardware devices with various operating systems.
As shown in fig. 1, the expression driving method may include the steps of:
In the embodiment of the disclosure, an object can be photographed with an image acquisition device to obtain a source image with an expression and a target image without an expression, or the source image and the target image can be downloaded from a network. The expression in the source image may include a happy, angry, or excited facial expression.
In order to decouple the individual facial attributes, the source image and the target image may be respectively input into a three-dimensional expression model, which outputs a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image. It should be noted that the first facial attributes and the second facial attributes each include at least one of a facial expression, a facial pose, facial illumination, and a facial shape, and the first facial attributes may differ from the second facial attributes.
In addition, it should be noted that the three-dimensional expression model may include an encoding layer and a decoding layer. The encoding layer maps the source image and the target image to the plurality of first facial attributes and the plurality of second facial attributes respectively, thereby decoupling the individual facial attributes; the decoding layer performs three-dimensional face reconstruction on the face in the target image according to the plurality of replaced second facial attributes to obtain a reconstructed three-dimensional face image.
As an application scenario, in face image processing and face recognition, the three-dimensional expression model may be a 3D morphable face model (3DMM). In order to decouple the individual facial attributes, the source image and the target image may be respectively input into the encoding layer of the 3DMM to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image.
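The encoder output can be thought of as a flat coefficient vector that is split into named attribute groups (shape, expression, pose, illumination). The sketch below illustrates this with a hypothetical layout; the segment sizes are illustrative assumptions only, not the dimensions of any particular 3DMM:

```python
import numpy as np

# Hypothetical coefficient layout; real 3DMM encoders use different dimensions.
SLICES = {
    "shape": slice(0, 80),     # identity / facial-shape coefficients
    "exp":   slice(80, 144),   # facial-expression coefficients
    "pose":  slice(144, 150),  # rotation + translation
    "light": slice(150, 177),  # spherical-harmonics illumination
}

def split_coefficients(coeffs: np.ndarray) -> dict:
    """Split a flat 3DMM coefficient vector into named facial attributes."""
    return {name: coeffs[s] for name, s in SLICES.items()}

# A dummy 177-dim vector standing in for the encoder's output.
attrs = split_coefficients(np.arange(177, dtype=np.float32))
```

Once the attributes are separated like this, each one can be manipulated independently, which is exactly the decoupling the encoding layer is meant to provide.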
Step 103, replacing the corresponding face attributes in the second face attributes with at least part of the first face attributes to obtain a plurality of second face attributes after replacement processing.
In order to keep the original facial pose of the target image and drive only its expression, in the embodiment of the present disclosure at least part of the plurality of first facial attributes may be used to replace the corresponding facial attributes in the plurality of second facial attributes, so as to obtain a plurality of second facial attributes after replacement processing. For example, the facial expression in the second facial attributes may be replaced with the facial expression in the first facial attributes, and the second facial attributes with the replaced facial expression may be used as the plurality of second facial attributes after replacement processing.
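The replacement step itself amounts to a keyed swap. A minimal sketch, assuming (hypothetically) that the encoder outputs are held in plain dictionaries keyed by attribute name:

```python
def drive_expression(source_attrs: dict, target_attrs: dict, keys=("exp",)) -> dict:
    """Return a copy of the target's attributes in which the chosen keys
    (the facial expression by default) are taken from the source instead,
    leaving the target's pose, shape and lighting untouched."""
    driven = dict(target_attrs)
    for k in keys:
        driven[k] = source_attrs[k]
    return driven

# Hypothetical attribute dictionaries standing in for encoder outputs.
source = {"shape": "S_src", "pose": "P_src", "light": "L_src", "exp": "E_src"}
target = {"shape": "S_tgt", "pose": "P_tgt", "light": "L_tgt", "exp": "E_tgt"}
driven = drive_expression(source, target)
```

Because only the listed keys are swapped, the same helper could also transfer pose or lighting if a different kind of driving were wanted.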
Step 104, performing three-dimensional face reconstruction and rendering on the face in the target image according to the plurality of replaced second face attributes to obtain a rendered three-dimensional face image.
In order to present the replaced plurality of second facial attributes, the replaced plurality of second facial attributes may be input into a decoding layer of the three-dimensional expression model to obtain a reconstructed three-dimensional face image. Further, a rendered three-dimensional face image is obtained by a 3D rendering technique.
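Decoding in a linear 3DMM amounts to adding shape and expression offsets to a mean face mesh. The toy sketch below uses random bases and deliberately small dimensions purely for illustration; real models use learned bases and tens of thousands of vertices:

```python
import numpy as np

rng = np.random.default_rng(0)
N_VERTS, N_SHAPE, N_EXP = 100, 80, 64  # toy sizes; real 3DMMs are far larger

# Stand-ins for a learned mean face and shape/expression bases.
mean_shape  = rng.normal(size=(N_VERTS * 3,))
shape_basis = rng.normal(size=(N_VERTS * 3, N_SHAPE))
exp_basis   = rng.normal(size=(N_VERTS * 3, N_EXP))

def decode_mesh(shape_coeffs: np.ndarray, exp_coeffs: np.ndarray) -> np.ndarray:
    """Linear 3DMM decoding: mean face plus shape and expression offsets,
    returned as an (N_VERTS, 3) vertex array ready for rendering."""
    verts = mean_shape + shape_basis @ shape_coeffs + exp_basis @ exp_coeffs
    return verts.reshape(N_VERTS, 3)
```

With all coefficients at zero the decoder reproduces the mean face, which is a convenient sanity check on any implementation of this step.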
Step 105, inputting the rendered three-dimensional face image into an expression driving model so as to perform expression driving on the face in the target image.
It can be understood that the rendered three-dimensional face image by itself looks unrealistic. Therefore, to make the expression-driven target image more realistic, in the embodiment of the present disclosure the rendered three-dimensional face image may be input into the expression driving model to perform expression driving on the face in the target image.
In summary: a source image with an expression and a target image without an expression are acquired; the source image and the target image are respectively input into a three-dimensional expression model to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image; the corresponding face attributes in the second face attributes are replaced with at least part of the first face attributes to obtain the replaced second face attributes; three-dimensional face reconstruction and rendering are performed on the face in the target image according to the replaced second facial attributes to obtain a rendered three-dimensional face image; and the rendered three-dimensional face image is input into an expression driving model to perform expression driving on the face in the target image. In this way, the facial expression and the facial pose in the source image and the target image are decoupled, the facial expression and facial pose of the target image can be controlled independently, and more diverse expression driving can be better satisfied.
In order to keep the original facial pose of the target image and drive only its expression, refer to fig. 2, a schematic diagram according to a second embodiment of the present disclosure. In the embodiment of the present disclosure, the facial expression in the second facial attributes may be replaced with the facial expression among the plurality of first facial attributes to obtain the second facial attributes after replacement processing. The embodiment shown in fig. 2 may include the following steps:
Step 203, replacing the facial expression in the second facial attributes according to the facial expression among the plurality of first facial attributes.
In the disclosed embodiment, each of the first and second face attributes may include: facial shape, facial pose, facial expression, and facial illumination, and the facial expression in the second facial attribute may be replaced with the facial expression in the first facial attribute.
That is, after the facial expression in the second facial attributes is replaced with the facial expression from the first facial attributes, the replaced facial expression together with the facial pose, facial shape, and facial illumination originally retained in the second facial attributes may be used as the plurality of second facial attributes after replacement processing.
Step 205, performing three-dimensional face reconstruction and rendering on the face in the target image according to the plurality of replaced second face attributes to obtain a rendered three-dimensional face image.
Step 206, inputting the rendered three-dimensional face image into an expression driving model so as to perform expression driving on the face in the target image.
It should be noted that the execution processes of steps 201 to 202 and steps 205 to 206 may be implemented by any one of the embodiments of the present disclosure, and the embodiments of the present disclosure do not limit this and are not described again.
In summary, the facial expression in the second facial attributes is replaced according to the facial expression among the plurality of first facial attributes, and the replaced facial expression together with the facial pose, facial shape, and facial illumination retained in the second facial attributes is used as the plurality of second facial attributes after replacement processing. In this way, the target image keeps its original facial pose, and only its expression is driven.
In order to perform face reconstruction on the replaced plurality of second facial attributes, as shown in fig. 3, fig. 3 is a schematic diagram according to a third embodiment of the present disclosure, in which a three-dimensional face reconstruction and rendering may be performed on a face in a target image according to the replaced plurality of second facial attributes to obtain a reconstructed three-dimensional face image. The embodiment shown in fig. 3 may include the following steps:
Step 303, replacing the corresponding face attributes in the second face attributes with at least part of the first face attributes to obtain a plurality of second face attributes after replacement processing.
Step 304, performing three-dimensional face reconstruction on the face in the target image according to the plurality of replaced second face attribute coefficients to obtain a reconstructed three-dimensional face image.
In the embodiment of the disclosure, the plurality of second facial attribute coefficients after the replacement processing may be input into a decoding layer of the three-dimensional expression model, and the three-dimensional expression model may output a reconstructed three-dimensional facial image.
Step 305, performing three-dimensional face rendering on the reconstructed three-dimensional face image to obtain a rendered three-dimensional face image.
In order to make the acquired three-dimensional face image more accurate and real, a 3D rendering technology can be adopted to perform three-dimensional face rendering on the reconstructed three-dimensional face image so as to obtain a rendered three-dimensional face image.
It should be noted that the execution processes of steps 301 to 303 and step 306 may be implemented by any one of the embodiments of the present disclosure, and the embodiments of the present disclosure do not limit this, and are not described again.
In summary, three-dimensional face reconstruction is performed on the face in the target image according to the plurality of replaced second face attribute coefficients to obtain a reconstructed three-dimensional face image, and three-dimensional face rendering is then performed on the reconstructed image to obtain a rendered three-dimensional face image, so that face reconstruction can be carried out on the plurality of replaced second facial attributes.
In order for the expression driving model to drive the rendered three-dimensional facial image and obtain a more realistic face driving image, refer to fig. 4, a schematic diagram according to a fourth embodiment of the present disclosure. In the embodiment of the present disclosure, before the rendered three-dimensional facial image is input into the expression driving model, the expression driving model may be trained so that it outputs a more realistic face driving image. The embodiment shown in fig. 4 may include the following steps:
Step 403, replacing the corresponding face attributes in the second face attributes with at least part of the first face attributes to obtain a plurality of second face attributes after replacement processing.
Step 404, performing three-dimensional face reconstruction and rendering on the face in the target image according to the plurality of replaced second face attributes to obtain a rendered three-dimensional face image.
In the embodiment of the present disclosure, a plurality of frames of sample images with expressions may be obtained by downloading through an image acquisition device or a network, where it should be noted that the plurality of frames of sample images with expressions may be sample images with different expressions of the same object, or the plurality of frames of sample images with expressions may be sample images with different expressions of different objects.
Further, each frame of sample image of a plurality of frames of sample images with expressions may be respectively input into an encoding layer of a three-dimensional expression model, and the three-dimensional expression model may output a sample facial attribute corresponding to each frame of sample image, where it should be noted that the sample facial attribute may include: at least one of a sample facial expression, a sample facial shape, a sample facial pose, and a sample facial illumination.
Furthermore, the sample facial expression, the sample facial shape, the sample facial posture and the sample facial illumination can be input into a decoding layer of the three-dimensional expression model, and the three-dimensional expression model can perform three-dimensional facial reconstruction on the sample facial attribute to obtain a reconstructed three-dimensional sample facial image.
Step 408, performing three-dimensional face rendering on the reconstructed three-dimensional sample face image to obtain a rendered three-dimensional sample face image.
In the embodiment of the disclosure, the three-dimensional face rendering can be performed on the reconstructed three-dimensional sample face image by using a three-dimensional rendering technology, so as to obtain a rendered three-dimensional sample face image.
Step 409, training the initial expression driving model according to the rendered three-dimensional sample facial image and the sample image to generate the expression driving model.
As an example: the rendered three-dimensional sample facial image is input into an initial expression driving model to obtain an expression prediction image; a loss function value is determined according to the difference between the sample image and the expression prediction image; and the initial expression driving model is trained based on the loss function value so as to minimize it.
That is, to improve the accuracy of the expression driving model, the rendered three-dimensional sample facial image may be input into an initial expression driving model, which outputs an expression prediction image. The sample image is then compared with the expression prediction image to determine their difference, and a loss function value is determined from that difference. For example, the loss function value may include a first sub-loss and a second sub-loss. The first sub-loss may be determined from the absolute value of the pixel-wise difference between the sample image and the expression prediction image. Meanwhile, the sample image and the expression prediction image may each be input into a trained VGG network (Visual Geometry Group network) to generate a semantic vector for each image, and the second sub-loss may be determined from the absolute value of the difference between the two semantic vectors. Then, according to the loss function value, the initial expression driving model can be trained by gradient back-propagation so as to minimize the loss function value.
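The two sub-losses described above can be sketched as a pixel-wise L1 term plus a perceptual term computed on feature embeddings. In this sketch `feat` is a stand-in for the pretrained VGG feature extractor, and the weights `w_pix`/`w_feat` are illustrative assumptions not specified by the source:

```python
import numpy as np

def l1_loss(a: np.ndarray, b: np.ndarray) -> float:
    """First sub-loss: mean absolute pixel difference."""
    return float(np.abs(a - b).mean())

def perceptual_loss(a: np.ndarray, b: np.ndarray, feat) -> float:
    """Second sub-loss: mean absolute difference between feature embeddings.
    `feat` stands in for a pretrained VGG feature extractor."""
    return float(np.abs(feat(a) - feat(b)).mean())

def total_loss(pred, target, feat, w_pix=1.0, w_feat=1.0) -> float:
    """Weighted sum of the pixel and perceptual terms."""
    return w_pix * l1_loss(pred, target) + w_feat * perceptual_loss(pred, target, feat)
```

In training, `total_loss` would be computed per batch and minimized by back-propagation through the expression driving model.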
As another example: image normalization is performed on the rendered three-dimensional sample face image and the sample image to obtain a target three-dimensional sample face image; the target three-dimensional sample face image is input into the initial expression driving model to obtain an expression prediction image; a loss function value is determined according to the difference between the sample image and the expression prediction image; and the initial expression driving model is trained based on the loss function value so as to minimize it.
In order to distribute the data of the rendered three-dimensional sample face image and the sample image over the same range, reduce the difference between them, and facilitate training of the initial expression driving model, image normalization may be performed on the rendered three-dimensional sample face image and the sample image to obtain a target three-dimensional sample face image. For example, the value of each pixel in the rendered three-dimensional sample face image and the sample image may be divided by 255 and then reduced by 0.5, so that each pixel value lies in [-0.5, 0.5]. Then, the target three-dimensional sample face image can be input into the initial expression driving model, which outputs an expression prediction image; the sample image and the expression prediction image can then be compared to determine their difference, and a loss function value is determined from that difference. According to the loss function value, the initial expression driving model can be trained by gradient back-propagation so as to minimize the loss function value.
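The normalization step above can be sketched in a few lines, mapping uint8 pixels from [0, 255] into [-0.5, 0.5]:

```python
import numpy as np

def normalize(img_u8: np.ndarray) -> np.ndarray:
    """Map uint8 pixel values [0, 255] to float32 values in [-0.5, 0.5],
    so both the rendered image and the sample image share the same range."""
    return img_u8.astype(np.float32) / 255.0 - 0.5
```

Applying the same mapping to both images keeps their distributions aligned, which is the stated purpose of this preprocessing step.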
It should be noted that the execution processes of steps 401 to 404 and step 410 may be implemented by any one of the embodiments of the present disclosure, and the embodiments of the present disclosure do not limit this, and are not described again.
In summary: a plurality of frames of sample images with expressions are acquired; for each frame of sample image, the sample image is input into the encoding layer of the three-dimensional expression model to obtain the sample facial attributes corresponding to the sample image, where the sample facial attributes include at least one of a sample facial expression, a sample facial shape, a sample facial pose, and sample facial illumination; the sample facial expression, sample facial shape, sample facial pose, and sample facial illumination are input into the decoding layer of the three-dimensional expression model to perform three-dimensional face reconstruction on the face in the sample image and obtain a reconstructed three-dimensional sample face image; three-dimensional face rendering is performed on the reconstructed three-dimensional sample face image to obtain a rendered three-dimensional sample face image; and the initial expression driving model is trained according to the rendered three-dimensional sample face image and the sample image to generate the expression driving model. In this way, the expression driving model can drive the rendered three-dimensional face image to obtain a more realistic face driving image.
In order to more clearly illustrate the above embodiments, the description will now be made by way of example.
For example, as shown in fig. 5, the source image has an expression, the target image has none, and 3DMM denotes the three-dimensional expression model. The source image and the target image may be respectively input into the encoding layer of the 3DMM to obtain the shape (facial shape), pose (facial pose), light (facial illumination), and exp (facial expression) corresponding to each image. Then, the exp of the target image is replaced by the exp of the source image, so that the replaced facial attributes corresponding to the target image include the replaced exp together with the original shape, pose, and light retained from the target image. Furthermore, the replaced facial attributes corresponding to the target image may be input into the decoding layer of the 3DMM to perform three-dimensional face reconstruction and rendering, obtaining a rendered three-dimensional face image. Finally, the rendered three-dimensional face image may be input into a translator model (the expression driving model), which outputs the expression-driven image corresponding to the target image.
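The whole Fig. 5 pipeline can be sketched as a composition of interchangeable components. All names here are hypothetical stand-ins for the 3DMM encoder and decoder, the renderer, and the translator network; only the data flow mirrors the figure:

```python
def drive(source_img, target_img, encoder, decoder, renderer, translator):
    """End-to-end expression-driving sketch following Fig. 5:
    encode both images, swap only the expression coefficients,
    reconstruct and render the target face, then refine it."""
    src = encoder(source_img)           # shape, pose, light, exp of the source
    tgt = encoder(target_img)           # shape, pose, light, exp of the target
    tgt = {**tgt, "exp": src["exp"]}    # replace only the expression (exp)
    mesh = decoder(tgt)                 # three-dimensional face reconstruction
    rendered = renderer(mesh)           # rendered three-dimensional face image
    return translator(rendered)         # realistic expression-driven output
```

Because the components are passed in as callables, each stage (encoder, decoder, renderer, translator) can be swapped or unit-tested independently.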
According to the expression driving method, a source image with an expression and a target image without an expression are acquired; the source image and the target image are respectively input into a three-dimensional expression model to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image; corresponding facial attributes among the plurality of second facial attributes are replaced with at least part of the plurality of first facial attributes to obtain a plurality of replaced second facial attributes; three-dimensional facial reconstruction and rendering are performed on the face in the target image according to the plurality of replaced second facial attributes to obtain a rendered three-dimensional facial image; and the rendered three-dimensional facial image is input into an expression driving model to perform expression driving on the face in the target image. In this method, the source image and the target image are respectively input into the three-dimensional expression model to obtain the first facial attributes corresponding to the source image and the second facial attributes corresponding to the target image; at least part of the first facial attributes then replace the corresponding second facial attributes, three-dimensional facial reconstruction and rendering are performed according to the replaced second facial attributes, and finally the expression driving model performs expression driving on the rendered three-dimensional facial image. In this way, the facial expression and facial pose in the source image and the target image can be decoupled, so that the facial expression and facial pose of the target image can be controlled independently, better satisfying more diversified expression driving requirements.
To implement the above embodiments, the present disclosure further provides an expression driving apparatus.
Fig. 6 is a schematic diagram according to a fifth embodiment of the present disclosure, and as shown in fig. 6, an expression driving apparatus 600 includes: a first acquisition module 610, a second acquisition module 620, a replacement module 630, a processing module 640, and a driving module 650.
The first obtaining module 610 is configured to obtain a source image with an expression and a target image without the expression; a second obtaining module 620, configured to input the source image and the target image into the three-dimensional expression model respectively, so as to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image; a replacing module 630, configured to replace, by at least part of the plurality of first facial attributes, corresponding facial attributes in the plurality of second facial attributes to obtain a plurality of second facial attributes after replacement processing; the processing module 640 is configured to perform three-dimensional face reconstruction and rendering on a face in the target image according to the plurality of second face attributes after the replacement processing, so as to obtain a rendered three-dimensional face image; and the driving module 650 is configured to input the rendered three-dimensional facial image into an expression driving model, so as to perform expression driving on the face in the target image.
As a possible implementation manner of the embodiment of the present disclosure, the replacing module 630 is specifically configured to: performing replacement processing on the facial expression in the second facial attribute according to the facial expression in the plurality of first facial attributes; the facial expression after the replacement processing in the second facial attribute and the facial pose, the facial shape and the facial illumination retained by the replacement processing in the second facial attribute are used as a plurality of second facial attributes after the replacement processing.
As a possible implementation manner of the embodiment of the present disclosure, the processing module 640 is specifically configured to: according to the plurality of second face attribute coefficients after the replacement processing, performing three-dimensional face reconstruction on the face in the target image to obtain a reconstructed three-dimensional face image; and performing three-dimensional face rendering on the reconstructed three-dimensional face image to obtain a rendered three-dimensional face image.
As a possible implementation manner of the embodiment of the present disclosure, the three-dimensional expression model includes a coding layer and a decoding layer; the encoding layer is used for respectively inputting a source image and a target image into the three-dimensional expression model so as to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image; and the decoding layer is used for carrying out three-dimensional face reconstruction on the face in the target image according to the plurality of replaced second face attributes to obtain a reconstructed three-dimensional face image.
As a possible implementation manner of the embodiment of the present disclosure, the expression driving apparatus 600 further includes: the device comprises a third acquisition module, a fourth acquisition module, a reconstruction module, a rendering module and a training module.
The third acquisition module is used for acquiring a plurality of frames of sample images with expressions; the fourth acquisition module is used for inputting the sample image into a coding layer of the three-dimensional expression model, for each frame of sample image, so as to acquire the sample facial attributes corresponding to the sample image, wherein the sample facial attributes comprise at least one of a sample facial expression, a sample facial shape, a sample facial pose, and a sample facial illumination; the reconstruction module is used for inputting the sample facial expression, the sample facial shape, the sample facial pose and the sample facial illumination into a decoding layer of the three-dimensional expression model so as to carry out three-dimensional facial reconstruction on the face in the sample image and obtain a reconstructed three-dimensional sample facial image; the rendering module is used for performing three-dimensional face rendering on the reconstructed three-dimensional sample facial image to obtain a rendered three-dimensional sample facial image; and the training module is used for training the initial expression driving model according to the rendered three-dimensional sample facial image and the sample image so as to generate the expression driving model.
As a possible implementation manner of the embodiment of the present disclosure, the training module is specifically configured to: inputting the rendered three-dimensional sample facial image into an initial expression driving model to obtain an expression predicted image; determining a loss function value according to the difference between the sample image and the expression predicted image; and training the initial expression driving model according to the loss function value so as to minimize the loss function value.
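The loop "determine a loss value from the difference between the sample image and the expression predicted image, then train to minimize it" can be sketched with a toy one-parameter model. Everything here is an assumption for illustration: the patent does not specify the model architecture, the optimizer, or the exact loss, so an L1 difference and a plain gradient step stand in for them.

```python
import numpy as np

# Toy sketch of loss-driven training: the "expression driving model" is a
# single scalar weight w applied to the rendered image; the loss is the mean
# absolute difference between the sample image and the prediction.

def l1_loss(sample, pred):
    return np.abs(sample - pred).mean()

sample = np.full((8, 8), 0.8)    # stand-in sample image
rendered = np.full((8, 8), 1.0)  # stand-in rendered 3D sample facial image
w = 0.1                          # the toy model's only parameter

losses = []
for _ in range(200):
    pred = w * rendered
    losses.append(l1_loss(sample, pred))
    # subgradient of mean|w*r - s| with respect to w
    grad = (np.sign(w * rendered - sample) * rendered).mean()
    w -= 0.05 * grad             # gradient step toward a smaller loss value
```

After training, the loss has dropped from its initial value toward zero, which is the sense in which the initial model is trained "so as to minimize the loss function value".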
As a possible implementation manner of the embodiment of the present disclosure, the training module is specifically configured to: perform image normalization processing on the rendered three-dimensional sample facial image and the sample image to obtain a target three-dimensional sample facial image; input the target three-dimensional sample facial image into the initial expression driving model to obtain an expression predicted image; determine a loss function value according to the difference between the sample image and the expression predicted image; and train the initial expression driving model according to the loss function value so as to minimize the loss function value.
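The image normalization step can be sketched as below. The patent does not fix a normalization scheme, so min-max scaling to [0, 1] is assumed here purely for illustration; mean/variance standardization would be an equally plausible reading.

```python
import numpy as np

# Hypothetical min-max normalization: map pixel values into [0, 1] so the
# rendered image and the sample image share a common value range before
# being fed to the initial expression driving model.

def normalize(img: np.ndarray) -> np.ndarray:
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)    # constant image: nothing to scale
    return (img - lo) / (hi - lo)

rendered = np.array([[0.0, 128.0], [192.0, 255.0]])
target = normalize(rendered)         # target three-dimensional sample facial image
```

Applying the same function to the sample image puts both inputs on the same scale, which is the practical purpose of this preprocessing step.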
The expression driving apparatus of the embodiment of the present disclosure acquires a source image with an expression and a target image without an expression; inputs the source image and the target image respectively into a three-dimensional expression model to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image; replaces corresponding facial attributes among the plurality of second facial attributes with at least part of the plurality of first facial attributes to obtain a plurality of replaced second facial attributes; performs three-dimensional facial reconstruction and rendering on the face in the target image according to the plurality of replaced second facial attributes to obtain a rendered three-dimensional facial image; and inputs the rendered three-dimensional facial image into an expression driving model to perform expression driving on the face in the target image. In this way, the facial expression and facial pose in the source image and the target image can be decoupled, so that the facial expression and facial pose of the target image can be controlled independently, better satisfying more diversified expression driving requirements.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of the personal information of the users involved are all performed with the consent of the users, comply with relevant laws and regulations, and do not violate public order and good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, a modem, a wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that the various forms of flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, which is not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Claims (13)
1. An expression driving method comprising:
acquiring a source image with an expression and a target image without the expression;
inputting the source image and the target image into a three-dimensional expression model respectively to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image;
replacing corresponding face attributes in the second face attributes with at least part of face attributes in the first face attributes to obtain replaced second face attributes;
according to the plurality of second facial attributes after the replacement processing, performing three-dimensional facial reconstruction and rendering on the face in the target image to obtain a rendered three-dimensional facial image;
inputting the rendered three-dimensional facial image into an expression driving model so as to drive the face in the target image in an expression manner;
before the inputting the rendered three-dimensional facial image into an expression driving model, the method further includes:
acquiring a plurality of frames of sample images with expressions;
inputting the sample image into a coding layer of the three-dimensional expression model aiming at each frame of the sample image so as to obtain a sample face attribute corresponding to the sample image; wherein the sample facial attributes comprise: at least one of a sample facial expression, a sample facial shape, a sample facial pose, and a sample facial illumination;
inputting the sample facial expression, the sample facial shape, the sample facial posture and the sample facial illumination into a decoding layer of the three-dimensional expression model so as to carry out three-dimensional facial reconstruction on the face in the sample image and obtain a reconstructed three-dimensional sample facial image;
performing three-dimensional face rendering on the reconstructed three-dimensional sample face image to obtain a rendered three-dimensional sample face image;
training an initial expression driving model according to the rendered three-dimensional sample facial image and the sample image to generate the expression driving model;
the training of an initial expression driving model according to the rendered three-dimensional sample facial image and the sample image to generate the expression driving model comprises:
inputting the rendered three-dimensional sample facial image into an initial expression driving model to obtain an expression predicted image;
determining a loss function value according to the difference between the sample image and the expression predicted image;
training the initial expression driving model according to the loss function value so as to minimize the loss function value;
wherein the loss function values comprise a first sub-loss function value and a second sub-loss function value, and the determining the loss function values according to the difference between the sample image and the expression predicted image comprises:
determining a first sub-loss function value according to the absolute value of the difference value between the sample image and the expression predicted image;
and inputting the sample image and the expression predicted image into a trained eye view image generator to generate a semantic vector corresponding to the sample image and a semantic vector corresponding to the expression predicted image, and determining a second sub-loss function value according to the absolute value of the difference between the semantic vector corresponding to the sample image and the semantic vector corresponding to the expression predicted image.
2. The method of claim 1, wherein said replacing corresponding ones of said plurality of second facial attributes with at least some of said plurality of first facial attributes to obtain replacement processed plurality of second facial attributes comprises:
performing replacement processing on the facial expression in the second facial attribute according to the facial expression in the plurality of first facial attributes;
and using the facial expressions after the replacement processing in the second facial attributes and the facial gestures, the facial shapes and the facial illumination remained by the replacement processing in the second facial attributes as a plurality of second facial attributes after the replacement processing.
3. The method of claim 1, wherein the performing three-dimensional facial reconstruction and rendering of the face in the target image according to the plurality of second facial attributes after the replacement processing to obtain a rendered three-dimensional facial image comprises:
performing three-dimensional face reconstruction on the face in the target image according to the plurality of second face attribute coefficients after the replacement processing to obtain a reconstructed three-dimensional face image;
and performing three-dimensional face rendering on the reconstructed three-dimensional face image to obtain a rendered three-dimensional face image.
4. The method of claim 3, wherein the three-dimensional expression model comprises an encoding layer and a decoding layer;
the encoding layer is used for respectively inputting the source image and the target image into a three-dimensional expression model so as to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image;
and the decoding layer is used for carrying out three-dimensional face reconstruction on the face in the target image according to the plurality of replaced second face attributes to obtain a reconstructed three-dimensional face image.
5. The method of claim 1, wherein training an initial expression-driven model from the rendered three-dimensional sample facial image and the sample image to generate the expression-driven model comprises:
performing image normalization processing on the rendered three-dimensional sample face image and the sample image to obtain a target three-dimensional sample face image;
inputting the target three-dimensional sample facial image into an initial expression driving model to obtain an expression predicted image;
determining a loss function value according to the difference between the sample image and the expression predicted image;
and training the initial expression driving model according to the loss function value so as to minimize the loss function value.
6. An expression driving apparatus comprising:
the first acquisition module is used for acquiring a source image with an expression and a target image without the expression;
the second acquisition module is used for respectively inputting the source image and the target image into the three-dimensional expression model so as to acquire a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image;
a replacing module, configured to replace corresponding facial attributes in the plurality of second facial attributes with at least part of the plurality of first facial attributes, so as to obtain a plurality of second facial attributes after replacement processing;
the processing module is used for carrying out three-dimensional face reconstruction and rendering on the face in the target image according to the plurality of replaced second face attributes to obtain a rendered three-dimensional face image;
the driving module is used for inputting the rendered three-dimensional facial image into an expression driving model so as to drive the expression of the face in the target image;
the device further comprises:
the third acquisition module is used for acquiring a plurality of frames of sample images with expressions;
a fourth obtaining module, configured to, for each frame of the sample image, input the sample image into a coding layer of the three-dimensional expression model to obtain a sample facial attribute corresponding to the sample image; wherein the sample facial attributes comprise: at least one of a sample facial expression, a sample facial shape, a sample facial pose, and a sample facial illumination;
the reconstruction module is used for inputting the sample facial expression, the sample facial shape, the sample facial pose and the sample facial illumination into a decoding layer of the three-dimensional expression model so as to carry out three-dimensional facial reconstruction on the face in the sample image and obtain a reconstructed three-dimensional sample facial image;
the rendering module is used for performing three-dimensional face rendering on the reconstructed three-dimensional sample face image to obtain a rendered three-dimensional sample face image;
the training module is used for training an initial expression driving model according to the rendered three-dimensional sample facial image and the sample image so as to generate the expression driving model;
the training module is specifically configured to:
inputting the rendered three-dimensional sample facial image into an initial expression driving model to obtain an expression predicted image;
determining a loss function value according to the difference between the sample image and the expression predicted image;
training the initial expression driving model according to the loss function value so as to minimize the loss function value;
wherein the loss function values comprise a first sub-loss function value and a second sub-loss function value, and the determining the loss function values according to the difference between the sample image and the expression predicted image comprises:
determining a first sub-loss function value according to the absolute value of the difference value between the sample image and the expression predicted image;
and inputting the sample image and the expression predicted image into a trained eye view image generator to generate a semantic vector corresponding to the sample image and a semantic vector corresponding to the expression predicted image, and determining a second sub-loss function value according to the absolute value of the difference between the semantic vector corresponding to the sample image and the semantic vector corresponding to the expression predicted image.
7. The apparatus according to claim 6, wherein the replacement module is specifically configured to:
performing replacement processing on the facial expression in the second facial attribute according to the facial expression in the plurality of first facial attributes;
and using the facial expressions after the replacement processing in the second facial attributes and the facial gestures, the facial shapes and the facial illumination remained by the replacement processing in the second facial attributes as a plurality of second facial attributes after the replacement processing.
8. The apparatus according to claim 6, wherein the processing module is specifically configured to:
performing three-dimensional face reconstruction on the face in the target image according to the plurality of second face attribute coefficients after the replacement processing to obtain a reconstructed three-dimensional face image;
and performing three-dimensional face rendering on the reconstructed three-dimensional face image to obtain a rendered three-dimensional face image.
9. The apparatus of claim 8, wherein the three-dimensional expression model comprises an encoding layer and a decoding layer;
the encoding layer is used for respectively inputting the source image and the target image into a three-dimensional expression model so as to obtain a plurality of first facial attributes corresponding to the source image and a plurality of second facial attributes corresponding to the target image;
and the decoding layer is used for carrying out three-dimensional face reconstruction on the face in the target image according to the plurality of replaced second face attributes to obtain a reconstructed three-dimensional face image.
10. The apparatus of claim 6, wherein the training module is specifically configured to:
carrying out image normalization processing on the rendered three-dimensional sample facial image and the sample image to obtain a target three-dimensional sample facial image;
inputting the target three-dimensional sample facial image into an initial expression driving model to obtain an expression predicted image;
determining a loss function value according to the difference between the sample image and the expression predicted image;
and training the initial expression driving model according to the loss function value so as to minimize the loss function value.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
13. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the method according to any one of claims 1-5.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111117185.0A CN113870399B (en) | 2021-09-23 | 2021-09-23 | Expression driving method and device, electronic equipment and storage medium |
PCT/CN2022/088311 WO2023045317A1 (en) | 2021-09-23 | 2022-04-21 | Expression driving method and apparatus, electronic device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111117185.0A CN113870399B (en) | 2021-09-23 | 2021-09-23 | Expression driving method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113870399A CN113870399A (en) | 2021-12-31 |
CN113870399B (en) | 2022-12-02 |
Family
ID=78993646
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111117185.0A Active CN113870399B (en) | 2021-09-23 | 2021-09-23 | Expression driving method and device, electronic equipment and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113870399B (en) |
WO (1) | WO2023045317A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113870399B (en) * | 2021-09-23 | 2022-12-02 | 北京百度网讯科技有限公司 | Expression driving method and device, electronic equipment and storage medium |
CN115984947B (en) * | 2023-02-21 | 2023-06-27 | 北京百度网讯科技有限公司 | Image generation method, training device, electronic equipment and storage medium |
CN117115317B (en) * | 2023-08-10 | 2024-08-16 | 北京百度网讯科技有限公司 | Avatar driving and model training method, apparatus, device and storage medium |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110298917A (en) * | 2019-07-05 | 2019-10-01 | 北京华捷艾米科技有限公司 | A kind of facial reconstruction method and system |
CN110941332A (en) * | 2019-11-06 | 2020-03-31 | 北京百度网讯科技有限公司 | Expression driving method and device, electronic equipment and storage medium |
CN111599002A (en) * | 2020-05-15 | 2020-08-28 | 北京百度网讯科技有限公司 | Method and apparatus for generating image |
CN111968203A (en) * | 2020-06-30 | 2020-11-20 | 北京百度网讯科技有限公司 | Animation driving method, animation driving device, electronic device, and storage medium |
CN112215050A (en) * | 2019-06-24 | 2021-01-12 | 北京眼神智能科技有限公司 | Nonlinear 3DMM face reconstruction and posture normalization method, device, medium and equipment |
WO2021012590A1 (en) * | 2019-07-22 | 2021-01-28 | 广州华多网络科技有限公司 | Facial expression shift method, apparatus, storage medium, and computer device |
CN112907725A (en) * | 2021-01-22 | 2021-06-04 | 北京达佳互联信息技术有限公司 | Image generation method, image processing model training method, image processing device, and image processing program |
US11055514B1 (en) * | 2018-12-14 | 2021-07-06 | Snap Inc. | Image face manipulation |
CN113221847A (en) * | 2021-06-07 | 2021-08-06 | 广州虎牙科技有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
CN113313085A (en) * | 2021-07-28 | 2021-08-27 | 北京奇艺世纪科技有限公司 | Image processing method and device, electronic equipment and storage medium |
CN113344777A (en) * | 2021-08-02 | 2021-09-03 | 中国科学院自动化研究所 | Face changing and replaying method and device based on three-dimensional face decomposition |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101944238B (en) * | 2010-09-27 | 2011-11-23 | 浙江大学 | Data driving face expression synthesis method based on Laplace transformation |
US20200020173A1 (en) * | 2018-07-16 | 2020-01-16 | Zohirul Sharif | Methods and systems for constructing an animated 3d facial model from a 2d facial image |
GB2586260B (en) * | 2019-08-15 | 2021-09-15 | Huawei Tech Co Ltd | Facial image processing |
CN110868598B (en) * | 2019-10-17 | 2021-06-22 | 上海交通大学 | Video content replacement method and system based on countermeasure generation network |
CN113327278B (en) * | 2021-06-17 | 2024-01-09 | 北京百度网讯科技有限公司 | Three-dimensional face reconstruction method, device, equipment and storage medium |
CN113870399B (en) * | 2021-09-23 | 2022-12-02 | 北京百度网讯科技有限公司 | Expression driving method and device, electronic equipment and storage medium |
- 2021-09-23: CN application CN202111117185.0A granted as CN113870399B/en (status: active)
- 2022-04-21: WO application PCT/CN2022/088311 published as WO2023045317A1/en (status: unknown)
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11055514B1 (en) * | 2018-12-14 | 2021-07-06 | Snap Inc. | Image face manipulation |
CN112215050A (en) * | 2019-06-24 | 2021-01-12 | 北京眼神智能科技有限公司 | Nonlinear 3DMM face reconstruction and posture normalization method, device, medium and equipment |
CN110298917A (en) * | 2019-07-05 | 2019-10-01 | 北京华捷艾米科技有限公司 | A kind of facial reconstruction method and system |
WO2021012590A1 (en) * | 2019-07-22 | 2021-01-28 | 广州华多网络科技有限公司 | Facial expression shift method, apparatus, storage medium, and computer device |
CN110941332A (en) * | 2019-11-06 | 2020-03-31 | 北京百度网讯科技有限公司 | Expression driving method and device, electronic equipment and storage medium |
CN111599002A (en) * | 2020-05-15 | 2020-08-28 | 北京百度网讯科技有限公司 | Method and apparatus for generating image |
CN111968203A (en) * | 2020-06-30 | 2020-11-20 | 北京百度网讯科技有限公司 | Animation driving method, animation driving device, electronic device, and storage medium |
CN112907725A (en) * | 2021-01-22 | 2021-06-04 | 北京达佳互联信息技术有限公司 | Image generation method, image processing model training method, image processing device, and image processing program |
CN113221847A (en) * | 2021-06-07 | 2021-08-06 | 广州虎牙科技有限公司 | Image processing method, image processing device, electronic equipment and computer readable storage medium |
CN113313085A (en) * | 2021-07-28 | 2021-08-27 | 北京奇艺世纪科技有限公司 | Image processing method and device, electronic equipment and storage medium |
CN113344777A (en) * | 2021-08-02 | 2021-09-03 | 中国科学院自动化研究所 | Face changing and replaying method and device based on three-dimensional face decomposition |
Non-Patent Citations (1)
Title |
---|
Real-time facial expression transfer method combining 3DMM and GAN; Gao Xiang et al.; Computer Applications and Software; 2020-04-12 (Issue 04); pp. 119-126 * |
Also Published As
Publication number | Publication date |
---|---|
WO2023045317A1 (en) | 2023-03-30 |
CN113870399A (en) | 2021-12-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113870399B (en) | Expression driving method and device, electronic equipment and storage medium | |
CN113643412A (en) | Virtual image generation method and device, electronic equipment and storage medium | |
CN113963110B (en) | Texture map generation method and device, electronic equipment and storage medium | |
CN112562069B (en) | Method, device, equipment and storage medium for constructing three-dimensional model | |
EP3876204A2 (en) | Method and apparatus for generating human body three-dimensional model, device and storage medium | |
CN114792355B (en) | Virtual image generation method and device, electronic equipment and storage medium | |
CN113052962B (en) | Model training method, information output method, device, equipment and storage medium | |
CN113658309A (en) | Three-dimensional reconstruction method, device, equipment and storage medium | |
CN114549710A (en) | Virtual image generation method and device, electronic equipment and storage medium | |
CN112989970A (en) | Document layout analysis method and device, electronic equipment and readable storage medium | |
CN112580666A (en) | Image feature extraction method, training method, device, electronic equipment and medium | |
CN115147265A (en) | Virtual image generation method and device, electronic equipment and storage medium | |
CN114937478B (en) | Method for training a model, method and apparatus for generating molecules | |
CN114549728A (en) | Training method of image processing model, image processing method, device and medium | |
CN112528995A (en) | Method for training target detection model, target detection method and device | |
CN113365146A (en) | Method, apparatus, device, medium and product for processing video | |
CN113379877A (en) | Face video generation method and device, electronic equipment and storage medium | |
CN113421335B (en) | Image processing method, image processing apparatus, electronic device, and storage medium | |
CN113962845B (en) | Image processing method, image processing apparatus, electronic device, and storage medium | |
CN113380269B (en) | Video image generation method, apparatus, device, medium, and computer program product | |
CN112862934A (en) | Method, apparatus, device, medium, and product for processing animation | |
CN115359166B (en) | Image generation method and device, electronic equipment and medium | |
CN115147547B (en) | Human body reconstruction method and device | |
CN113240780B (en) | Method and device for generating animation | |
CN114078097A (en) | Method and device for acquiring image defogging model and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||