WO2022156532A1 - Three-dimensional face model reconstruction method, apparatus, electronic device and storage medium - Google Patents
Three-dimensional face model reconstruction method, apparatus, electronic device and storage medium
- Publication number
- WO2022156532A1 (PCT/CN2022/070257)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- face
- dimensional
- parameters
- model
- camera
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
Definitions
- the present application relates to the technical field of computer vision, and in particular, to a method, device, electronic device and storage medium for reconstructing a three-dimensional face model.
- Facial expression capture plays an important role in many fields, such as movies, games, criminal investigation, and video surveillance.
- Relatively accurate facial expressions can indeed be captured by adopting methods that are costly and involve complicated processes; for example, accurate expressions can be obtained by capturing reflective marker points attached to the human face. However, the huge cost of such methods and the discomfort they bring to users seriously hinder their development and adoption.
- Single-camera face image acquisition has the characteristics of low cost, easy installation, and user-friendliness, but a single-camera face image generally contains two-dimensional information from only one perspective and can hardly provide three-dimensional information. Therefore, in order to obtain vivid facial expressions, it is necessary to reconstruct a 3D face model from a single-camera face image.
- the current 3D face models reconstructed based on single-camera face images often have low accuracy and are difficult to capture realistic facial expressions.
- the purpose of the embodiments of the present application is to provide a three-dimensional face model reconstruction method, apparatus, electronic device and storage medium, which can quickly and accurately reconstruct a three-dimensional face model.
- the three-dimensional face model reconstruction method, device, electronic device, and storage medium provided by the embodiments of the present application are implemented as follows:
- a three-dimensional face model reconstruction method comprises:
- the three-dimensional face model is determined by using the identity parameters, face posture parameters and expression parameters, so that the two-dimensional feature points of the face determined by the three-dimensional face model match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information.
- the acquiring a single-camera face image includes:
- Face detection is performed on the single-camera image, and a single-camera face image is cropped from the single-camera image.
- the face information prediction model is set to be obtained by training in the following manner:
- the single-camera face sample image is input into the face information prediction model to generate a prediction result, where the prediction result includes the predicted two-dimensional feature points of the face and texture mapping information;
- the model parameters are iteratively adjusted until the difference meets a preset requirement.
- the single-camera face sample image is set to be acquired in the following manner:
- according to the two-dimensional feature points of the face and/or the texture mapping information, face images are separated from the multiple single-camera images, and the multiple face images are used as the single-camera face sample images for training the face information prediction model.
- in one embodiment, determining the three-dimensional face model by using the identity parameters, face posture parameters and expression parameters, so that the two-dimensional feature points of the face determined by the three-dimensional face model are matched with the two-dimensional feature points of the target face and the determined texture mapping information is matched with the target texture mapping information, includes:
- alternately fixing one or two of the identity parameters, the face posture parameters and the expression parameters, and adjusting the other two parameters or one parameter to generate a three-dimensional face model, so that the two-dimensional feature points of the face determined by the three-dimensional face model match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information;
- wherein the iterative adjustment is performed based on the differences between the predicted face two-dimensional feature points and the target face two-dimensional feature points, and between the predicted texture mapping information and the target texture mapping information, until at least one of the difference or the number of iterations meets a preset requirement.
- iteratively adjusting the other two parameters or one parameter based on the differences between the predicted face two-dimensional feature points and the target face two-dimensional feature points, and between the predicted texture mapping information and the target texture mapping information, until at least one of the difference or the number of iterations meets the preset requirement, includes: obtaining a prior probability distribution result and a prior probability target value of at least one of the identity parameters, the face posture parameters and the expression parameters, and performing the iterative adjustment also based on the difference between the prior probability distribution result and the prior probability target value.
- when the number N of the single-camera face images is greater than or equal to 2 and the N single-camera face images belong to the same face, determining a three-dimensional face model by using identity parameters, face pose parameters and expression parameters, so that the two-dimensional feature points of the face determined by the three-dimensional face model match the two-dimensional feature points of the target face and the determined texture mapping information matches the target texture mapping information, includes:
- based on the N single-camera face images, alternately fixing the identity parameters or the face posture parameters and the expression parameters, adjusting the face posture parameters, the expression parameters or the identity parameters, and jointly optimizing to generate N three-dimensional face models with the same identity parameters, so that the two-dimensional feature points of the face determined by the N three-dimensional face models match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information;
- wherein the face posture parameters, the expression parameters or the identity parameters are iteratively adjusted until at least one of the difference or the number of iterations satisfies a preset requirement.
- the method further includes:
- in the process of reconstructing a three-dimensional face model for a subsequent single-camera face image, the jointly optimized identity parameters of the N predicted three-dimensional face models are used and fixed, and the face pose parameters and the expression parameters are adjusted, until at least one of the difference or the number of iterations satisfies a preset requirement.
- the three-dimensional face model includes a three-dimensional model composed of a preset number of interconnected polygonal meshes, and the positions of the mesh vertices of the polygonal meshes are determined by the identity parameters, the face posture parameters and the expression parameters.
- the method further includes:
- the three-dimensional eyeball model includes eye gaze information
- the three-dimensional face model and the three-dimensional eyeball model are combined into a new three-dimensional face model.
- a three-dimensional face model reconstruction device comprising:
- the information prediction module is used to input the single-camera face image into the face information prediction model, and output the two-dimensional feature points of the target face and the target in the single-camera face image through the face information prediction model texture mapping information;
- a model determination module is used to determine a three-dimensional face model using identity parameters, face posture parameters and expression parameters, so that the two-dimensional feature points of the face determined by the three-dimensional face model match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information.
- An electronic device includes a processor and a memory for storing instructions executable by the processor; when the processor executes the instructions, the three-dimensional face model reconstruction method is implemented.
- A non-transitory computer-readable storage medium, wherein when the instructions in the storage medium are executed by a processor, the processor is enabled to execute the three-dimensional face model reconstruction method.
- the three-dimensional face model reconstruction method provided by the present application can reconstruct a three-dimensional face model from a single-camera face image, leveraging the low cost, easy installation and user-friendliness of single-camera image acquisition. The ease of acquiring single-camera face images not only reduces the construction cost of the face information prediction model, but also makes the reconstruction of the 3D face model faster and easier. In the reconstruction process, the two-dimensional face feature points and texture mapping information output by the face information prediction model can effectively improve the accuracy and robustness of the model reconstruction.
- the three-dimensional face model is reconstructed from identity parameters, face pose parameters and expression parameters, providing accurate and reliable technical solutions for single-camera virtual live broadcast, single-camera intelligent interaction, face recognition, criminal investigation and surveillance, movies and games, expression analysis and other technical fields.
- the present invention focuses on solving the above pain points, and proposes a high-precision, real-time single-camera facial expression capture scheme.
- FIG. 1 is a schematic flowchart of a method for reconstructing a three-dimensional face model according to an exemplary embodiment.
- FIG. 2 is a schematic flowchart of a method for reconstructing a three-dimensional face model according to an exemplary embodiment.
- FIG. 3 is a schematic flowchart of a method for reconstructing a three-dimensional face model according to an exemplary embodiment.
- FIG. 4 is a schematic flowchart of a method for reconstructing a three-dimensional face model according to an exemplary embodiment.
- FIG. 5 is a schematic flowchart of a method for reconstructing a three-dimensional face model according to an exemplary embodiment.
- FIG. 6 is a schematic flowchart of a method for reconstructing a three-dimensional face model according to an exemplary embodiment.
- FIG. 7 is a block diagram of an apparatus for reconstructing a three-dimensional face model according to an exemplary embodiment.
- FIG. 8 is a block diagram of an apparatus for reconstructing a three-dimensional face model according to an exemplary embodiment.
- FIG. 1 is a schematic flowchart of an embodiment of a three-dimensional face model reconstruction method provided by the present application.
- although the present application provides the method operation steps shown in the following embodiments or drawings, the method may include more or fewer operation steps based on routine practice or without creative effort. For steps that logically have no necessary causal relationship, the execution order of these steps is not limited to the execution order provided by the embodiments of the present application.
- the method may be executed sequentially or in parallel (for example, in a parallel-processor or multi-threaded processing environment) according to the methods shown in the embodiments or the accompanying drawings.
- as shown in FIG. 1, an embodiment of the three-dimensional face model reconstruction method provided by the present application may include:
- S101 Acquire a single-camera face image.
- S103 Input the single-camera face image into a face information prediction model, and output the target face two-dimensional feature points and target texture mapping information in the single-camera face image through the face information prediction model.
- S105 Determine a three-dimensional face model using identity parameters, face posture parameters and expression parameters, so that the two-dimensional feature points of the face determined by the three-dimensional face model match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information.
- the single-camera face image may include a face image captured by a single camera.
- the single camera may include a single camera device, such as a single-lens reflex camera or a smart device with a camera function (such as a smartphone, tablet computer, or smart wearable device); the camera can be an RGB camera, an RGBD camera, etc.
- the single-camera face image may include an image in any format, such as an RGB image, a grayscale image, and the like. In an actual application environment, the image captured by the camera not only includes a face image, but also includes a background image other than the face.
- a single-camera face image that contains only the human face as far as possible can be cut out from the image captured by the single camera.
- a single-camera image including a face image may be acquired.
- face detection can be performed on the single-camera image, and a single-camera face image can be cropped from the single-camera image.
- a face detection algorithm based on machine learning can be used to detect the face in the single-camera image, and the single-camera face image can be cropped from the single-camera image.
- the face detection algorithm may include algorithms such as R-CNN, Fast R-CNN, Faster R-CNN, TCDCN, MTCNN, YOLOV3, SSD, etc., which are not limited here.
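- By way of illustration only, the detect-and-crop step might look as follows; this is a minimal sketch using OpenCV's bundled Haar cascade, just one of many possible detectors (the embodiment cites R-CNN, MTCNN, YOLOV3, SSD, etc., and mandates none of them). The file name is a placeholder.

```python
# Minimal sketch of detecting and cropping a single-camera face image.
# The detector choice and file name are illustrative assumptions.
import cv2

def crop_single_camera_face(image_path: str):
    """Detect the largest face in a single-camera image and crop it."""
    image = cv2.imread(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # no face in this frame
    # Keep the largest detection so the crop contains only the face
    # as far as possible, as the embodiment suggests.
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    return image[y:y + h, x:x + w]

face_crop = crop_single_camera_face("frame.jpg")  # hypothetical file name
```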
- after the single-camera face image is acquired, it can be input into the face information prediction model, and the target face two-dimensional feature points and the target texture mapping information of the single-camera face image are output through the face information prediction model.
- the two-dimensional feature points of the human face include key points used to characterize the facial features of the human face.
- for example, the two-dimensional feature points may include 15 face contour feature points, 16 eye feature points (8 for each of the left and right eyes), 12 eyebrow feature points (6 for each of the left and right eyebrows), 12 nose feature points and 18 mouth feature points, for a total of 73 feature points.
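- The 73-point grouping above can be recorded as a small sanity-checked constant; the grouping comes from this embodiment, while the dictionary layout and names are illustrative assumptions.

```python
# Landmark layout from the embodiment; group names are illustrative.
LANDMARK_GROUPS = {
    "face_contour": 15,
    "eyes": 16,       # 8 per eye
    "eyebrows": 12,   # 6 per brow
    "nose": 12,
    "mouth": 18,
}
assert sum(LANDMARK_GROUPS.values()) == 73  # the 73-point layout
```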
- the edges of the polygon meshes are shared with the adjacent polygon meshes. Since the number of the polygon meshes is fixed, the number of mesh vertices of the three-dimensional face model is also fixed. Since the identity parameters, face pose parameters and expression parameters are unknown in the initial stage, the positions of the mesh vertices in the three-dimensional face model are default positions. In subsequent embodiments, the process of reconstructing the three-dimensional face model is the process of adjusting the positions of the vertices of the mesh.
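- For illustration, one common way to realize vertex positions that are determined by identity, expression and pose parameters is a linear blendshape (3DMM-style) model; the linear form below is an assumption, since the patent only states that these parameters determine the positions, not how.

```python
# A hedged sketch: mesh vertices as a linear function of identity and
# expression coefficients, followed by a rigid pose transform.
import numpy as np

def face_vertices(mean_shape, id_basis, exp_basis, alpha, beta, R, t):
    """
    mean_shape: (V, 3) default vertex positions (the positions used before
                any parameter is known).
    id_basis:   (V, 3, K_id) identity basis; alpha: (K_id,) identity params.
    exp_basis:  (V, 3, K_exp) expression basis; beta: (K_exp,) expr params.
    R, t:       (3, 3) rotation and (3,) translation -- the pose parameters.
    """
    shape = mean_shape + id_basis @ alpha + exp_basis @ beta  # (V, 3)
    return shape @ R.T + t  # same fixed vertex count, new positions
```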
- the mesh vertices may have unique identifiers; for example, the unique identifiers may include the (u, v) coordinates of the texture mapping. Thus, the texture mapping information may include the mapping relationship from face image pixels to the unique identifiers of mesh vertices; for example, the pixel whose coordinate position is (34, 17) in the single-camera image corresponds to the mesh vertex whose texture coordinates are (0.2, 0.5) in the three-dimensional face model.
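- Illustratively, such texture mapping information could be stored as a dense per-pixel (u, v) map; this array layout is an assumption, as the patent only specifies the pixel-to-vertex-identifier relationship itself.

```python
# Dense (u, v) map: one texture coordinate pair per image pixel.
import numpy as np

H, W = 256, 256                    # crop resolution (illustrative)
uv_map = np.full((H, W, 2), -1.0)  # (u, v) per pixel; -1 marks "no vertex"
# The example from the text: the pixel at coordinate position (34, 17),
# i.e. column 34 / row 17, maps to the mesh vertex with texture
# coordinates (0.2, 0.5).
uv_map[17, 34] = (0.2, 0.5)
```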
- the face information prediction model may include a multi-task machine learning model, for example, a multi-task deep learning network that can implement the two kinds of prediction tasks.
- S201 Acquire a plurality of single-camera face sample images, where the single-camera face sample images are marked with face two-dimensional feature points and texture mapping information.
- S203 Build a face information prediction model, where model parameters are set in the face information prediction model.
- S205 Input the single-camera face sample image into the face information prediction model to generate a prediction result, where the prediction result includes the predicted two-dimensional feature points of the face and texture mapping information.
- the preset requirement may include, for example, that the value of the difference is smaller than a preset threshold.
- the prediction result may include various information such as the two-dimensional feature points of the face and the texture mapping information, and the difference may include the differences between each prediction result and the corresponding annotated face two-dimensional feature points and texture mapping information, respectively.
- the face information that can be output by the face information prediction model is not limited to the above-mentioned two-dimensional face feature points and the texture mapping information, and can also include any other face information, which is not limited in this application.
- the machine learning algorithm for training the face information prediction model may include a ResNet backbone network, a MobileNet backbone network, a VGG backbone network, etc., which is not limited here.
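- A minimal PyTorch sketch of such a multi-task model follows: a shared ResNet backbone (one of the options listed above) with one head regressing the 73 two-dimensional feature points and one head regressing a dense (u, v) texture-mapping map. The head shapes and output resolutions are assumptions, not values from the patent.

```python
# Multi-task face information prediction model (sketch).
import torch
import torch.nn as nn
import torchvision

class FaceInfoNet(nn.Module):
    def __init__(self, n_landmarks: int = 73):
        super().__init__()
        self.n_landmarks = n_landmarks
        backbone = torchvision.models.resnet18(weights=None)
        backbone.fc = nn.Identity()       # expose the 512-d pooled features
        self.backbone = backbone
        self.landmark_head = nn.Linear(512, n_landmarks * 2)
        self.uv_head = nn.Sequential(     # coarse (u, v) map, then upsample
            nn.Linear(512, 2 * 32 * 32),
            nn.Unflatten(1, (2, 32, 32)),
            nn.Upsample(size=(128, 128), mode="bilinear",
                        align_corners=False),
        )

    def forward(self, x):                 # x: (B, 3, H, W)
        feat = self.backbone(x)           # (B, 512)
        lmk = self.landmark_head(feat).view(-1, self.n_landmarks, 2)
        uv = self.uv_head(feat)           # (B, 2, 128, 128)
        return lmk, uv

# Training then follows S201-S205: compare (lmk, uv) with the annotations
# and iterate until the difference meets the preset requirement.
model = FaceInfoNet()
landmarks, uv_out = model(torch.randn(1, 3, 224, 224))
```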
- the single-camera face sample image can be obtained in the following manner:
- S301 Use multiple cameras to acquire multiple single-camera images of the same face from different angles at the same time.
- S303 Reconstruct a three-dimensional face model of the face by using the multiple single-camera images.
- S305 Project the three-dimensional face model into the multiple single-camera images, and obtain the two-dimensional face feature points and texture mapping information in the multiple single-camera images respectively.
- S307 According to the two-dimensional feature points of the face and/or the texture mapping information, separate face images from the multiple single-camera images respectively, and use the multiple face images as the single-camera face sample images for training the face information prediction model.
- multiple cameras can be used to photograph the same face from multiple angles at the same time, so that multiple single-camera images of the face can be acquired.
- for example, 5 cameras shooting simultaneously yield 5 images, so that 5 single-camera images can be obtained at one time.
- a three-dimensional face model of the human face can be obtained by reconstructing the multiple single-camera images, and identity parameters, face posture parameters and expression parameters can be determined through the three-dimensional face model.
- the three-dimensional face model can be projected back into multiple single-camera images, respectively, to obtain two-dimensional face feature points and texture mapping information in each single-camera image, respectively.
- the 3D face model used for multi-camera reconstruction and the 3D face model used for subsequent single-camera reconstruction need to have the same topology, that is, the same vertex connection relationship.
- since the single-camera image includes face information, the single-camera image can be segmented according to the face information, and the segmented face image can be used as the single-camera face sample image for training the face information prediction model, where the face information can include the two-dimensional face feature points and the texture mapping information.
- the two-dimensional feature points and texture mapping information of the face in each single-camera image can thus be obtained. Therefore, a single-camera face sample image can be obtained by segmenting the face image from the single-camera image according to the two-dimensional feature points of the face.
- the BoundingBox algorithm can be utilized for image separation.
- the method of generating the single-camera face sample image in this embodiment can greatly save the cost of manual labeling, and can acquire a large amount of sample data in less time, thereby reducing the cost of acquiring training samples.
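- As an illustration of the projection in S305, a pinhole-camera sketch is given below; the per-camera intrinsics K and extrinsics (R, t) are assumed to be known from the multi-camera setup, a detail the patent leaves open.

```python
# Project reconstructed 3D vertices back into one camera view (sketch).
import numpy as np

def project_points(X, K, R, t):
    """X: (N, 3) model vertices; returns (N, 2) pixel coordinates."""
    Xc = X @ R.T + t             # world -> this camera's coordinates
    x = Xc @ K.T                 # apply the camera intrinsics
    return x[:, :2] / x[:, 2:3]  # perspective divide

# One call per camera view yields that view's annotated 2D feature points
# (and, with per-vertex uv identifiers, its texture mapping information),
# so no manual labelling is needed.
```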
- after the target face two-dimensional feature points and the target texture mapping information are obtained, the three-dimensional face model determined by the identity parameters, the face pose parameters and the expression parameters can be reconstructed, so that the two-dimensional feature points of the face determined from the three-dimensional face model match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information.
- specifically, by continuously adjusting the identity parameters, the face posture parameters and the expression parameters, the two-dimensional feature points of the face determined by the generated three-dimensional face model can be made to match the two-dimensional feature points of the target face, and the determined texture mapping information to match the target texture mapping information.
- an analysis-by-synthesis algorithm can be used to adjust parameters to determine the three-dimensional face model.
- in one embodiment, one or two of the identity parameters, face posture parameters and expression parameters can be alternately fixed, and the other two parameters or one parameter can be adjusted to generate a three-dimensional face model, so that the two-dimensional feature points of the face determined by the three-dimensional face model match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information.
- that is, alternate optimization strategies such as "fix the identity parameters and optimize the face posture parameters and expression parameters" and "fix the face posture parameters and expression parameters and optimize the identity parameters" can be adopted; this alternate optimization can make the three-dimensional face model converge quickly and improve the optimization efficiency.
- in one embodiment of the present application, alternately fixing one or two of the identity parameters, face posture parameters and expression parameters and adjusting the other two parameters or one parameter to generate a three-dimensional face model may include:
- S401 Alternately fix one or two of the identity parameters, the face posture parameters and the expression parameters, and adjust the other two or one of the parameters to generate a predicted three-dimensional face model;
- S403 Project the predicted three-dimensional face model into the single-camera face image, and obtain predicted two-dimensional feature points of the face and predicted texture mapping information;
- S405 Based on the differences between the predicted two-dimensional feature points of the face and the two-dimensional feature points of the target face, and between the predicted texture mapping information and the target texture mapping information, iteratively adjust the other two parameters or one parameter until at least one of the difference or the number of iterations satisfies a preset requirement.
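- A hedged sketch of the S401-S405 loop follows, using gradient descent (one of the optimizers cited below). Here render_features is a hypothetical differentiable stand-in for projecting the predicted model into the image, and the parameters are assumed to be tensors created with requires_grad=True.

```python
# Alternate-fixing optimization loop (sketch of S401-S405).
import torch

def fit(alpha, pose, beta, target_lmk, target_uv, render_features,
        outer_iters=7, inner_iters=20, tol=0.01):
    """alpha/pose/beta: leaf tensors with requires_grad=True."""
    params = {"id": alpha, "pose": pose, "expr": beta}
    loss = torch.tensor(float("inf"))
    for _ in range(outer_iters):
        # S401: fix identity and adjust pose + expression, then fix
        # pose + expression and adjust identity.
        for free in (["pose", "expr"], ["id"]):
            opt = torch.optim.SGD([params[k] for k in free], lr=1e-2)
            for _ in range(inner_iters):
                opt.zero_grad()
                # S403: project the predicted model into the image.
                lmk, uv = render_features(params["id"], params["pose"],
                                          params["expr"])
                # S405: difference between predicted and target information.
                loss = ((lmk - target_lmk) ** 2).mean() \
                     + ((uv - target_uv) ** 2).mean()
                loss.backward()
                opt.step()
        if loss.item() <= tol:  # difference meets the preset requirement
            break
    return params
```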
- an initial three-dimensional face model can be provided.
- the initial three-dimensional face model is the three-dimensional face model when parameters have not yet been optimized.
- the 3D face model can be generated based on default identity parameters, default face pose parameters and default expression parameters.
- the default parameters can be determined according to the average values of the identity parameters, face posture parameters and expression parameters stored in a preset database, or can be determined by using the identity parameters, face posture parameters and expression parameters reconstructed from the previous frame of single-camera face image, which is not limited in this application. In addition, this application does not limit whether to first "fix the identity parameters to optimize the face posture parameters and expression parameters" or to first "fix the face posture parameters and the expression parameters to optimize the identity parameters".
- the face pose parameters and expression parameters can be optimized by fixing the identity parameters first.
- first, the two-dimensional feature points of the target face and the target texture mapping information of single-camera face image 1 can be determined, for example, 73 target feature points and the texture mapping information.
- then, the initial three-dimensional face model can be projected into single-camera face image 1 to obtain the predicted face two-dimensional feature points and the predicted texture mapping information; the differences between the predicted and target two-dimensional feature points and between the predicted and target texture mapping information can be determined, and based on these differences the face posture parameters and the expression parameters may be adjusted.
- next, the face posture parameters and expression parameters are fixed to optimize the identity parameters; the adjustment method is the same as that used when fixing the identity parameters to optimize the face posture parameters and the expression parameters, and will not be repeated here.
- in this way, the identity parameters, the face pose parameters and the expression parameters are adjusted alternately and iteratively until at least one of the differences between the predicted and target face two-dimensional feature points and between the predicted and target texture mapping information, or the number of iterations, satisfies the preset requirement.
- the iterative adjustment method may include a gradient descent optimization algorithm (Gradient-based Optimization), a particle swarm optimization algorithm (Particle Swarm Optimization), etc., which is not limited in this application.
- the preset requirement corresponding to the difference may include that the value of the difference is less than or equal to a preset threshold, and the preset threshold may be set to a value such as 0 or 0.01.
- the preset requirement corresponding to the number of iterations may include that the number of iterations reaches a preset number of times, and the preset number may be set to, for example, 5 times or 7 times.
- if the set of parameters determined when at least one of the difference or the number of iterations meets the preset requirement is (identity parameter 1, face pose parameter 1, expression parameter 1), then (identity parameter 1, face pose parameter 1, expression parameter 1) determines the predicted three-dimensional face model.
- the reconstructed 3D face model has many possibilities, and therefore a certain degree of ambiguity may occur; that is, the reconstructed 3D face model may not be a natural and realistic face state.
- based on this, the prior probability distribution result and the prior probability target value of at least one of the identity parameters, the face posture parameters and the expression parameters may also be obtained, and by comparing the prior probability distribution result with the prior probability target value, the prior probability distribution result is prevented from exceeding a reasonable range.
- the prior probability target value can be determined according to a large amount of real collected face data, therefore, the ambiguity of the reconstructed three-dimensional face model can be effectively reduced.
- S501 Obtain a prior probability distribution result and a prior probability target value of at least one of the identity parameters, the face posture parameters and the expression parameters;
- S503 Based on the differences between the predicted face two-dimensional feature points and the target face two-dimensional feature points, between the predicted texture mapping information and the target texture mapping information, and between the prior probability distribution result of at least one of the identity parameters, the face posture parameters and the expression parameters and the prior probability target value, iteratively adjust the other two parameters or one parameter until at least one of the difference or the number of iterations meets the preset requirement.
- in this embodiment, the difference between the prior probability distribution result and the prior probability target value of at least one of the identity parameters, the face posture parameters and the expression parameters is also used as a convergence condition of the three-dimensional face model, which can effectively reduce the ambiguity of the reconstructed 3D face model.
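- One concrete realization of this prior constraint (an assumption; the patent does not mandate a specific form) is a Gaussian prior over a parameter vector, whose negative log-likelihood reduces to a weighted squared-norm penalty pulling the parameter toward statistics gathered from real collected face data:

```python
# Prior-probability penalty added to the fitting loss (sketch of S501-S503).
import torch

def prior_penalty(coeffs, mean, std, weight=1e-3):
    """Mahalanobis-style penalty; mean/std are statistics of this parameter
    estimated from a large amount of real collected face data."""
    z = (coeffs - mean) / std
    return weight * (z ** 2).sum()

# total_loss = landmark_term + texture_term + prior_penalty(alpha, mu, sigma)
```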
- in the process of reconstructing a three-dimensional face model for a subsequent single-camera face image of the same user, the identity parameters can be fixed; that is, the identity parameters obtained from single-camera face image 1 can be used continuously, and during the optimization process only the face posture parameters and the expression parameters need to be optimized, which simplifies the optimization process and improves the reconstruction efficiency of the face model.
- multiple single-camera face images of the user can be jointly optimized at the same time to improve the reconstruction efficiency.
- multiple single-camera face images of the same user may be acquired, such as 20 frames of face images with different expressions captured in real time.
- alternate optimization strategies such as "optimize face pose parameters and expression parameters with fixed identity parameters” and “optimize identity parameters with fixed face pose parameters and expression parameters” can be used.
- assume that the number of single-camera face images participating in the joint optimization is N, and that the N single-camera face images belong to the same face.
- based on the N single-camera face images, the identity parameters or the face posture parameters and the expression parameters are alternately fixed, the face posture parameters, the expression parameters or the identity parameters are adjusted, and joint optimization generates N three-dimensional face models with the same identity parameters, so that the two-dimensional feature points of the face determined by the N three-dimensional face models respectively match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information.
- specifically, it includes:
- S601 Alternately fix the identity parameters or the face posture parameters and the expression parameters, adjust the face posture parameters, the expression parameters or the identity parameters, and generate N predicted three-dimensional face models; wherein, when the face posture parameters and the expression parameters are fixed to adjust the identity parameters, jointly optimize the identity parameters of the N predicted three-dimensional face models;
- S603 Project the N predicted three-dimensional face models into the corresponding single-camera face images, respectively, to obtain predicted face two-dimensional feature points and predicted texture mapping information;
- S605 Based on the differences between the predicted face two-dimensional feature points and the target face two-dimensional feature points, and between the predicted texture mapping information and the target texture mapping information, iteratively adjust the face posture parameters, the expression parameters or the identity parameters until at least one of the difference or the number of iterations satisfies a preset requirement.
- since the identity parameters of the same face are the same, the identity parameters of the N predicted three-dimensional face models are jointly optimized when the face posture parameters and the expression parameters are fixed to adjust the identity parameters.
- the technical solutions of the above embodiments not only have the fast convergence and high reconstruction efficiency of alternate optimization, but also jointly optimize multiple single-camera face images by exploiting the fact that the identity parameters of the same face are identical. In this way, multiple 3D face models can be reconstructed through one optimization, which greatly improves the reconstruction efficiency.
- in one embodiment, in the process of reconstructing a three-dimensional face model for a subsequent single-camera face image, the jointly optimized identity parameters of the N predicted three-dimensional face models are used and fixed, and the face pose parameters and the expression parameters are adjusted until at least one of the difference or the number of iterations meets a preset requirement.
- the face pose parameters and expression parameters can be optimized by fixing the identity parameters.
- the two-dimensional feature points of the target face and the target texture mapping information of the N single-camera face images are respectively determined by using the implementations of S101 and S103.
- N initial three-dimensional face models may be obtained; the manner of obtaining the initial three-dimensional face models may refer to the foregoing embodiment, which is not limited herein. Projecting the N initial three-dimensional face models into the corresponding single-camera face images respectively can obtain N first predicted face two-dimensional feature points and first predicted texture mapping information, and the corresponding differences can be determined for each of the N models.
- based on these differences, the face pose parameters and the expression parameters of the N models are adjusted respectively, and N groups of parameters are obtained: (identity parameter 1, face pose parameter 1, expression parameter 1), (identity parameter 1, face pose parameter 2, expression parameter 2), ..., (identity parameter 1, face pose parameter N, expression parameter N). According to the N groups of parameters, N first predicted three-dimensional face models can be determined. Then, the face pose parameters and expression parameters can be fixed, and the identity parameters can be optimized.
- specifically, the N first predicted three-dimensional face models can be projected into the corresponding single-camera face images to obtain N second predicted face two-dimensional feature points and N second predicted texture mapping information.
- based on the corresponding differences, the identity parameters of the N models are jointly adjusted, and N groups of parameters are obtained: (identity parameter X, face pose parameter 1, expression parameter 1), (identity parameter X, face pose parameter 2, expression parameter 2), ..., (identity parameter X, face pose parameter N, expression parameter N). According to the N groups of parameters, N second predicted three-dimensional face models can be determined. The face pose parameters, the expression parameters or the identity parameters are adjusted alternately and iteratively until at least one of the differences between the predicted and target face two-dimensional feature points and between the predicted and target texture mapping information, or the number of iterations, satisfies the preset requirement.
- for a subsequent single-camera face image of the same user, the identity parameter X obtained in the above joint optimization process can be used and fixed, and during optimization only the face posture parameters and the expression parameters need to be optimized, which simplifies the optimization process and improves the reconstruction efficiency of the face model.
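- The joint optimization across the N frames might be sketched as follows: one shared identity tensor and per-frame pose/expression tensors, so that a single gradient step on the summed loss updates the one identity (identity parameter X) that all N reconstructions share. render_features is again a hypothetical differentiable projection.

```python
# Joint loss over N frames of the same face (sketch).
import torch

def joint_identity_loss(alpha, poses, exprs, targets, render_features):
    """alpha: one shared identity tensor; poses/exprs: per-frame tensors;
    targets: list of (target_lmk, target_uv) pairs for the N frames."""
    loss = 0.0
    for pose, expr, (t_lmk, t_uv) in zip(poses, exprs, targets):
        lmk, uv = render_features(alpha, pose, expr)
        loss = loss + ((lmk - t_lmk) ** 2).mean() + ((uv - t_uv) ** 2).mean()
    return loss  # one step on this jointly optimizes the shared identity
```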
- the iterative adjustment method may include a gradient descent optimization algorithm (Gradient-based Optimization), a particle swarm optimization algorithm (Particle Swarm Optimization), etc., which is not limited in this application.
- the preset requirement corresponding to the difference may include that the value of the difference is less than or equal to a preset threshold, and the preset threshold may be set to a value such as 0 or 0.01.
- the preset requirement corresponding to the number of iterations may include that the number of iterations reaches a preset number of times, and the preset number may be set to, for example, 5 times or 7 times.
- the prior probability may also be used to constrain the identity parameter, the face pose parameter, and the expression parameter in a scenario where N single-camera face images are jointly optimized, so that the reconstructed three-dimensional The face model is more realistic.
- the existing single-camera capture technology still has problems such as poor accuracy and inability to capture eyeball states.
- the capture of eyeball state plays a decisive role in restoring the fidelity of facial expressions.
- a three-dimensional eyeball model can be acquired, the three-dimensional eyeball model includes eye gaze information, and then the three-dimensional face model and the three-dimensional eyeball model are combined into a new three-dimensional face model. In this way, a face model with an eyeball state can be captured, which is more realistic.
- specifically, a three-dimensional eyeball model can be obtained, wherein the three-dimensional eyeball model includes eye gaze information.
- the methods for establishing the 3D eyeball model may include but are not limited to the following: eye capture based on infrared devices, in which the user wears specified infrared glasses or installs a specific infrared device and the eyeball is reconstructed by comparing the intensity of the reflected infrared light; or eye capture based on a single camera, using the analysis-by-synthesis method, in which the final eye state is obtained by comparing the difference between the synthesized eye and the eye observed in the picture. The specific method is not limited here.
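- Purely as an illustration of combining the two models (the patent leaves the combination method open), each eyeball can be treated as a sphere anchored at an eye-socket position and oriented by a gaze yaw/pitch pair; the minimal merge below appends one pupil marker per eye rather than a full eyeball mesh.

```python
# Combine the face mesh with gaze-oriented eyeball markers (sketch).
import numpy as np

def gaze_direction(yaw, pitch):
    """Unit gaze vector from yaw/pitch angles in radians."""
    return np.array([np.sin(yaw) * np.cos(pitch),
                     np.sin(pitch),
                     np.cos(yaw) * np.cos(pitch)])

def combine(face_vertices, eye_centers, yaw, pitch, radius=0.012):
    """face_vertices: (V, 3); eye_centers: (2, 3) socket positions.
    Appends a pupil point per eye along the shared gaze ray."""
    d = gaze_direction(yaw, pitch)
    pupils = np.asarray(eye_centers) + radius * d
    return np.vstack([face_vertices, pupils])
```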
- in some application scenarios, a vivid and realistic human face image can be rendered according to the three-dimensional face model.
- for example, in a virtual live broadcast scenario, the three-dimensional face model of the actor behind the character can be rendered into an animated character to generate a vivid live broadcast of the animated character.
- the three-dimensional face model of the player can be rendered into the game character to generate a vivid game scene.
- other scenarios include animation production and movie production, which are not limited in this application.
- the three-dimensional face model reconstruction method provided by this application can be used in offline mode and real-time mode.
- the offline mode reconstructs three-dimensional face models from an offline video; it does not need to output the three-dimensional face model immediately and can be used in the post-production of animation, film and television.
- the real-time mode can run in areas that require real-time interaction with users, such as interactive games and live broadcasts. With GPU acceleration, real-time applications can run in real time (that is, after a picture is obtained, the 3D face model is output immediately, with a delay that is not easily perceived by the user).
- the three-dimensional face model reconstruction method can have offline mode and real-time mode, so that it can be more widely used.
- the three-dimensional face model reconstruction method can reconstruct a three-dimensional face model from a single-camera face image, leveraging the low cost, easy installation and user-friendliness of single-camera image acquisition. The ease of acquiring single-camera face images not only reduces the construction cost of the face information prediction model, but also makes the reconstruction of the 3D face model faster and easier.
- the two-dimensional face feature points and texture mapping information output by the face information prediction model can effectively improve the accuracy and robustness of the model reconstruction.
- the three-dimensional face model is reconstructed from identity parameters, face pose parameters and expression parameters, providing accurate and reliable technical solutions for single-camera virtual live broadcast, single-camera intelligent interaction, face recognition, criminal investigation and surveillance, movies and games, expression analysis and other technical fields.
- the present application also provides an electronic device, comprising a processor and a memory for storing instructions executable by the processor; when the processor executes the instructions, the three-dimensional face model reconstruction method described in any of the above embodiments is implemented.
- as shown in FIG. 8, the three-dimensional face model reconstruction device 800 may include:
- an acquisition module 801 configured to acquire a single-camera face image
- the information prediction module 803 is used to input the single-camera face image into the face information prediction model, and output the two-dimensional feature points of the target face in the single-camera face image through the face information prediction model and the target texture mapping information;
- the model determination module 805 is used to determine a three-dimensional face model by using the identity parameters, the face posture parameters and the expression parameters, so that the two-dimensional feature points of the face determined by the three-dimensional face model match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information.
- the acquisition module includes:
- Image acquisition sub-module used to acquire single-camera images including face images
- the face detection submodule is used for performing face detection on the single-camera image, and cropping the single-camera face image from the single-camera image.
- the face information prediction model is set to be obtained by training the following sub-modules:
- a sample acquisition sub-module used for acquiring a plurality of single-camera face sample images, wherein the single-camera face sample images are marked with face two-dimensional feature points and texture mapping information;
- a model construction submodule used for constructing a face information prediction model, wherein model parameters are set in the face information prediction model
- a prediction result generation sub-module is used to input the single-camera face sample image into the face information prediction model to generate a prediction result, and the prediction result includes the predicted face two-dimensional feature points and texture mapping information ;
- the iterative adjustment sub-module is configured to iteratively adjust the model parameters based on the differences between the prediction result and the annotated face two-dimensional feature points and texture mapping information, until the difference meets a preset requirement.
- the single-camera face sample image is set to be acquired according to the following modules:
- the image acquisition sub-module is used to acquire multiple single-camera images of the same face from different angles simultaneously by using multiple cameras;
- model reconstruction sub-module for reconstructing a three-dimensional face model of the face by using the multiple single-camera images
- a face information acquisition sub-module used to project the three-dimensional face model of the human face into the multiple single-camera images respectively, and obtain the two-dimensional feature points of the face in the multiple single-camera images respectively and texture mapping information;
- the image segmentation sub-module is used for segmenting face images from the multiple single-camera images according to the two-dimensional feature points of the face and/or the texture mapping information, and using the multiple face images as the single-camera face sample images for training the face information prediction model.
- the model determination module includes:
- an alternate adjustment sub-module used to alternately fix one or two of the identity parameters, face posture parameters and expression parameters, and adjust the other two parameters or one parameter to generate a three-dimensional face model, so that the two-dimensional feature points of the face determined by the three-dimensional face model match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information.
- the alternate adjustment sub-module includes:
- the prediction model generation unit is used to alternately fix one or two of the parameters of the identity parameter, the face posture parameter and the expression parameter, and adjust the other two or one of the parameters to generate a predicted three-dimensional face model;
- a face information acquisition unit configured to project the predicted three-dimensional face model into the single-camera face image, and obtain predicted two-dimensional feature points of the face and predicted texture mapping information
- the iterative adjustment unit is configured to iteratively adjust the other two parameters or one parameter based on the differences between the predicted two-dimensional feature points of the face and the two-dimensional feature points of the target face, and between the predicted texture mapping information and the target texture mapping information, until at least one of the difference or the number of iterations meets a preset requirement.
- the iterative adjustment unit includes:
- a priori result obtaining subunit configured to obtain a priori probability distribution result and a priori probability target value of at least one of the identity parameter, the face posture parameter and the expression parameter;
- an iterative adjustment subunit used to iteratively adjust the other two parameters or one parameter based on the differences between the predicted face two-dimensional feature points and the target face two-dimensional feature points, between the predicted texture mapping information and the target texture mapping information, and between the prior probability distribution result of at least one of the identity parameters, the face posture parameters and the expression parameters and the prior probability target value, until at least one of the difference or the number of iterations satisfies the preset requirement.
- when the number N of the single-camera face images is greater than or equal to 2 and the N single-camera face images belong to the same face, the model determination module includes:
- a multi-model determination sub-module configured to alternately fix the identity parameters or the face posture parameters and the expression parameters based on the N single-camera face images, adjust the face posture parameters, the expression parameters or the identity parameters, and jointly optimize to generate N three-dimensional face models with the same identity parameters, so that the two-dimensional feature points of the face determined by the N three-dimensional face models match the two-dimensional feature points of the target face, and the determined texture mapping information matches the target texture mapping information.
- the multi-model determination submodule includes:
- a prediction model generation unit used to alternately fix the identity parameters or the face posture parameters and the expression parameters, adjust the face posture parameters, the expression parameters or the identity parameters, and generate N predicted three-dimensional face models; wherein, when the face posture parameters and the expression parameters are fixed to adjust the identity parameters, the identity parameters of the N predicted three-dimensional face models are jointly optimized;
- a predicted face information acquisition unit configured to respectively project the N predicted three-dimensional face models into the corresponding single-camera face images to obtain predicted face two-dimensional feature points and predicted texture mapping information;
- the iterative adjustment unit is configured to iteratively adjust the face posture parameters, the expression parameters or the identity parameters based on the differences between the predicted face two-dimensional feature points and the target face two-dimensional feature points, and between the predicted texture mapping information and the target texture mapping information, until at least one of the difference or the number of iterations satisfies a preset requirement.
- the multi-model determination submodule further includes:
- the optimization and adjustment unit is used to use and fix the identity parameters of the jointly optimized N predicted three-dimensional face models in the process of reconstructing a three-dimensional face for a subsequent single-camera face image, and adjust the face pose parameters and the expression parameters, until at least one of the difference or the number of iterations satisfies a preset requirement.
- the three-dimensional face model includes a three-dimensional model composed of a preset number of interconnected polygonal meshes, and the positions of the mesh vertices of the polygonal meshes are determined by the identity parameters, the face posture parameters and the expression parameters.
- the device further includes:
- an eyeball model acquiring module used for acquiring a three-dimensional eyeball model, where the three-dimensional eyeball model includes eye gaze information;
- the model combining module is used for combining the three-dimensional face model and the three-dimensional eyeball model into a new three-dimensional face model.
- Another aspect of the present application further provides a computer-readable storage medium, on which computer instructions are stored, and when the instructions are executed, implement the steps of the method in any of the foregoing embodiments.
- the computer-readable storage medium may include a physical device for storing information, usually after digitizing the information and then storing it in a medium using electrical, magnetic or optical means.
- the computer-readable storage medium described in this embodiment may include: devices that use electrical energy to store information, such as various memories (RAM, ROM, etc.); devices that use magnetic energy to store information, such as hard disks, floppy disks, magnetic tapes, magnetic core memory, magnetic bubble memory and USB flash drives; and devices that store information optically, such as CDs or DVDs.
- there are other readable storage media such as quantum memory, graphene memory, and so on.
- a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device.
- such hardware description languages (HDLs) include ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), HDCal, JHDL, Lava, Lola, MyHDL, PALASM, RHDL, VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog.
- the controller may be implemented in any suitable manner; for example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers and embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory.
- in addition to implementing the controller in the form of pure computer-readable program code, the method steps can be logically programmed so that the controller realizes the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, this kind of controller can be regarded as a hardware component, and the devices included in it for realizing various functions can also be regarded as structures within the hardware component; or even, the means for implementing various functions can be regarded as both software modules implementing a method and structures within a hardware component.
- a typical implementation device is a computer.
- the computer can be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
- embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
- These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
- a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
- Memory may include non-persistent memory, random access memory (RAM) and/or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM).
- Computer-readable media include permanent and non-permanent, removable and non-removable media; information storage may be implemented by any method or technology.
- Information may be computer readable instructions, data structures, modules of programs, or other data.
- Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
- computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
- the embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
- the application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
- program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
- the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
- program modules may be located in both local and remote computer storage media including storage devices.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Geometry (AREA)
- Software Systems (AREA)
- Computer Graphics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Collating Specific Patterns (AREA)
- Image Processing (AREA)
Abstract
Description
Claims (15)
- A three-dimensional face model reconstruction method, characterized in that the method comprises: acquiring a single-camera face image; inputting the single-camera face image into a face information prediction model, the face information prediction model outputting target two-dimensional face feature points and target texture mapping information of the single-camera face image; and determining a three-dimensional face model using identity parameters, face pose parameters and expression parameters, such that the two-dimensional face feature points determined by the three-dimensional face model match the target two-dimensional face feature points, and the determined texture mapping information matches the target texture mapping information.
- The method according to claim 1, characterized in that the acquiring a single-camera face image comprises: acquiring a single-camera image containing a face; and performing face detection on the single-camera image and cropping the single-camera face image from the single-camera image.
- The method according to claim 1, characterized in that the face information prediction model is trained in the following manner: acquiring a plurality of single-camera face sample images annotated with two-dimensional face feature points and texture mapping information; constructing a face information prediction model in which model parameters are set; inputting the single-camera face sample images into the face information prediction model to generate prediction results, the prediction results including predicted two-dimensional face feature points and predicted texture mapping information; and iteratively adjusting the model parameters based on the differences between the prediction results and the annotated two-dimensional face feature points and texture mapping information, until the differences meet a preset requirement.
- The method according to claim 3, characterized in that the single-camera face sample images are acquired in the following manner: capturing multiple single-camera images of the same face simultaneously from different angles with multiple cameras; reconstructing a three-dimensional face model of the face from the multiple single-camera images; projecting the three-dimensional face model of the face into each of the multiple single-camera images to obtain the two-dimensional face feature points and texture mapping information in each of them; and segmenting face images from the multiple single-camera images according to the two-dimensional face feature points and/or the texture mapping information, and using the segmented face images as the single-camera face sample images for training the face information prediction model.
- The method according to claim 1, characterized in that the determining a three-dimensional face model using identity parameters, face pose parameters and expression parameters comprises: alternately fixing one or two of the identity parameters, the face pose parameters and the expression parameters while adjusting the other two parameters or the other parameter, to generate a three-dimensional face model such that the two-dimensional face feature points determined by the three-dimensional face model match the target two-dimensional face feature points, and the determined texture mapping information matches the target texture mapping information.
- The method according to claim 5, characterized in that the alternating fixing and adjusting comprises: alternately fixing one or two of the identity parameters, the face pose parameters and the expression parameters while adjusting the other two parameters or the other parameter, to generate a predicted three-dimensional face model; projecting the predicted three-dimensional face model into the single-camera face image to obtain predicted two-dimensional face feature points and predicted texture mapping information; and iteratively adjusting the other two parameters or the other parameter based on the differences between the predicted and target two-dimensional face feature points and between the predicted and target texture mapping information, until at least one of the differences or the number of iterations meets a preset requirement.
- The method according to claim 6, characterized in that the iterative adjustment comprises: obtaining a prior probability distribution result and a prior probability target value for at least one of the identity parameters, the face pose parameters and the expression parameters; and iteratively adjusting the other two parameters or the other parameter based on the differences between the predicted and target two-dimensional face feature points and between the predicted and target texture mapping information, as well as the difference between the prior probability distribution result of the at least one parameter and its prior probability target value, until at least one of the differences or the number of iterations meets a preset requirement.
- The method according to claim 1, characterized in that, when the number N of single-camera face images is greater than or equal to 2 and the N single-camera face images belong to the same face, the determining a three-dimensional face model comprises: based on the N single-camera face images, alternately fixing either the identity parameters or the face pose and expression parameters while adjusting the face pose and expression parameters or the identity parameters, and jointly optimizing to generate N three-dimensional face models sharing the same identity parameters, such that the two-dimensional face feature points determined by each of the N three-dimensional face models match the corresponding target two-dimensional face feature points, and the determined texture mapping information matches the corresponding target texture mapping information.
- The method according to claim 8, characterized in that the joint optimization comprises: alternately fixing either the identity parameters or the face pose and expression parameters while adjusting the face pose and expression parameters or the identity parameters, to generate N predicted three-dimensional face models, wherein, when the face pose and expression parameters are fixed and the identity parameters are adjusted, the identity parameters of the N predicted three-dimensional face models are jointly optimized; projecting the N predicted three-dimensional face models into the corresponding single-camera face images to obtain predicted two-dimensional face feature points and predicted texture mapping information; and iteratively adjusting the face pose parameters, the expression parameters or the identity parameters based on the differences between the predicted and target two-dimensional face feature points and between the predicted and target texture mapping information, until at least one of the differences or the number of iterations meets a preset requirement.
- The method according to claim 9, characterized in that, after the identity parameters of the N predicted three-dimensional face models are jointly optimized, the method further comprises: when reconstructing three-dimensional faces from subsequent single-camera face images, using and keeping fixed the jointly optimized identity parameters while adjusting the face pose parameters and the expression parameters, until at least one of the differences or the number of iterations meets a preset requirement.
- The method according to claim 1, characterized in that the three-dimensional face model comprises a three-dimensional model composed of a preset number of interconnected polygonal meshes, and the positions of the mesh vertices of the polygonal meshes are determined by the identity parameters, the face pose parameters and the expression parameters.
- The method according to claim 1, characterized in that the method further comprises: acquiring a three-dimensional eyeball model, the three-dimensional eyeball model including gaze information; and combining the three-dimensional face model and the three-dimensional eyeball model into a new three-dimensional face model.
- A three-dimensional face model reconstruction apparatus, characterized in that it comprises: an acquisition module for acquiring a single-camera face image; an information prediction module for inputting the single-camera face image into a face information prediction model, the face information prediction model outputting target two-dimensional face feature points and target texture mapping information of the single-camera face image; and a model determination module for determining a three-dimensional face model using identity parameters, face pose parameters and expression parameters, such that the two-dimensional face feature points determined by the three-dimensional face model match the target two-dimensional face feature points, and the determined texture mapping information matches the target texture mapping information.
- An electronic device, characterized in that it comprises a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements the three-dimensional face model reconstruction method of any one of claims 1 to 12.
- A non-transitory computer-readable storage medium, characterized in that, when instructions in the storage medium are executed by a processor, the processor is enabled to perform the three-dimensional face model reconstruction method of any one of claims 1 to 12.
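Claim 3 above describes training the face information prediction model against sample images annotated with two-dimensional feature points and texture mapping information. Below is a minimal PyTorch-style sketch of one training step under stated assumptions: the tiny two-head network `FaceInfoPredictor`, the 68-landmark count, the per-point UV representation of the texture mapping information, and the equal weighting of the two mean-squared-error terms are all hypothetical choices for illustration, not the patent's architecture.

```python
import torch
import torch.nn as nn

class FaceInfoPredictor(nn.Module):
    """Hypothetical two-head network: 2D landmarks + per-point UV coordinates."""
    def __init__(self, n_landmarks=68):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.landmark_head = nn.Linear(64, n_landmarks * 2)  # (x, y) per point
        self.uv_head = nn.Linear(64, n_landmarks * 2)        # (u, v) per point

    def forward(self, img):
        feat = self.backbone(img)
        return self.landmark_head(feat), self.uv_head(feat)

def train_step(model, optimizer, img, gt_landmarks, gt_uv):
    """One iteration: adjust model parameters to shrink prediction differences."""
    pred_lm, pred_uv = model(img)
    loss = (nn.functional.mse_loss(pred_lm, gt_landmarks)
            + nn.functional.mse_loss(pred_uv, gt_uv))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```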
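Claims 5 to 7 describe alternately fixing one or two of the identity, face pose and expression parameter groups, adjusting the rest against the feature-point and texture-mapping differences, and optionally adding a prior-probability term. The sketch below uses one simple round-robin schedule (fix two groups, adjust one) with a Gaussian-style penalty toward assumed prior means; the differentiable `render_landmarks` and `render_uv` projection functions, the Adam optimizer and all weights are assumptions introduced for illustration.

```python
import torch

def fit_parameters(render_landmarks, render_uv, target_lm, target_uv,
                   id_p, pose_p, expr_p, prior_means, prior_weight=0.1,
                   rounds=9, steps=50, lr=1e-2):
    """Alternating fit: fix two parameter groups, adjust the third, and rotate.

    render_landmarks / render_uv: assumed differentiable functions projecting
    the parameterized 3D face model into the image plane.
    prior_means: one assumed prior-mean tensor per parameter group.
    All parameter tensors are assumed created with requires_grad=True.
    """
    groups = [id_p, pose_p, expr_p]
    for r in range(rounds):
        free = groups[r % 3]  # this round's adjustable group; the others stay fixed
        opt = torch.optim.Adam([free], lr=lr)
        for _ in range(steps):
            lm = render_landmarks(id_p, pose_p, expr_p)
            uv = render_uv(id_p, pose_p, expr_p)
            loss = (
                ((lm - target_lm) ** 2).mean()             # feature-point difference
                + ((uv - target_uv) ** 2).mean()           # texture-mapping difference
                + prior_weight * ((free - prior_means[r % 3]) ** 2).mean()  # prior term
            )
            for g in groups:      # clear stale gradients on every group
                g.grad = None
            loss.backward()
            opt.step()            # only `free` is updated; the rest remain fixed
    return id_p, pose_p, expr_p
```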
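Claims 8 to 10 describe joint optimization over N single-camera images of the same face, so that all N reconstructed models share a single set of identity parameters that can then be frozen for subsequent images. The sketch below shows only the shared-identity joint step, with the texture-mapping term and the alternating schedule of claim 9 omitted for brevity; `render_landmarks` and the optimizer settings are again assumptions.

```python
import torch

def joint_fit(render_landmarks, targets, id_p, pose_list, expr_list,
              steps=200, lr=1e-2):
    """Jointly optimize one shared identity across N images of the same face.

    targets: list of N target 2D feature-point tensors, one per image.
    id_p is shared by all N models; pose_list / expr_list are per-image.
    All parameter tensors are assumed created with requires_grad=True.
    """
    opt = torch.optim.Adam([id_p] + list(pose_list) + list(expr_list), lr=lr)
    for _ in range(steps):
        # Sum per-image landmark differences; id_p receives gradients from
        # every image, which is the joint optimization of the claim.
        loss = sum(
            ((render_landmarks(id_p, pose_list[i], expr_list[i]) - targets[i]) ** 2).mean()
            for i in range(len(targets))
        )
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Per claim 10: freeze the jointly optimized identity for later images.
    id_p.requires_grad_(False)
    return id_p, pose_list, expr_list
```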
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110082037.3A CN112884881B (zh) | 2021-01-21 | 2021-01-21 | 三维人脸模型重建方法、装置、电子设备及存储介质 |
CN202110082037.3 | 2021-01-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022156532A1 true WO2022156532A1 (zh) | 2022-07-28 |
Family
ID=76051743
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/070257 WO2022156532A1 (zh) | 2021-01-21 | 2022-01-05 | 三维人脸模型重建方法、装置、电子设备及存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112884881B (zh) |
WO (1) | WO2022156532A1 (zh) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112884881B (zh) * | 2021-01-21 | 2022-09-27 | 魔珐(上海)信息科技有限公司 | 三维人脸模型重建方法、装置、电子设备及存储介质 |
CN113256799A (zh) * | 2021-06-07 | 2021-08-13 | 广州虎牙科技有限公司 | 一种三维人脸模型训练方法和装置 |
CN113327278B (zh) * | 2021-06-17 | 2024-01-09 | 北京百度网讯科技有限公司 | 三维人脸重建方法、装置、设备以及存储介质 |
CN113469091B (zh) * | 2021-07-09 | 2022-03-25 | 北京的卢深视科技有限公司 | 人脸识别方法、训练方法、电子设备及存储介质 |
CN113762147B (zh) * | 2021-09-06 | 2023-07-04 | 网易(杭州)网络有限公司 | 人脸表情迁移方法、装置、电子设备及存储介质 |
CN114339190B (zh) * | 2021-12-29 | 2023-06-23 | 中国电信股份有限公司 | 通讯方法、装置、设备及存储介质 |
CN114783022B (zh) * | 2022-04-08 | 2023-07-21 | 马上消费金融股份有限公司 | 一种信息处理方法、装置、计算机设备及存储介质 |
CN114898244B (zh) * | 2022-04-08 | 2023-07-21 | 马上消费金融股份有限公司 | 一种信息处理方法、装置、计算机设备及存储介质 |
CN114782864B (zh) * | 2022-04-08 | 2023-07-21 | 马上消费金融股份有限公司 | 一种信息处理方法、装置、计算机设备及存储介质 |
CN117689801B (zh) * | 2022-09-02 | 2024-07-30 | 影眸科技(上海)有限公司 | 人脸模型的建立方法、装置及电子设备 |
CN115393532B (zh) * | 2022-10-27 | 2023-03-14 | 科大讯飞股份有限公司 | 脸部绑定方法、装置、设备及存储介质 |
CN116503524B (zh) * | 2023-04-11 | 2024-04-12 | 广州赛灵力科技有限公司 | 一种虚拟形象的生成方法、系统、装置及存储介质 |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104978764B (zh) * | 2014-04-10 | 2017-11-17 | 华为技术有限公司 | 三维人脸网格模型处理方法和设备 |
CN105096377B (zh) * | 2014-05-14 | 2019-03-19 | 华为技术有限公司 | 一种图像处理方法和装置 |
CN108596827B (zh) * | 2018-04-18 | 2022-06-17 | 太平洋未来科技(深圳)有限公司 | 三维人脸模型生成方法、装置及电子设备 |
CN109191507B (zh) * | 2018-08-24 | 2019-11-05 | 北京字节跳动网络技术有限公司 | 三维人脸图像重建方法、装置和计算机可读存储介质 |
WO2020037676A1 (zh) * | 2018-08-24 | 2020-02-27 | 太平洋未来科技(深圳)有限公司 | 三维人脸图像生成方法、装置及电子设备 |
CN109377544B (zh) * | 2018-11-30 | 2022-12-23 | 腾讯科技(深圳)有限公司 | 一种人脸三维图像生成方法、装置和可读介质 |
CN110428491B (zh) * | 2019-06-24 | 2021-05-04 | 北京大学 | 基于单帧图像的三维人脸重建方法、装置、设备及介质 |
- 2021-01-21: CN application CN202110082037.3A, granted as patent CN112884881B (status: Active)
- 2022-01-05: WO application PCT/CN2022/070257, filed as WO2022156532A1 (status: Application Filing)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104966316A (zh) * | 2015-05-22 | 2015-10-07 | 腾讯科技(深圳)有限公司 | 一种3d人脸重建方法、装置及服务器 |
CN110956691A (zh) * | 2019-11-21 | 2020-04-03 | Oppo广东移动通信有限公司 | 一种三维人脸重建方法、装置、设备及存储介质 |
CN111274944A (zh) * | 2020-01-19 | 2020-06-12 | 中北大学 | 一种基于单张图像的三维人脸重建方法 |
CN112819944A (zh) * | 2021-01-21 | 2021-05-18 | 魔珐(上海)信息科技有限公司 | 三维人体模型重建方法、装置、电子设备及存储介质 |
CN112884881A (zh) * | 2021-01-21 | 2021-06-01 | 魔珐(上海)信息科技有限公司 | 三维人脸模型重建方法、装置、电子设备及存储介质 |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117593447A (zh) * | 2023-04-25 | 2024-02-23 | 上海任意门科技有限公司 | 基于2d关键点的三维人脸构建方法、系统、装置及介质 |
CN117315154A (zh) * | 2023-10-12 | 2023-12-29 | 北京汇畅数宇科技发展有限公司 | 一种可量化的人脸模型重建方法及系统 |
CN117409466A (zh) * | 2023-11-02 | 2024-01-16 | 之江实验室 | 一种基于多标签控制的三维动态表情生成方法及装置 |
Also Published As
Publication number | Publication date |
---|---|
CN112884881A (zh) | 2021-06-01 |
CN112884881B (zh) | 2022-09-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022156532A1 (zh) | 三维人脸模型重建方法、装置、电子设备及存储介质 | |
WO2022156533A1 (zh) | 三维人体模型重建方法、装置、电子设备及存储介质 | |
Kartynnik et al. | Real-time facial surface geometry from monocular video on mobile GPUs | |
US11238606B2 (en) | Method and system for performing simultaneous localization and mapping using convolutional image transformation | |
US11756223B2 (en) | Depth-aware photo editing | |
WO2022156640A1 (zh) | 一种图像的视线矫正方法、装置、电子设备、计算机可读存储介质及计算机程序产品 | |
JP2023545199A (ja) | モデル訓練方法、人体姿勢検出方法、装置、デバイスおよび記憶媒体 | |
JP2022503647A (ja) | クロスドメイン画像変換 | |
US20240046557A1 (en) | Method, device, and non-transitory computer-readable storage medium for reconstructing a three-dimensional model | |
WO2014117446A1 (zh) | 基于单个视频摄像机的实时人脸动画方法 | |
CN113762461B (zh) | 使用可逆增强算子采用有限数据训练神经网络 | |
WO2023071790A1 (zh) | 目标对象的姿态检测方法、装置、设备及存储介质 | |
WO2022148248A1 (zh) | 图像处理模型的训练方法、图像处理方法、装置、电子设备及计算机程序产品 | |
Wang et al. | Instance shadow detection with a single-stage detector | |
Baudron et al. | E3d: event-based 3d shape reconstruction | |
US20240161391A1 (en) | Relightable neural radiance field model | |
US10783704B2 (en) | Dense reconstruction for narrow baseline motion observations | |
CN116977547A (zh) | 一种三维人脸重建方法、装置、电子设备和存储介质 | |
Jiang et al. | Complementing event streams and rgb frames for hand mesh reconstruction | |
Zhang et al. | 3D Gesture Estimation from RGB Images Based on DB-InterNet | |
Shamalik et al. | Effective and efficient approach for gesture detection in video through monocular RGB frames | |
WO2017173977A1 (zh) | 一种移动终端目标跟踪方法、装置和移动终端 | |
WO2023172353A1 (en) | Probabilistic keypoint regression with uncertainty | |
Ru et al. | Action recognition based on binocular vision. | |
Holynski | Augmenting Visual Memories |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22742011 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 22742011 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11.12.2023) |
|