CN113870420A - Three-dimensional face model reconstruction method and device, storage medium and computer equipment - Google Patents


Info

Publication number: CN113870420A
Authority: CN (China)
Legal status: Pending
Application number: CN202111184666.3A
Other languages: Chinese (zh)
Inventor: 王顺飞
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Prior art keywords: dimensional, face, model, dimensional image, cost function
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202111184666.3A
Publication of CN113870420A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods

Abstract

The application discloses a three-dimensional face model reconstruction method and device, a storage medium, and computer equipment. The method comprises: inputting a target two-dimensional image into a trained face reconstruction model and outputting the corresponding three-dimensional face model. The face reconstruction model is generated by training with a cost function comprising a first cost function, a second cost function, and a third cost function: the first cost function is obtained based on the predicted three-dimensional face model corresponding to a sample two-dimensional image and the standard three-dimensional face model corresponding to that sample image; the second cost function is obtained based on the predicted three-dimensional face model and the smoothed three-dimensional face model produced by smoothing it; and the third cost function is obtained based on the face key points produced by two-dimensionally projecting the predicted three-dimensional face model and the face key points in the sample two-dimensional image. With the embodiments of the application, the three-dimensional face model reconstructed by the pre-trained face reconstruction model exhibits a pronounced smoothing effect.

Description

Three-dimensional face model reconstruction method and device, storage medium and computer equipment
Technical Field
The present application relates to the field of computer device application technologies, and in particular to a three-dimensional face model reconstruction method and apparatus, a storage medium, and a computer device.
Background
Three-dimensional face model reconstruction refers to reconstructing a three-dimensional model of a face from one or more two-dimensional face images. Compared with a two-dimensional face image, the reconstructed model carries one additional dimension, and the technique is widely applied in fields such as film and games. Three-dimensional face reconstruction can restore the three-dimensional shape of the face in a two-dimensional face image, has high application value in fields such as animation production and online games, and has broad application prospects.
Disclosure of Invention
The embodiment of the application provides a three-dimensional face model reconstruction method, a three-dimensional face model reconstruction device, a storage medium and computer equipment. The technical scheme is as follows:
in a first aspect, an embodiment of the present application provides a three-dimensional face model reconstruction method, where the method includes:
acquiring a target two-dimensional image;
inputting the target two-dimensional image into a trained face reconstruction model, and outputting a three-dimensional face model corresponding to the target two-dimensional image;
the face reconstruction model is generated based on cost function training, the cost function comprises a first cost function, a second cost function and a third cost function, the first cost function is obtained based on a predicted three-dimensional face model corresponding to a sample two-dimensional image and a standard three-dimensional face model corresponding to the sample two-dimensional image, the second cost function is obtained based on a smooth three-dimensional face model obtained after smoothing processing is carried out on the predicted three-dimensional face model and the predicted three-dimensional face model, and the third cost function is obtained based on a face key point obtained after two-dimensional projection is carried out on the predicted three-dimensional face model and a face key point in the sample two-dimensional image;
the prediction three-dimensional face model is obtained by predicting the sample two-dimensional image based on the created initial face reconstruction model, and the standard three-dimensional face model is obtained by performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image.
In a second aspect, an embodiment of the present application provides a three-dimensional face model reconstruction apparatus, where the three-dimensional face model reconstruction apparatus includes:
the image acquisition module is used for acquiring a target two-dimensional image;
the model prediction module is used for inputting the target two-dimensional image into a trained human face reconstruction model and outputting a three-dimensional human face model corresponding to the target two-dimensional image;
the face reconstruction model is generated based on cost function training, the cost function comprises a first cost function, a second cost function and a third cost function, the first cost function is obtained based on a predicted three-dimensional face model corresponding to a sample two-dimensional image and a standard three-dimensional face model corresponding to the sample two-dimensional image, the second cost function is obtained based on a smooth three-dimensional face model obtained after smoothing processing is carried out on the predicted three-dimensional face model and the predicted three-dimensional face model, and the third cost function is obtained based on a face key point obtained after two-dimensional projection is carried out on the predicted three-dimensional face model and a face key point in the sample two-dimensional image;
the prediction three-dimensional face model is obtained by predicting the sample two-dimensional image based on the created initial face reconstruction model, and the standard three-dimensional face model is obtained by performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image.
In a third aspect, embodiments of the present application provide a storage medium having at least one instruction stored thereon, where the at least one instruction is adapted to be loaded by a processor and to perform the above-mentioned method steps.
In a fourth aspect, an embodiment of the present application provides a computer device, which may include: a processor and a memory; wherein the memory stores at least one instruction adapted to be loaded by the processor and to perform the above-mentioned method steps.
The beneficial effects brought by the technical scheme provided by some embodiments of the application at least comprise:
with the three-dimensional face model reconstruction method provided by the embodiment of the application, the acquired target two-dimensional image is input into a face reconstruction model generated based on cost function training, and the three-dimensional face model corresponding to the target two-dimensional image is output. The face reconstruction model is generated using a first cost function constructed from the predicted three-dimensional face model corresponding to a sample two-dimensional image and the standard three-dimensional face model corresponding to that sample image, a second cost function constructed from the predicted three-dimensional face model and the smoothed three-dimensional face model produced by smoothing it, and a third cost function constructed from the face key points produced by two-dimensionally projecting the predicted three-dimensional face model and the face key points in the sample two-dimensional image. Because smoothing is incorporated into the training process, the three-dimensional face model reconstructed with the face reconstruction model exhibits a pronounced smoothing effect, and a good smoothing effect is achieved without extra smoothing computation when the face reconstruction model is applied.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a three-dimensional face model reconstruction method according to an embodiment of the present application;
fig. 2 is an exemplary schematic diagram of three-dimensional face reconstruction provided in an embodiment of the present application;
fig. 3 is a schematic flow chart of a three-dimensional face model reconstruction method according to an embodiment of the present application;
fig. 4 is a schematic flow chart of a three-dimensional face model reconstruction method according to an embodiment of the present application;
fig. 5 is a flowchart of reconstructing a standard three-dimensional face model according to an embodiment of the present application;
fig. 6 is a flowchart of training a face reconstruction model according to an embodiment of the present application;
fig. 7 is a schematic flow chart of a three-dimensional face model reconstruction method according to an embodiment of the present application;
fig. 8 is a schematic flow chart of a three-dimensional face model reconstruction method according to an embodiment of the present application;
FIG. 9 is a schematic diagram illustrating an example stylization process provided by an embodiment of the present application;
fig. 10 is an exemplary schematic diagram of reconstructing a three-dimensional face model by using face key point constraint according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a three-dimensional human face model reconstruction apparatus according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a three-dimensional human face model reconstruction apparatus according to an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a model training module according to an embodiment of the present disclosure;
fig. 14 is a schematic structural diagram of a cost function constructing unit according to an embodiment of the present application;
fig. 15 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description of the present application, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. It is further noted that, unless explicitly stated or limited otherwise, "including" and "having" and any variations thereof are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may also include other steps or elements not listed or inherent to such process, method, article, or apparatus. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific case. Further, in the description of the present application, "a plurality" means two or more unless otherwise specified. "And/or" describes the association relationship of the associated objects and means that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
Before the embodiments of the present application are described in detail, some concepts are first explained for better understanding.
Cost function: in training a model (for example, a logistic regression model), a cost function measuring the prediction error is required, and the model parameters are obtained by minimizing that cost function during training.
Stylizing: converting a two-dimensional image from one style domain to another, such as converting a realistic two-dimensional image into a cartoon two-dimensional image.
Face key points: important feature points of the face, such as the eyes, nose tip, mouth corners, eyebrows, and the contour points of each facial part.
Most conventional three-dimensional face reconstruction methods are based on image information, for example three-dimensional face reconstruction modeled from one or more cues such as image brightness, edge information, linear perspective, color, relative height, and parallax. Model-based three-dimensional face reconstruction is currently the most popular approach; widely used models include the generic face model CANDIDE-3, the three-dimensional morphable model (3DMM), and their variants, and the reconstruction algorithms built on these models include both traditional algorithms and deep learning algorithms.
In the prior art, whether three-dimensional face reconstruction is based on a traditional algorithm or on a deep learning algorithm, smoothing is mostly applied as post-processing: the reconstructed three-dimensional face model is smoothed only after reconstruction is complete. Such post-hoc smoothing lacks image prior feature information and cannot achieve a good smoothing effect on the face model, so some regions with serious flaws remain.
The application provides a three-dimensional face model reconstruction method based on a deep learning algorithm, a face reconstruction model based on deep learning is trained, three-dimensional face reconstruction is carried out on a two-dimensional image based on the face reconstruction model, and smoothness training is added in the process of training the face reconstruction model based on deep learning, so that the trained face reconstruction model has a good smoothing effect when the three-dimensional face model is reconstructed.
The following is a detailed description with reference to specific embodiments. The embodiments described below do not represent all embodiments consistent with the present application; rather, they are merely examples of apparatus and methods consistent with certain aspects of the application, as detailed in the appended claims. The flow diagrams depicted in the figures are merely exemplary and need not be performed in the order of the steps shown; for example, some steps are logically parallel with no strict ordering between them, so the actual execution order may vary.
Fig. 1 is a schematic flow chart of a three-dimensional face model reconstruction method according to an embodiment of the present disclosure. As shown in fig. 1, the three-dimensional face model reconstruction method may include the following steps S101 to S103.
S101, acquiring a target two-dimensional image;
the target two-dimensional image is a two-dimensional face image requiring three-dimensional face reconstruction, and may be obtained by offline shooting, downloading from the cloud, retrieving from a local gallery, or the like.
And S102, inputting the target two-dimensional image into a trained human face reconstruction model, and outputting a three-dimensional human face model corresponding to the target two-dimensional image.
Specifically, the target two-dimensional image is input into the trained face reconstruction model, and the face reconstruction model reconstructs the three-dimensional face model corresponding to the target two-dimensional image based on the face image information in the target two-dimensional image.
Optionally, before the target two-dimensional image is input into the trained face reconstruction model, it is preprocessed. The preprocessing may include: if the face in the target two-dimensional image is not upright, rotating the image so that the face is upright; cropping the image to cut away the background area outside the face area, reducing the amount of data the face reconstruction model must process; and normalizing the gray value of each pixel in the cropped image to facilitate computation by the face reconstruction model.
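The crop-and-normalize preprocessing above can be sketched as follows; the (top, bottom, left, right) box format and the [0, 1] normalization range are assumptions, since the passage does not fix them:

```python
import numpy as np

def preprocess(image, face_box):
    """Crop away the background and normalize gray values.

    image: H x W grayscale array; face_box: (top, bottom, left, right).
    Both the box format and the [0, 1] range are assumed details.
    """
    top, bottom, left, right = face_box
    face = image[top:bottom, left:right].astype(np.float64)
    # Min-max normalization of the gray value of each pixel.
    span = face.max() - face.min()
    face = (face - face.min()) / (span if span > 0 else 1.0)
    return face

img = np.arange(100, dtype=np.float64).reshape(10, 10)
face = preprocess(img, (2, 8, 2, 8))
print(face.shape)              # (6, 6)
print(face.min(), face.max())  # 0.0 1.0
```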
Optionally, the face reconstruction model may include a backbone based on a Convolutional Neural Network (CNN) and a Multi-Layer Perceptron (MLP) model. The backbone is a CNN-based encoder network used to extract image features from the target two-dimensional image, and may adopt, but is not limited to, a backbone network such as MobileNet, ResNet, or Xception. Inputting the target two-dimensional image into the trained face reconstruction model and outputting the corresponding three-dimensional face model then includes: inputting the target two-dimensional image into the CNN-based backbone network, which extracts image features; inputting the extracted image features into the MLP model, which outputs regression parameters; transforming the regression parameters into a deformation parameter matrix; and obtaining the three-dimensional face model corresponding to the target two-dimensional image based on the deformation parameter matrix.
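The backbone-plus-MLP inference flow can be illustrated with stand-in components; the random projections below are mere placeholders for a real CNN encoder and MLP head, and the feature and parameter dimensions are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def backbone(image, feat_dim=64):
    """Stand-in for the CNN encoder ("backbone"): maps an image to a
    feature vector. A fixed random projection is used purely as a
    placeholder for a real network such as MobileNet or ResNet."""
    flat = image.reshape(-1)
    W = rng.standard_normal((feat_dim, flat.size)) * 0.01
    return np.tanh(W @ flat)

def mlp_head(features, n_params=62):
    """Stand-in for the MLP that regresses the deformation parameters
    from the encoder features (n_params = 62 is an arbitrary choice)."""
    W = rng.standard_normal((n_params, features.size)) * 0.1
    return W @ features

image = rng.standard_normal((16, 16))
params = mlp_head(backbone(image))
print(params.shape)  # (62,)
```

The regression parameters would then be transformed into the deformation parameter matrix from which the three-dimensional face model is built.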
Fig. 2 is a schematic diagram illustrating an example of three-dimensional face reconstruction according to an embodiment of the present disclosure.
As shown in fig. 2, the three-dimensional face model as shown in the figure can be obtained by inputting the target two-dimensional image into the face reconstruction model. The face reconstruction model shown includes a Convolutional Neural Network (CNN) based backbone and a Multi-layer Perceptron (MLP) model.
It is understood that the face reconstruction model is obtained through a nontrivial training process. In the embodiment of the present application, the face reconstruction model is generated based on cost function training, where the cost function comprises a first cost function, a second cost function, and a third cost function. The first cost function is obtained based on the predicted three-dimensional face model corresponding to a sample two-dimensional image and the standard three-dimensional face model corresponding to that sample image; the second cost function is obtained based on the predicted three-dimensional face model and the smoothed three-dimensional face model produced by smoothing it; and the third cost function is obtained based on the face key points produced by two-dimensionally projecting the predicted three-dimensional face model and the face key points in the sample two-dimensional image. The predicted three-dimensional face model is obtained by predicting the sample two-dimensional image with the created initial face reconstruction model, and the standard three-dimensional face model is obtained by performing constrained reconstruction of the average face three-dimensional model based on the face key points in the sample two-dimensional image.
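One hedged reading of how the three cost terms might be combined during training is a weighted sum of squared errors; the squared-error form of each term and the weights w1..w3 are assumptions, since the patent names the three cost functions but not their combination:

```python
import numpy as np

def total_cost(pred_v, std_v, smooth_v, proj_kp, sample_kp,
               w1=1.0, w2=1.0, w3=1.0):
    """Hypothetical combination of the three cost terms.

    pred_v / std_v / smooth_v: N x 3 vertex arrays of the predicted,
    standard, and smoothed models; proj_kp / sample_kp: K x 2 key points.
    """
    first = np.sum((pred_v - std_v) ** 2)        # predicted vs. standard model
    second = np.sum((pred_v - smooth_v) ** 2)    # predicted vs. smoothed model
    third = np.sum((proj_kp - sample_kp) ** 2)   # projected vs. sample key points
    return w1 * first + w2 * second + w3 * third

pred = np.zeros((4, 3)); std = np.ones((4, 3))
smooth = np.zeros((4, 3))
kp_a = np.zeros((5, 2)); kp_b = np.ones((5, 2))
print(total_cost(pred, std, smooth, kp_a, kp_b))  # 12.0 + 0.0 + 10.0 = 22.0
```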
By adopting the three-dimensional face model reconstruction method provided by the embodiment of the application, the acquired target two-dimensional image is input into a face reconstruction model generated based on cost function training, and then a three-dimensional face model corresponding to the target two-dimensional image is output, wherein the face reconstruction model is generated by using a first cost function constructed based on a predicted three-dimensional face model corresponding to a sample two-dimensional image and a standard three-dimensional face model corresponding to the sample two-dimensional image, a second cost function constructed based on a smooth three-dimensional face model after smoothing processing of the predicted three-dimensional face model and the predicted three-dimensional face model, and a third cost function constructed based on face key points after two-dimensional projection of the predicted three-dimensional face model and face key points in the sample two-dimensional image. Because the face reconstruction model is subjected to smoothing processing in the training process, the three-dimensional face model reconstructed by using the face reconstruction model has a remarkable smoothing effect, and a better smoothing effect can be achieved without adding extra smoothing calculation amount in the application process of the face reconstruction model.
Fig. 3 is a schematic flow chart of a three-dimensional face model reconstruction method according to an embodiment of the present disclosure. As shown in fig. 3, the three-dimensional face model reconstruction method may include the following steps.
S201, acquiring a sample two-dimensional image;
the sample two-dimensional image comprises a face image, and the sample two-dimensional image is used as training data to assist in training a three-dimensional face model.
Optionally, the face image area in the sample two-dimensional image should exceed a certain proportion of the total area of the sample two-dimensional image, and the proportion may be 75%.
Optionally, if the face image area in the sample two-dimensional image does not exceed the preset proportion of the total area of the sample two-dimensional image, the sample two-dimensional image is cut to obtain a new sample two-dimensional image, so that the face image area in the new sample two-dimensional image exceeds the preset proportion of the total area of the sample two-dimensional image.
Optionally, if the face image in the sample two-dimensional image is tilted, the sample two-dimensional image is rotated to obtain a new sample two-dimensional image in which the face is upright and front-facing.
The sample two-dimensional image can be acquired by offline shooting.
S202, extracting key points of the human face in the sample two-dimensional image, and acquiring an average face three-dimensional model;
specifically, the face key point detection is performed on the sample two-dimensional image to obtain the face key points in the sample two-dimensional image, and an average face three-dimensional model is obtained.
The face key points refer to some important feature points such as eyes, nose tip, mouth corner points, eyebrows, contour points of each part of the face and the like.
The face key point detection refers to positioning key area positions of a face and extracting key points of the key area positions, and the detection method of the face key points includes but is not limited to an Active Shape Model (ASM) algorithm, an Active Appearance Model (AAM) algorithm, a deep learning algorithm and the like.
The average face three-dimensional model is obtained by capturing the faces of real persons with a depth camera to acquire three-dimensional point cloud data of each face and generating a face three-dimensional model from each point cloud. Multiple real persons must be captured and multiple face three-dimensional models generated; these models are then averaged to obtain the average face three-dimensional model.
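The vertex-wise averaging described above can be sketched as follows, assuming the individual face models are in dense correspondence (the same vertex ordering across subjects), a prerequisite the passage implies but does not state:

```python
import numpy as np

# Each scan is an N x 3 vertex array; the tiny arrays below stand in
# for real depth-camera reconstructions of different subjects.
scans = [np.zeros((4, 3)), np.ones((4, 3)), 2.0 * np.ones((4, 3))]

# Average the models vertex by vertex to get the average face model.
mean_face = np.mean(np.stack(scans), axis=0)
print(mean_face[0])  # [1. 1. 1.]
```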
S203, performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image to obtain a standard three-dimensional face model corresponding to the sample two-dimensional image;
specifically, an initial deformation matrix and initial projection parameters are obtained. The average face three-dimensional model is deformed based on the deformation matrix to obtain an initial three-dimensional face model, which is projected into two-dimensional space based on the projection parameters to obtain a two-dimensional face image. Face key points are extracted from this two-dimensional face image, and each is compared with the corresponding face key point in the sample two-dimensional image to obtain a difference value. The deformation matrix and the projection parameters are updated based on the difference value, and the above process is executed iteratively until the difference between the face key points of the two-dimensional face image and those of the sample two-dimensional image is smaller than a preset threshold. When iteration stops, the deformation matrix at that moment is the desired deformation matrix, and constrained reconstruction of the average face three-dimensional model based on this deformation matrix yields the standard three-dimensional face model corresponding to the sample two-dimensional image.
Optionally, after performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image to obtain a standard three-dimensional face model, performing smoothing processing on the standard three-dimensional face model to obtain a new standard three-dimensional face model, so that the newly obtained standard three-dimensional face model has a better smoothing effect. In this way, in step S207, based on the first cost function constructed by each vertex in the predicted three-dimensional face model and each vertex in the standard three-dimensional face model, the role of the first cost function in the training process is not limited to performing vertex correction on the face reconstruction model, and the smoothing effect of the face reconstruction model can also be enhanced.
Specifically, the smoothing of the standard three-dimensional face model to obtain a new standard three-dimensional face model may be expressed as:

V'_i = (1 - α) · V_i + (α / |N(i)|) · Σ_{j ∈ N(i)} V_j

where V_i is the i-th three-dimensional vertex in the standard three-dimensional face model before smoothing, N(i) is the set of first-order neighborhood points directly adjacent to V_i, i denotes the i-th three-dimensional vertex, j denotes the index of a first-order neighborhood point, V'_i is the i-th three-dimensional vertex in the standard three-dimensional face model after smoothing, and α ∈ (0, 1) is the smoothing coefficient.
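A minimal sketch of one pass of this first-order-neighborhood smoothing, assuming a uniform average over the directly adjacent vertices:

```python
import numpy as np

def smooth_vertices(vertices, neighbors, alpha=0.5):
    """One smoothing pass: each listed vertex is moved toward the mean
    of its directly adjacent (first-order neighborhood) vertices by a
    smoothing coefficient alpha in (0, 1)."""
    out = vertices.copy()
    for i, nbrs in neighbors.items():
        mean_nbr = vertices[list(nbrs)].mean(axis=0)
        out[i] = (1.0 - alpha) * vertices[i] + alpha * mean_nbr
    return out

verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
# Vertex 1 is adjacent to vertices 0 and 2.
smoothed = smooth_vertices(verts, {1: [0, 2]}, alpha=0.5)
print(smoothed[1])  # [1.5 0.  0. ]
```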
Optionally, the constrained reconstruction of the average face three-dimensional model based on the face key points in the sample two-dimensional image, to obtain the standard three-dimensional face model corresponding to the sample two-dimensional image, may be carried out by minimizing an energy function, as follows.

First, an initial deformation matrix and initial projection parameters are created. The following steps are then iterated.

Equation 1: min_{R, t, P', w, Π} E_def(P', w) + λ · E_lan(Π, R, t, P')

where w is the deformation matrix, Π is the scale parameter among the projection parameters, R is the rotation parameter among the projection parameters, t is the translation parameter among the projection parameters, and P' denotes the coordinates of the points in the standard three-dimensional face model.

Convergence is judged by the change of E_def between the current iteration and the previous one, i.e., whether |E_def^j - E_def^(j-1)| is smaller than a set threshold; if the condition is not met, the following steps continue to be executed iteratively.

Equation 2: E_def = ||P' - w · P||²

where P denotes the coordinates of the points in the average face three-dimensional model. With the P' computed by Equation 1 held fixed, the deformation matrix w is solved by the least squares method.

Equation 3: E_lan = Σ_i ||q_i - (Π · R · p'_i + t)||²

where q_i is the coordinate of the i-th key point in the sample two-dimensional image and p'_i is the point in the standard three-dimensional face model corresponding to q_i. Substituting the P' computed by Equation 1 into Equation 3, the projection parameters Π, R, and t are updated.
It will be appreciated that equation 2 is essentially an update of the deformation matrix during each iteration, and equation 3 is an update of the projection parameters during each iteration.
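With P' held fixed, the deformation-matrix update of Equation 2 is an ordinary linear least-squares problem; a minimal sketch, assuming 3 x N point matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((3, 8))       # points of the average face model (3 x N)
w_true = np.diag([1.1, 0.9, 1.05])    # a known deformation, used only to check
P_prime = w_true @ P                  # plays the role of the fixed P'

# min_w ||P' - w @ P||^2: transpose so each row is one point, then
# lstsq solves P^T w^T = P'^T for w^T.
w_est, *_ = np.linalg.lstsq(P.T, P_prime.T, rcond=None)
w_est = w_est.T
print(np.allclose(w_est, w_true))  # True
```

In the full iteration this solve alternates with the Equation 3 update of the projection parameters until E_def stops changing.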
Optionally, the smoothing processing performed on the standard three-dimensional face model may also be implemented based on curvature, the Taubin smoothing algorithm, and the like; the embodiment of the present application does not limit the specific implementation manner of the smoothing processing.
S204, creating an initial face reconstruction model, and inputting the sample two-dimensional image into the initial face reconstruction model for prediction to obtain a predicted three-dimensional face model;
the initial face reconstruction model is an untrained and rough deep learning model.
Specifically, an initial face reconstruction model is created, the sample two-dimensional image is input into the initial face reconstruction model, and a predicted three-dimensional face model is obtained through prediction by the initial face reconstruction model.
It can be understood that, since the initial face reconstruction model is an untrained and coarse deep learning model, the predicted three-dimensional face model obtained by prediction has a high distortion degree and is poor in smoothness.
S205, smoothing the predicted three-dimensional face model to obtain a smooth three-dimensional face model;
Specifically, refer to the implementation of the smoothing processing performed on the standard three-dimensional face model in step S203; the same smoothing is applied to the predicted three-dimensional face model to obtain the smooth three-dimensional face model.
S206, performing two-dimensional projection on the predicted three-dimensional face model to obtain a predicted two-dimensional image, and extracting face key points in the predicted two-dimensional image;
specifically, projection parameters in the initial face reconstruction model are obtained, the predicted three-dimensional face model is subjected to two-dimensional projection based on the projection parameters to obtain a predicted two-dimensional image, and face key points in the predicted two-dimensional image are extracted.
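The projection step can be illustrated with a weak-perspective model. The weak-perspective form and the names below are assumptions for illustration; the patent only states that projection parameters (scale Π, rotation R, translation t) are used.

```python
import numpy as np

def project_to_2d(V, scale, R, t):
    # Weak-perspective projection: rotate the (N, 3) vertices, keep the
    # first two coordinates, then apply the scale and the 2-D translation.
    return scale * (V @ R.T)[:, :2] + t
```

The face key points of the predicted two-dimensional image would then be read off at the projected positions of the landmark vertices.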
It can be understood that the embodiment of the present application is essentially an iterative training process. In each iteration, the parameters of the initial face reconstruction model are adjusted and updated, and the updated model is then used for the next iteration; when the model parameters are updated, the projection parameters are updated as well.
S207, constructing a cost function based on the predicted three-dimensional face model, the standard three-dimensional face model, the smooth three-dimensional face model, the face key points in the predicted two-dimensional image and the face key points in the sample two-dimensional image;
illustratively, on the basis of fig. 3, as shown in fig. 4, step S207 may include S2071, S2072, S2073 and S2074.
S2071, constructing a first cost function based on each vertex in the predicted three-dimensional face model and each vertex in the standard three-dimensional face model;
$L_1 = \frac{1}{N}\sum_{i=1}^{N} \|V_i - V_{GT,i}\|^2$
where N is the number of vertices, $V_i$ is the ith vertex in the predicted three-dimensional face model, and $V_{GT,i}$ is the ith vertex in the standard three-dimensional face model.
It is understood that the predicted three-dimensional face model is obtained by the initial face reconstruction model directly predicting from the sample two-dimensional image, whereas the standard three-dimensional face model is obtained by performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image and then smoothing. The standard three-dimensional face model is therefore more accurate and smoother than the predicted three-dimensional face model, both in model shape and in surface smoothness. A first cost function is then constructed based on the vertices of the predicted three-dimensional face model and the standard three-dimensional face model, and the internal parameters of the initial face reconstruction model are adjusted and trained based on this first cost function, which improves the accuracy and the smoothing effect of the model reconstructed by the initial face reconstruction model.
S2072, constructing a second cost function based on each vertex in the predicted three-dimensional face model and each vertex in the smooth three-dimensional face model;
$L_2 = \frac{1}{N}\sum_{i=1}^{N} \|V_i - \bar{V}_i\|^2$
where N is the number of vertices, $V_i$ is the ith vertex in the predicted three-dimensional face model, and $\bar{V}_i$ is the ith vertex in the smooth three-dimensional face model.
It is understood that the predicted three-dimensional face model is obtained by the initial face reconstruction model directly predicting from the sample two-dimensional image, while the smooth three-dimensional face model is obtained by smoothing the predicted three-dimensional face model; the smooth three-dimensional face model therefore has a significantly better smoothing effect than the predicted three-dimensional face model. A second cost function is then constructed based on the vertices of the predicted three-dimensional face model and the smooth three-dimensional face model, and the internal parameters of the initial face reconstruction model are adjusted and trained based on this second cost function, which optimizes the smoothing effect of the model reconstructed by the initial face reconstruction model.
S2073, constructing a third cost function based on the human face key points in the predicted two-dimensional image and the human face key points in the sample two-dimensional image;
$L_3 = \frac{1}{M}\sum_{i=1}^{M} \|lmk_i - lmk_{GT,i}\|^2$
where M is the number of face key points, $lmk_i$ is the ith face key point in the predicted two-dimensional image, and $lmk_{GT,i}$ is the ith face key point in the sample two-dimensional image.
It is understood that the predicted two-dimensional image is obtained by two-dimensional projection of the predicted three-dimensional face model, so its face key points may differ considerably from those of the sample two-dimensional image. A third cost function is therefore constructed based on the face key points in the predicted two-dimensional image and the face key points in the sample two-dimensional image, and the internal parameters of the initial face reconstruction model are adjusted and trained based on this third cost function, which optimizes the accuracy of the model reconstructed by the initial face reconstruction model and brings the reconstructed three-dimensional face model closer to the standard three-dimensional face model corresponding to the sample two-dimensional image.
S2074, carrying out weighted summation on the first cost function, the second cost function and the third cost function to obtain a cost function.
Specifically, different weight values are set for the first cost function, the second cost function, and the third cost function, and weighted summation is performed to obtain a total cost function including the three cost functions.
Further, in the embodiment of the present application, a cost function based on MSE (mean squared error) is used for calculation and back propagation. Alternatively, a cost function in another form than MSE may also be used, and the embodiment of the present application is not limited thereto. For example, a cost function based on SAD (sum of absolute differences) may be used, in which case the first cost function, the second cost function, and the third cost function may be respectively expressed as:
first cost function:
$L_1 = \frac{1}{N}\sum_{i=1}^{N} |V_i - V_{GT,i}|$
the second cost function:
$L_2 = \frac{1}{N}\sum_{i=1}^{N} |V_i - \bar{V}_i|$
third cost function:
$L_3 = \frac{1}{M}\sum_{i=1}^{M} |lmk_i - lmk_{GT,i}|$
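The three cost functions and their weighted sum (step S2074) can be sketched directly in numpy. The MSE and SAD forms follow the formulas above; the function names and default weights below are illustrative assumptions.

```python
import numpy as np

def mse_cost(a, b):
    # (1/N) * sum_i ||a_i - b_i||^2, matching the MSE cost functions above.
    return float(np.mean(np.sum((a - b) ** 2, axis=-1)))

def sad_cost(a, b):
    # Sum-of-absolute-differences variant, averaged over points.
    return float(np.mean(np.sum(np.abs(a - b), axis=-1)))

def total_cost(V, V_gt, V_smooth, lmk, lmk_gt,
               beta=1.0, mu=0.5, lam=0.5, cost=mse_cost):
    # Weighted sum L = beta*L1 + mu*L2 + lam*L3 of the three cost functions.
    L1 = cost(V, V_gt)          # predicted vs. standard model vertices
    L2 = cost(V, V_smooth)      # predicted vs. smoothed model vertices
    L3 = cost(lmk, lmk_gt)      # projected vs. sample-image key points
    return beta * L1 + mu * L2 + lam * L3
```

Swapping `cost=sad_cost` switches the whole objective to the SAD variant without touching the weighting.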
s208, training the initial face reconstruction model based on the cost function to obtain a trained face reconstruction model;
in the embodiment of the present application, steps S204 to S208 are iterative loop training processes, and steps S201 to S203 are used to provide training data for the iterative loop training processes of steps S204 to S208.
Please refer to fig. 5, which is a flowchart illustrating a standard three-dimensional face model reconstruction method according to an embodiment of the present disclosure.
As shown in fig. 5, a standard three-dimensional face model can be obtained by extracting face key points from a sample two-dimensional image, performing constrained reconstruction on an average face three-dimensional model by using the face key points extracted from the sample two-dimensional image, and performing smoothing.
Please refer to fig. 6, which is a flowchart illustrating training a face reconstruction model according to an embodiment of the present disclosure.
As shown in fig. 6, the sample two-dimensional image is input into an initial face reconstruction model, which predicts a predicted three-dimensional face model, and a first cost function is constructed based on the predicted three-dimensional face model and the standard three-dimensional face model obtained by the flowchart shown in fig. 5. The predicted three-dimensional face model is smoothed to obtain a smooth three-dimensional face model, and a second cost function is constructed based on the smooth three-dimensional face model and the predicted three-dimensional face model. The predicted three-dimensional face model is projected to two dimensions using the projection parameters given by the initial face reconstruction model to obtain a predicted two-dimensional image, and the face key points in the predicted two-dimensional image are extracted (face key points 2 in the figure; face key points 1 are the face key points in the sample two-dimensional image); a third cost function is constructed based on face key points 2 in the predicted two-dimensional image and face key points 1 in the sample two-dimensional image. The first cost function, the second cost function, and the third cost function are weighted and summed, and the initial face reconstruction model is updated. The above constitutes one complete training pass; training is iterative, and the end-of-training criterion may be that the number of iterations reaches a preset number or that the weighted sum of the three cost functions falls below a preset threshold. The face reconstruction model is obtained after training ends.
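A gradient-descent caricature of this iterative training loop, in numpy: the "model output" V is optimized directly instead of network weights, the smoothed copy is recomputed and held fixed each iteration (serving as the second target), and the landmark term is omitted for brevity. Everything here, including names and hyperparameters, is a simplifying assumption and not the patent's actual implementation.

```python
import numpy as np

def smooth_once(V, neighbors, alpha=0.5):
    # Blend each vertex with the mean of its first-order neighbors.
    out = np.empty_like(V)
    for i, nbrs in enumerate(neighbors):
        out[i] = alpha * V[i] + (1.0 - alpha) * V[nbrs].mean(axis=0)
    return out

def train_toy(V_gt, neighbors, beta=1.0, mu=0.5, lr=0.05, n_iters=200, seed=0):
    rng = np.random.default_rng(seed)
    V = V_gt + rng.normal(scale=0.5, size=V_gt.shape)  # rough initial prediction
    for _ in range(n_iters):
        V_s = smooth_once(V, neighbors)  # fixed smoothing target this step
        # gradient of beta*||V - V_gt||^2 + mu*||V - V_s||^2 with respect to V
        grad = 2.0 * beta * (V - V_gt) + 2.0 * mu * (V - V_s)
        V = V - lr * grad
    return V
```

Because both terms pull V toward targets, the iterates contract toward a shape that is both close to the ground truth and self-consistent under smoothing, mirroring the role of the first and second cost functions.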
S209, acquiring a target two-dimensional image;
s210, inputting the target two-dimensional image into a trained human face reconstruction model, and outputting a three-dimensional human face model corresponding to the target two-dimensional image.
In the embodiment of the application, an initial face reconstruction model is first created and a sample two-dimensional image is collected. The face key points in the sample two-dimensional image are extracted, an average face three-dimensional model is acquired, and constrained reconstruction and smoothing are performed on the average face three-dimensional model based on the face key points in the sample two-dimensional image to obtain a standard three-dimensional face model corresponding to the sample two-dimensional image. The sample two-dimensional image is then input into the initial face reconstruction model for prediction to obtain a predicted three-dimensional face model, which is smoothed to obtain a smooth three-dimensional face model; the predicted three-dimensional face model is projected to two dimensions to obtain a predicted two-dimensional image, and the face key points in the predicted two-dimensional image are extracted. Finally, three cost functions are constructed based on the predicted three-dimensional face model, the standard three-dimensional face model, the smooth three-dimensional face model, the face key points in the predicted two-dimensional image, and the face key points in the sample two-dimensional image, and the initial face reconstruction model is iteratively trained using these three cost functions to obtain the trained face reconstruction model. The trained face reconstruction model can then reconstruct, from a target two-dimensional image, a three-dimensional face model with high accuracy and a remarkable smoothing effect, and the reconstructed three-dimensional face model is output.
By adopting the three-dimensional face model reconstruction method provided by the embodiment of the application, the face reconstruction model is trained with both the smoothing effect and the accuracy in view, so the trained face reconstruction model can predict, from a target two-dimensional image, a three-dimensional face model with a remarkable smoothing effect and high accuracy. The face reconstruction model only needs to be trained once and can then be reused for any target two-dimensional image.
Fig. 7 is a schematic flow chart of a three-dimensional face model reconstruction method according to an embodiment of the present application. As shown in fig. 7, the three-dimensional face model reconstruction method includes the following steps.
S301, acquiring a target two-dimensional image;
s302, inputting the target two-dimensional image into a trained human face reconstruction model, and outputting a three-dimensional human face model corresponding to the target two-dimensional image;
the face reconstruction model is generated based on cost function training, the cost function comprises a first cost function, a second cost function and a third cost function, the first cost function is obtained based on a predicted three-dimensional face model corresponding to a sample two-dimensional image and a standard three-dimensional face model corresponding to the sample two-dimensional image, the second cost function is obtained based on a smooth three-dimensional face model obtained after smoothing processing is carried out on the predicted three-dimensional face model and the predicted three-dimensional face model, and the third cost function is obtained based on a face key point obtained after two-dimensional projection is carried out on the predicted three-dimensional face model and a face key point in the sample two-dimensional image;
the prediction three-dimensional face model is obtained by predicting the sample two-dimensional image based on the created initial face reconstruction model, and the standard three-dimensional face model is obtained by performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image.
S303, smoothing the three-dimensional face model to obtain a target three-dimensional face model.
Specifically, the method comprises the following steps:
$V_i'' = \alpha V_i' + (1 - \alpha)\frac{1}{|N(V_i')|}\sum_{j \in N(V_i')} V_j'$
where $V_i'$ is the ith three-dimensional vertex in the three-dimensional face model before smoothing, $N(V_i')$ is the set of first-order neighborhood points directly adjacent to $V_i'$, i denotes the ith three-dimensional vertex, j denotes the index of a first-order neighborhood point, $V_i''$ is the ith three-dimensional vertex in the target three-dimensional face model after smoothing, and α is the smoothing coefficient, α ∈ (0, 1).
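A minimal numpy sketch of this smoothing pass. The direction of the α blend (α on the original vertex, 1−α on the neighborhood mean) is recovered from context and should be treated as an assumption.

```python
import numpy as np

def laplacian_smooth(V, neighbors, alpha=0.5):
    # V: (N, 3) float array of vertices before smoothing.
    # neighbors[i]: indices of the first-order neighborhood points of vertex i.
    # Returns V'' with V''_i = alpha*V'_i + (1-alpha)*mean(first-order neighbors).
    out = np.empty_like(V)
    for i, nbrs in enumerate(neighbors):
        out[i] = alpha * V[i] + (1.0 - alpha) * V[nbrs].mean(axis=0)
    return out
```

Repeating the pass increases the smoothing strength; a curvature-based or Taubin scheme would replace the plain neighborhood average.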
By adopting the three-dimensional face model reconstruction method provided by the embodiment of the application, the acquired target two-dimensional image is input into a face reconstruction model generated based on cost function training, and the three-dimensional face model corresponding to the target two-dimensional image is output. The face reconstruction model is generated using a first cost function constructed from the predicted three-dimensional face model corresponding to a sample two-dimensional image and the standard three-dimensional face model corresponding to that sample two-dimensional image, a second cost function constructed from the smooth three-dimensional face model obtained by smoothing the predicted three-dimensional face model and the predicted three-dimensional face model, and a third cost function constructed from the face key points obtained after two-dimensional projection of the predicted three-dimensional face model and the face key points in the sample two-dimensional image. Because smoothing is built into the training process, the three-dimensional face model reconstructed with the face reconstruction model has a remarkable smoothing effect, and a well-smoothed three-dimensional face model corresponding to the target two-dimensional image can be output without extra smoothing computation during application of the face reconstruction model. After the face reconstruction model outputs the three-dimensional face model, the output model is smoothed once more to obtain the target three-dimensional face model, further enhancing the smoothing effect.
Optionally, stylization processing is added in the training process of the face reconstruction model, so that the face reconstruction model can reconstruct a stylized three-dimensional face model.
Please refer to fig. 8, which is a flowchart illustrating a three-dimensional human face model reconstruction method according to an embodiment of the present application. As shown in fig. 8, the three-dimensional face model reconstruction method includes the following steps.
S401, collecting a two-dimensional image of a sample;
s402, performing stylization processing on the sample two-dimensional image to obtain a stylized sample two-dimensional image, and extracting key points of the human face in the stylized sample two-dimensional image;
the stylization processing refers to converting the sample two-dimensional image into a specific stylized sample two-dimensional image, such as a sketch portrait style, a cartoon image (animation) style, an oil painting style and the like.
Fig. 9 is a schematic diagram illustrating an exemplary stylization process according to an embodiment of the disclosure.
As shown in fig. 9, the sample two-dimensional image is stylized by the cartoon character to obtain a stylized sample two-dimensional image of the cartoon character as shown in the figure.
S403, acquiring an average face three-dimensional model;
s404, performing constrained reconstruction on the average face three-dimensional model based on the face key points in the stylized sample two-dimensional image to obtain a standard three-dimensional face model corresponding to the sample two-dimensional image;
please refer to fig. 10, which is a schematic diagram illustrating an example of reconstructing a three-dimensional face model by using face key point constraint according to an embodiment of the present application.
As shown in fig. 10, in the process of reconstructing the three-dimensional face model, the average face three-dimensional model is constrained by the face key points in the stylized sample two-dimensional image, so as to obtain a standard three-dimensional face model corresponding to the stylized sample two-dimensional image.
S405, creating an initial face reconstruction model, and inputting the sample two-dimensional image into the initial face reconstruction model for prediction to obtain a predicted three-dimensional face model;
s406, smoothing the predicted three-dimensional face model to obtain a smooth three-dimensional face model;
s407, performing two-dimensional projection on the predicted three-dimensional face model to obtain a predicted two-dimensional image, and extracting face key points in the predicted two-dimensional image;
s408, constructing a cost function based on the predicted three-dimensional face model, the standard three-dimensional face model, the smooth three-dimensional face model, the face key points in the predicted two-dimensional image and the face key points in the stylized sample two-dimensional image;
s409, training the initial face reconstruction model based on the cost function to obtain a trained face reconstruction model;
s410, acquiring a target two-dimensional image;
s411, inputting the target two-dimensional image into a trained human face reconstruction model, and outputting a stylized three-dimensional human face model corresponding to the target two-dimensional image.
By adopting the three-dimensional face model reconstruction method provided by the embodiment of the application, stylized processing is added in the process of training the face reconstruction model, the trained face reconstruction model based on the smoothing effect and the accuracy can output the stylized three-dimensional face model, and the interestingness and the functionality of the reconstruction of the three-dimensional face model are increased.
Fig. 11 is a schematic structural diagram of a three-dimensional human face model reconstruction apparatus according to an embodiment of the present application. As shown in fig. 11, the three-dimensional face model reconstruction apparatus 1 may be implemented by software, hardware or a combination of both as all or part of a computer device. According to some embodiments, the three-dimensional human face model reconstruction apparatus 1 includes an image acquisition module 11 and a model prediction module 12, and specifically includes:
the image acquisition module 11 is used for acquiring a target two-dimensional image;
the model prediction module 12 is configured to input the target two-dimensional image into a trained face reconstruction model, and output a three-dimensional face model corresponding to the target two-dimensional image;
the face reconstruction model is generated based on cost function training, the cost function comprises a first cost function, a second cost function and a third cost function, the first cost function is obtained based on a predicted three-dimensional face model corresponding to a sample two-dimensional image and a standard three-dimensional face model corresponding to the sample two-dimensional image, the second cost function is obtained based on a smooth three-dimensional face model obtained after smoothing processing is carried out on the predicted three-dimensional face model and the predicted three-dimensional face model, and the third cost function is obtained based on a face key point obtained after two-dimensional projection is carried out on the predicted three-dimensional face model and a face key point in the sample two-dimensional image;
the prediction three-dimensional face model is obtained by predicting the sample two-dimensional image based on the created initial face reconstruction model, and the standard three-dimensional face model is obtained by performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image.
Optionally, the model prediction module 12 is specifically configured to:
inputting the target two-dimensional image into a trained human face reconstruction model, and outputting a stylized three-dimensional human face model corresponding to the target two-dimensional image;
the face reconstruction model is generated based on cost function training, the cost function comprises a first price function, a second price function and a third price function, the first price function is obtained based on a predicted stylized three-dimensional face model corresponding to a sample two-dimensional image and a standard stylized three-dimensional face model corresponding to the sample two-dimensional image, the second price function is obtained based on a smoothed stylized three-dimensional face model obtained after smoothing processing is carried out on the predicted stylized three-dimensional face model and the predicted stylized three-dimensional face model, and the third price function is obtained based on a face key point obtained after two-dimensional projection is carried out on the predicted stylized three-dimensional face model and a face key point obtained in the stylized sample two-dimensional image corresponding to the sample two-dimensional image;
the standard stylized three-dimensional face model is obtained by performing constrained reconstruction on an average face three-dimensional model based on a face key point in a stylized sample two-dimensional image corresponding to the sample two-dimensional image.
Optionally, please refer to fig. 12, which is a schematic structural diagram of a three-dimensional human face model reconstruction apparatus according to an embodiment of the present application. As shown in fig. 12, the apparatus further includes:
and the smoothing processing module 13 is configured to perform smoothing processing on the three-dimensional face model to obtain a target three-dimensional face model.
Optionally, the smoothing module 13 is specifically configured to:
$V_i'' = \alpha V_i' + (1 - \alpha)\frac{1}{|N(V_i')|}\sum_{j \in N(V_i')} V_j'$
where $V_i'$ is the ith three-dimensional vertex in the three-dimensional face model before smoothing, $N(V_i')$ is the set of first-order neighborhood points directly adjacent to $V_i'$, i denotes the ith three-dimensional vertex, j denotes the index of a first-order neighborhood point, $V_i''$ is the ith three-dimensional vertex in the target three-dimensional face model after smoothing, and α is the smoothing coefficient, α ∈ (0, 1).
Optionally, as shown in fig. 12, the apparatus further comprises a model training module 14.
Optionally, please refer to fig. 13, which is a schematic structural diagram of a model training module provided in the embodiment of the present application. As shown in fig. 13, the model training module 14 includes:
an image acquisition unit 141, configured to create an initial face reconstruction model and acquire a sample two-dimensional image;
a first key point extracting unit 142, configured to extract a face key point in the sample two-dimensional image, and obtain an average face three-dimensional model;
a model reconstruction unit 143, configured to perform constrained reconstruction on the average face three-dimensional model based on a face key point in the sample two-dimensional image, to obtain a standard three-dimensional face model corresponding to the sample two-dimensional image;
a model prediction unit 144, configured to input the sample two-dimensional image into the initial face reconstruction model for prediction to obtain a predicted three-dimensional face model;
a smoothing unit 145, configured to perform smoothing on the predicted three-dimensional face model to obtain a smoothed three-dimensional face model;
a second key point extracting unit 146, configured to perform two-dimensional projection on the predicted three-dimensional face model to obtain a predicted two-dimensional image, and extract a face key point in the predicted two-dimensional image;
a cost function constructing unit 147, configured to construct a cost function based on the predicted three-dimensional face model, the standard three-dimensional face model, the smooth three-dimensional face model, the face key points in the predicted two-dimensional image, and the face key points in the sample two-dimensional image;
and the model training unit 148 is configured to train the initial face reconstruction model based on the cost function to obtain a trained face reconstruction model.
Optionally, the first keypoint extracting unit 142 is specifically configured to:
performing stylization processing on the sample two-dimensional image to obtain a stylized sample two-dimensional image, and extracting key points of the human face in the stylized sample two-dimensional image;
the cost function constructing unit 147 is specifically configured to:
and constructing a cost function based on the predicted three-dimensional face model, the standard three-dimensional face model, the smooth three-dimensional face model, the face key points in the predicted two-dimensional image and the face key points in the stylized sample two-dimensional image.
Optionally, the second keypoint extracting unit 146 is specifically configured to:
acquiring projection parameters in the initial face reconstruction model, performing two-dimensional projection on the predicted three-dimensional face model based on the projection parameters to obtain a predicted two-dimensional image, and extracting face key points in the predicted two-dimensional image.
Optionally, please refer to fig. 14, which is a schematic structural diagram of a cost function constructing unit provided in the embodiment of the present application.
As shown in fig. 14, the cost function constructing unit 147 includes:
a first constructing subunit 1471, configured to construct a first cost function based on each vertex in the predicted three-dimensional face model and each vertex in the standard three-dimensional face model:
$L_1 = \frac{1}{N}\sum_{i=1}^{N} \|V_i - V_{GT,i}\|^2$
where N is the number of vertices, $V_i$ is the ith vertex in the predicted three-dimensional face model, and $V_{GT,i}$ is the ith vertex in the standard three-dimensional face model;
a second constructing subunit 1472, configured to construct a second cost function based on the vertices in the predicted three-dimensional face model and the vertices in the smoothed three-dimensional face model:
$L_2 = \frac{1}{N}\sum_{i=1}^{N} \|V_i - \bar{V}_i\|^2$
where N is the number of vertices, $V_i$ is the ith vertex in the predicted three-dimensional face model, and $\bar{V}_i$ is the ith vertex in the smooth three-dimensional face model;
a third constructing subunit 1473, configured to construct a third cost function based on the face keypoint in the predicted two-dimensional image and the face keypoint in the sample two-dimensional image:
$L_3 = \frac{1}{M}\sum_{i=1}^{M} \|lmk_i - lmk_{GT,i}\|^2$
where M is the number of face key points, $lmk_i$ is the ith face key point in the predicted two-dimensional image, and $lmk_{GT,i}$ is the ith face key point in the sample two-dimensional image;
a weighted summation subunit 1474, configured to perform weighted summation on the first cost function, the second cost function, and the third cost function to obtain a cost function.
Optionally, the weighted summation subunit 1474 is specifically configured to:
$L = \beta \cdot L_1 + \mu \cdot L_2 + \lambda \cdot L_3$
where β is the weight corresponding to the first cost function $L_1$, μ is the weight corresponding to the second cost function $L_2$, and λ is the weight corresponding to the third cost function $L_3$.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
By adopting the three-dimensional face model reconstruction method provided by the embodiment of the application, the acquired target two-dimensional image is input into a face reconstruction model generated based on cost function training, and then a three-dimensional face model corresponding to the target two-dimensional image is output, wherein the face reconstruction model is generated by using a first cost function constructed based on a predicted three-dimensional face model corresponding to a sample two-dimensional image and a standard three-dimensional face model corresponding to the sample two-dimensional image, a second cost function constructed based on a smooth three-dimensional face model after smoothing processing of the predicted three-dimensional face model and the predicted three-dimensional face model, and a third cost function constructed based on face key points after two-dimensional projection of the predicted three-dimensional face model and face key points in the sample two-dimensional image. 
Because smoothing is incorporated into the training process, the three-dimensional face model reconstructed by the face reconstruction model exhibits a marked smoothing effect: a well-smoothed three-dimensional face model corresponding to the target two-dimensional image can be output without adding extra smoothing computation when the face reconstruction model is applied. In addition, after the face reconstruction model outputs the three-dimensional face model, the output model may be smoothed again to obtain the target three-dimensional face model, further improving the smoothing effect. Optionally, stylization is added when training the face reconstruction model; a face reconstruction model trained for both smoothness and accuracy can then output a stylized three-dimensional face model, increasing the interest and functionality of three-dimensional face model reconstruction.
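The post-output smoothing step mentioned above (and specified more precisely in the claims as a first-order-neighborhood blend with coefficient α ∈ (0, 1)) can be sketched as follows; the adjacency-list representation and function name are assumptions for illustration:

```python
import numpy as np

def smooth_mesh(verts, neighbors, alpha=0.5):
    """One pass of first-order Laplacian smoothing over a face mesh.

    verts: (N, 3) vertex positions of the three-dimensional face model.
    neighbors: list where neighbors[i] holds the indices of the vertices
    directly adjacent to vertex i (its first-order neighborhood).
    alpha: smoothing coefficient in (0, 1); larger values pull each
    vertex harder toward the centroid of its neighbors.
    """
    out = np.empty_like(verts)
    for i, nbrs in enumerate(neighbors):
        centroid = verts[nbrs].mean(axis=0)  # first-order neighborhood centroid
        out[i] = (1.0 - alpha) * verts[i] + alpha * centroid
    return out
```

Applying several passes, or increasing α, yields a progressively smoother mesh at the cost of shrinking fine detail, which is why the weight on the smoothness cost term is balanced against the accuracy terms during training.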
An embodiment of the present application further provides a computer storage medium. The computer storage medium may store a plurality of instructions suitable for being loaded by a processor to execute the three-dimensional face model reconstruction method of the embodiments shown in fig. 1 to 10; for the specific execution process, refer to the descriptions of those embodiments, which are not repeated here.

The present application further provides a computer program product storing at least one instruction, the at least one instruction being loaded by a processor to execute the three-dimensional face model reconstruction method of the embodiments shown in fig. 1 to 10; for the specific execution process, refer to the descriptions of those embodiments, which are not repeated here.
Referring to fig. 15, a block diagram of a computer device according to an exemplary embodiment of the present application is shown. The computer device in the present application may comprise one or more of the following components: a processor 110, a memory 120, an input device 130, an output device 140, and a bus 150. The processor 110, memory 120, input device 130, and output device 140 may be connected by a bus 150.
Processor 110 may include one or more processing cores. The processor 110 connects various parts of the computer device using various interfaces and lines, and performs the functions of the computer device 100 and processes data by executing instructions, programs, code sets, or instruction sets stored in the memory 120 and by calling data stored in the memory 120. Optionally, the processor 110 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 110 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and so on; the GPU renders and draws display content; and the modem handles wireless communication. It is understood that the modem may also not be integrated into the processor 110 and may instead be implemented by a separate communication chip.
The memory 120 may include Random Access Memory (RAM) or Read-Only Memory (ROM). Optionally, the memory 120 includes a non-transitory computer-readable medium. The memory 120 may be used to store instructions, programs, code sets, or instruction sets.
The input device 130 is used for receiving input instructions or data, and the input device 130 includes, but is not limited to, a keyboard, a mouse, a camera, a microphone, or a touch device. The output device 140 is used for outputting instructions or data, and the output device 140 includes, but is not limited to, a display device, a speaker, and the like. In the embodiment of the present application, the input device 130 may be a temperature sensor for acquiring the operating temperature of the computer device. The output device 140 may be a speaker for outputting audio signals.
In addition, those skilled in the art will appreciate that the configuration of the computer device shown in the above-described figures does not constitute a limitation on the computer device; a computer device may include more or fewer components than shown, combine certain components, or arrange the components differently. For example, the computer device may further include a radio frequency circuit, an input unit, a sensor, an audio circuit, a wireless fidelity (WiFi) module, a power supply, a Bluetooth module, and other components, which are not described herein again.
In the embodiment of the present application, the execution subject of each step may be the computer device described above. Optionally, the execution subject of each step is an operating system of the computer device. The operating system may be an android system, an IOS system, or another operating system, which is not limited in this embodiment of the present application.
In the computer device shown in fig. 15, the processor 110 may be configured to call the three-dimensional face model reconstruction program stored in the memory 120 and execute the program to implement the three-dimensional face model reconstruction method according to the various method embodiments of the present application.
It is clear to a person skilled in the art that the solution of the present application can be implemented by means of software and/or hardware. The terms "unit" and "module" in this specification refer to software and/or hardware that can perform a specific function independently or in cooperation with other components, where the hardware may be, for example, a Field-Programmable Gate Array (FPGA), an Integrated Circuit (IC), or the like.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative; the division of the units is only one kind of logical-function division, and other divisions are possible in actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through service interfaces, devices, or units, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program, which is stored in a computer-readable memory; the memory may include a flash disk, Read-Only Memory (ROM), Random Access Memory (RAM), a magnetic disk, an optical disc, and the like.
The above description is only an exemplary embodiment of the present disclosure, and the scope of the present disclosure should not be limited thereby. That is, all equivalent changes and modifications made in accordance with the teachings of the present disclosure are intended to be included within the scope of the present disclosure. Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (12)

1. A three-dimensional human face model reconstruction method is characterized by comprising the following steps:
acquiring a target two-dimensional image;
inputting the target two-dimensional image into a trained face reconstruction model, and outputting a three-dimensional face model corresponding to the target two-dimensional image;
the face reconstruction model is generated based on cost function training, the cost function comprises a first cost function, a second cost function and a third cost function, the first cost function is obtained based on a predicted three-dimensional face model corresponding to a sample two-dimensional image and a standard three-dimensional face model corresponding to the sample two-dimensional image, the second cost function is obtained based on a smooth three-dimensional face model obtained after smoothing processing is carried out on the predicted three-dimensional face model and the predicted three-dimensional face model, and the third cost function is obtained based on a face key point obtained after two-dimensional projection is carried out on the predicted three-dimensional face model and a face key point in the sample two-dimensional image;
the prediction three-dimensional face model is obtained by predicting the sample two-dimensional image based on the created initial face reconstruction model, and the standard three-dimensional face model is obtained by performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image.
2. The method according to claim 1, wherein the inputting the target two-dimensional image into a trained human face reconstruction model and outputting a three-dimensional human face model corresponding to the target two-dimensional image comprises:
inputting the target two-dimensional image into a trained human face reconstruction model, and outputting a stylized three-dimensional human face model corresponding to the target two-dimensional image;
the face reconstruction model is generated based on cost function training, the cost function comprises a first cost function, a second cost function and a third cost function, the first cost function is obtained based on a predicted stylized three-dimensional face model corresponding to a sample two-dimensional image and a standard stylized three-dimensional face model corresponding to the sample two-dimensional image, the second cost function is obtained based on a smoothed stylized three-dimensional face model obtained after smoothing processing is carried out on the predicted stylized three-dimensional face model and the predicted stylized three-dimensional face model, and the third cost function is obtained based on a face key point obtained after two-dimensional projection is carried out on the predicted stylized three-dimensional face model and a face key point in the stylized sample two-dimensional image corresponding to the sample two-dimensional image;
the standard stylized three-dimensional face model is obtained by performing constrained reconstruction on an average face three-dimensional model based on a face key point in a stylized sample two-dimensional image corresponding to the sample two-dimensional image.
3. The method according to claim 1, wherein after inputting the target two-dimensional image into a trained human face reconstruction model and outputting a three-dimensional human face model corresponding to the target two-dimensional image, the method further comprises:
and smoothing the three-dimensional face model to obtain a target three-dimensional face model.
4. The method of claim 3, wherein the smoothing the three-dimensional face model to obtain a target three-dimensional face model comprises:
Ṽ_i = (1 − α)·V′_i + α · (1/|N(i)|) · Σ_{j∈N(i)} V′_j

wherein V′_i is the i-th three-dimensional vertex in the three-dimensional face model before the smoothing processing, N(i) is the set of first-order neighborhood points directly adjacent to V′_i, i denotes the i-th three-dimensional vertex, j denotes the index of a first-order neighborhood point, Ṽ_i is the i-th three-dimensional vertex in the target three-dimensional face model after the smoothing processing, and α denotes a smoothing coefficient with α ∈ (0, 1).
5. The method of claim 1, wherein prior to acquiring the target two-dimensional image, further comprising:
collecting a two-dimensional image of a sample;
extracting key points of the human face in the sample two-dimensional image, and acquiring an average face three-dimensional model;
performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image to obtain a standard three-dimensional face model corresponding to the sample two-dimensional image;
creating an initial face reconstruction model, and inputting the sample two-dimensional image into the initial face reconstruction model for prediction to obtain a predicted three-dimensional face model;
smoothing the predicted three-dimensional face model to obtain a smooth three-dimensional face model;
performing two-dimensional projection on the predicted three-dimensional face model to obtain a predicted two-dimensional image, and extracting face key points in the predicted two-dimensional image;
constructing a cost function based on the predicted three-dimensional face model, the standard three-dimensional face model, the smooth three-dimensional face model, the face key points in the predicted two-dimensional image and the face key points in the sample two-dimensional image;
and training the initial face reconstruction model based on the cost function to obtain a trained face reconstruction model.
6. The method of claim 5, wherein the extracting key points of the face from the sample two-dimensional image comprises:
performing stylization processing on the sample two-dimensional image to obtain a stylized sample two-dimensional image, and extracting key points of the human face in the stylized sample two-dimensional image;
the constructing a cost function based on the predicted three-dimensional face model, the standard three-dimensional face model, the smooth three-dimensional face model, the face key points in the predicted two-dimensional image and the face key points in the sample two-dimensional image includes:
and constructing a cost function based on the predicted three-dimensional face model, the standard three-dimensional face model, the smooth three-dimensional face model, the face key points in the predicted two-dimensional image and the face key points in the stylized sample two-dimensional image.
7. The method of claim 5, wherein the two-dimensional projecting the predicted three-dimensional face model to obtain a predicted two-dimensional image and extracting face key points in the predicted two-dimensional image comprises:
acquiring projection parameters in the initial face reconstruction model, performing two-dimensional projection on the predicted three-dimensional face model based on the projection parameters to obtain a predicted two-dimensional image, and extracting face key points in the predicted two-dimensional image.
8. The method of claim 5, wherein constructing a cost function based on the predicted three-dimensional face model, the standard three-dimensional face model, the smoothed three-dimensional face model, the face keypoints in the predicted two-dimensional image, and the face keypoints in the sample two-dimensional image comprises:
constructing a first cost function based on each vertex in the predicted three-dimensional face model and each vertex in the standard three-dimensional face model:
L1 = (1/N) · Σ_{i=1}^{N} ||V_i − V_GT,i||

wherein N is the number of vertices, V_i is the i-th vertex in the predicted three-dimensional face model, and V_GT,i is the i-th vertex in the standard three-dimensional face model;
constructing a second cost function based on each vertex in the predicted three-dimensional face model and each vertex in the smooth three-dimensional face model:
L2 = (1/N) · Σ_{i=1}^{N} ||V_i − Ṽ_i||

wherein N is the number of vertices, V_i is the i-th vertex in the predicted three-dimensional face model, and Ṽ_i is the i-th vertex in the smooth three-dimensional face model;
constructing a third cost function based on the human face key points in the predicted two-dimensional image and the human face key points in the sample two-dimensional image:
L3 = (1/M) · Σ_{i=1}^{M} ||lmk_i − lmk_GT,i||

wherein M is the number of face keypoints, lmk_i is the i-th face keypoint in the predicted two-dimensional image, and lmk_GT,i is the i-th face keypoint in the sample two-dimensional image;
and carrying out weighted summation on the first cost function, the second cost function and the third cost function to obtain a cost function.
9. The method of claim 8, wherein the weighted summation of the first cost function, the second cost function, and the third cost function to obtain a cost function comprises:
L = β·L1 + μ·L2 + λ·L3

wherein β is the weight corresponding to the first cost function L1, μ is the weight corresponding to the second cost function L2, and λ is the weight corresponding to the third cost function L3.
10. An apparatus for reconstructing a three-dimensional face model, the apparatus comprising:
the image acquisition module is used for acquiring a target two-dimensional image;
the model prediction module is used for inputting the target two-dimensional image into a trained human face reconstruction model and outputting a three-dimensional human face model corresponding to the target two-dimensional image;
the face reconstruction model is generated based on cost function training, the cost function comprises a first cost function, a second cost function and a third cost function, the first cost function is obtained based on a predicted three-dimensional face model corresponding to a sample two-dimensional image and a standard three-dimensional face model corresponding to the sample two-dimensional image, the second cost function is obtained based on a smooth three-dimensional face model obtained after smoothing processing is carried out on the predicted three-dimensional face model and the predicted three-dimensional face model, and the third cost function is obtained based on a face key point obtained after two-dimensional projection is carried out on the predicted three-dimensional face model and a face key point in the sample two-dimensional image;
the prediction three-dimensional face model is obtained by predicting the sample two-dimensional image based on the created initial face reconstruction model, and the standard three-dimensional face model is obtained by performing constrained reconstruction on the average face three-dimensional model based on the face key points in the sample two-dimensional image.
11. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the steps of the method of any one of claims 1 to 9.
12. A computer device, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the steps of the method according to any of claims 1-9.
CN202111184666.3A 2021-10-11 2021-10-11 Three-dimensional face model reconstruction method and device, storage medium and computer equipment Pending CN113870420A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111184666.3A CN113870420A (en) 2021-10-11 2021-10-11 Three-dimensional face model reconstruction method and device, storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN113870420A true CN113870420A (en) 2021-12-31

Family

ID=78998529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111184666.3A Pending CN113870420A (en) 2021-10-11 2021-10-11 Three-dimensional face model reconstruction method and device, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN113870420A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114202654A (en) * 2022-02-17 2022-03-18 广东皓行科技有限公司 Entity target model construction method, storage medium and computer equipment
CN114663410A (en) * 2022-03-31 2022-06-24 清华大学 Heart three-dimensional model generation method, device, equipment and storage medium
CN117523136A (en) * 2023-11-13 2024-02-06 书行科技(北京)有限公司 Face point position corresponding relation processing method, face reconstruction method, device and medium
CN117523136B (en) * 2023-11-13 2024-05-14 书行科技(北京)有限公司 Face point position corresponding relation processing method, face reconstruction method, device and medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination