WO2020173442A1 - Computer application method and apparatus for generating a three-dimensional face model, computer device, and storage medium - Google Patents

Computer application method and apparatus for generating a three-dimensional face model, computer device, and storage medium

Info

Publication number
WO2020173442A1
WO2020173442A1 (PCT/CN2020/076650)
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional face
face image
model
sample
similarity
Prior art date
Application number
PCT/CN2020/076650
Other languages
English (en)
French (fr)
Inventor
陈雅静 (Chen Yajing)
宋奕兵 (Song Yibing)
凌永根 (Ling Yonggen)
暴林超 (Bao Linchao)
刘威 (Liu Wei)
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Priority to EP20763064.1A (EP3933783A4)
Publication of WO2020173442A1
Priority to US17/337,909 (US11636613B2)

Classifications

    • G06T 17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 7/50: Image analysis; depth or shape recovery
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/088: Neural network learning methods; non-supervised learning, e.g. competitive learning
    • G06T 7/75: Determining position or orientation of objects or cameras using feature-based methods involving models
    • G06V 40/169: Human faces; holistic features and representations, i.e. based on the facial image taken as a whole
    • G06V 40/171: Human faces; local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships
    • G06T 2200/08: Indexing scheme involving all processing steps from image acquisition to 3D model generation
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/30201: Subject of image; face
    • Y02T 10/40: Engine management systems

Definitions

  • This application relates to the field of computer technology, and in particular to a computer application method, device, computer equipment, and storage medium for generating a three-dimensional face model.
  • The technology of generating three-dimensional face models from images has been applied in many fields.
  • For example, the technology has been widely used in fields such as face recognition, public security, medical treatment, games, film, and entertainment.
  • In the related art, the 3D face model generation method usually extracts only the global features of the 2D face image and calculates 3D face model parameters based on those global features, so that the 3D face model can be computed according to the 3D face model parameters.
  • A computer application method for generating a three-dimensional face model, the method comprising: obtaining a two-dimensional face image; and
  • inputting the two-dimensional face image into a face model generation model, extracting the global features and local features of the two-dimensional face image through the face model generation model, acquiring three-dimensional face model parameters based on the global features and the local features, and outputting a three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters.
  • A computer application device for generating a three-dimensional face model, the device comprising: an acquisition module, configured to acquire a two-dimensional face image;
  • a calling module, configured to call the face model generation model; and
  • a generation module, configured to input the two-dimensional face image into the face model generation model, extract the global features and local features of the two-dimensional face image through the face model generation model, acquire three-dimensional face model parameters based on the global features and the local features, and output a three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters.
  • A computer device includes a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: inputting the two-dimensional face image into the face model generation model, extracting the global features and local features of the two-dimensional face image through the face model generation model, acquiring three-dimensional face model parameters based on the global features and the local features, and outputting a three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters.
  • One or more non-volatile computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the same steps.
  • Figure 1 is an implementation environment of a computer application method for generating a three-dimensional face model provided by an embodiment of the present application
  • Fig. 2 is a flowchart of a method for training a face model generation model provided by an embodiment of the present application
  • Fig. 3 is a schematic structural diagram of a face model generation model provided by an embodiment of the present application
  • Fig. 4 is a flowchart of a computer application method for generating a three-dimensional face model provided by an embodiment of the present application;
  • Figure 5 is a schematic structural diagram of a computer application device for generating a three-dimensional face model provided by an embodiment of the present application
  • FIG. 6 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • Fig. 7 is a schematic structural diagram of a server provided by an embodiment of the present application.

Detailed Description
  • FIG. 1 is an implementation environment of a computer application method for generating a three-dimensional face model provided by an embodiment of the present application.
  • the implementation environment may include at least one computer device.
  • Take the case where the implementation environment includes multiple computer devices as an example. The multiple computer devices can realize data interaction through a wired connection or through a wireless network connection, which is not limited in the embodiment of the present application.
  • the computer device 101 may generate a three-dimensional face model based on a two-dimensional face image, and obtain a three-dimensional face model corresponding to the two-dimensional face image. This process is a three-dimensional face reconstruction process.
  • The computer device 101 may store a face model generation model, and the computer device 101 may process a two-dimensional face image based on the stored face model generation model to implement the three-dimensional face model generation process.
  • the computer device 101 may also call the face model generation model on other computer devices to perform the process of face model generation when the face model generation is required, which is not limited in the embodiment of this application.
  • In the embodiment of the present application, the case where the computer device 101 stores a face model generation model is taken as an example.
  • the face model generation model may be trained on the computer device 101, or may be trained on other computer devices.
  • the other computer device may be the computer device 102.
  • the computer device 102 may also encapsulate the trained face model generation model and send it to the computer device 101, so that the computer device 101 can receive and store the trained face model generation model.
  • the embodiment of the application does not limit the training equipment for the face model generation model.
  • the computer device 101 can receive a face model generation request sent by other computer devices, and the face model generation request carries a two-dimensional face image.
  • After receiving the request, the computer device 101 can carry out the above step of generating a three-dimensional face model and send the generated three-dimensional face model to the other computer device.
  • the embodiments of the present application do not limit the specific implementation manner.
  • both the computer device 101 and the computer device 102 may be provided as a terminal or a server, which is not limited in the embodiment of the present application.
  • FIG. 2 is a flowchart of a method for training a face model generation model provided by an embodiment of the present application.
  • The face model generation model training method can be applied to a computer device, and the computer device may be the above-mentioned computer device 101 or the above-mentioned computer device 102, which is not limited in the embodiment of the present application.
  • the computer device may be a terminal or a server, which is not limited in the embodiment of the present application.
  • the method for training the face model generation model may include the following steps:
  • a computer device obtains a plurality of sample two-dimensional face images.
  • The computer device can train the initial model based on the sample two-dimensional face images to obtain the face model generation model, so that a three-dimensional face model can subsequently be generated based on the trained face model generation model.
  • The multiple sample two-dimensional face images may be stored in the computer device; when the computer device needs to train the face model generation model, it can obtain the multiple sample two-dimensional face images from local storage.
  • The multiple sample two-dimensional face images may also be stored on other computer equipment; when the computer device needs to train the face model generation model, it obtains the multiple sample two-dimensional face images from the other computer equipment.
  • the computer device may obtain multiple two-dimensional face images as sample images from an image database. The embodiment of the present application does not limit the acquisition method of the sample two-dimensional face image.
  • the training process of the face model generation model can adopt an unsupervised learning method.
  • the computer device can obtain multiple sample two-dimensional face images, and the model training process can be completed based on the sample two-dimensional face images.
  • the sample two-dimensional face image may not need to carry tag information.
  • the tag information generally refers to artificially synthesized three-dimensional face model parameters, which are pseudo-real data.
  • the computer device calls the initial model, and inputs the multiple sample two-dimensional face images into the initial model.
  • After the computer device obtains the sample two-dimensional face images, it can train the initial model based on them. Therefore, the computer device can call the initial model and input the sample two-dimensional face images into the initial model.
  • The model parameters in the initial model are initial values, and the computer device can adjust the model parameters based on the initial model's processing of the sample two-dimensional face images, so that the adjusted initial model can obtain, from a two-dimensional face image, a three-dimensional face model that is more similar to the two-dimensional face image and has better face details.
  • the initial model may be stored in the computer device, or may be stored in other computer devices.
  • the computer device obtains the initial model from other computer devices, which is not limited in the embodiment of the present application.
  • the initial model in the computer device extracts the global features and local features of the sample two-dimensional face image.
  • the initial model can process the input multiple sample two-dimensional face images. Specifically, the initial model may perform a three-dimensional face model generation step based on each sample two-dimensional face image to obtain a three-dimensional face model.
  • the initial model can first extract the features of the sample two-dimensional face image, and generate a three-dimensional face model based on the features of the sample two-dimensional face image,
  • the three-dimensional face model obtained in this way and the sample two-dimensional face image can have the same characteristics, and the two are more similar.
  • the initial model can obtain global features and local features of the sample two-dimensional face image, where the global features refer to all features obtained by feature extraction on the sample two-dimensional face image.
  • the local feature refers to the feature extracted from the local area of the sample two-dimensional face image.
  • the global feature may reflect all areas of the sample two-dimensional face image
  • the local features may reflect local areas of the sample two-dimensional face image, for example, the facial features of the face in the sample two-dimensional face image.
  • the local area may be eyes and nose, or eyes and mouth, of course, it may also be other areas, which is not limited in the embodiment of the present application.
  • In the embodiment of the present application, both global features and local features are taken into consideration. In this way, while gaining an overall grasp of the sample two-dimensional face image, the face details can be further optimized, so that the three-dimensional face model the computer device obtains by integrating global features and local features is better.
  • Step 1 The computer device may perform feature extraction on the sample two-dimensional face image based on multiple convolutional layers to obtain the global feature of the sample two-dimensional face image.
  • the computer device can extract the global features of the sample two-dimensional face image by performing multiple convolutions on the sample two-dimensional face image.
  • In a possible implementation, the initial model may use a Visual Geometry Group Face (VGG-Face) network, and the initial model may use multiple convolutional layers in the VGG-Face network to perform feature extraction on the sample two-dimensional face image.
  • the initial model may also be implemented by other face recognition networks, for example, FaceNet may be used, which is a face recognition network. This embodiment of the application does not limit which face recognition network the initial model uses.
  • this step one can be implemented by an encoder, and the global feature can be expressed in the form of a global feature vector.
  • This step one may be: the computer device may encode the sample two-dimensional face image based on multiple convolutional layers of the encoder to obtain the global feature vector of the sample two-dimensional face image.
  • the global feature vector may also be in the form of a feature map, for example, it may be in the form of a matrix, of course, it may also be in the form of other forms, such as an array, which is not limited in the embodiment of the present application.
  • In the process of the multiple convolutions, the previous convolutional layer processes the sample two-dimensional face image to obtain a feature map and inputs the feature map into the next convolutional layer, and the next convolutional layer continues to process the input feature map to obtain a new feature map.
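For illustration, a minimal sketch of the kind of multi-convolution global encoder described above, written in PyTorch; the layer count, channel sizes, input resolution, and feature dimension are illustrative assumptions, not the patent's exact VGG-Face configuration:

```python
import torch
import torch.nn as nn

class GlobalEncoder(nn.Module):
    """Sketch of the global encoder: stacked conv layers map a 2D face image
    to a global feature vector. Sizes are illustrative, not the patent's."""
    def __init__(self, feature_dim=256):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),    # conv1 block
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # conv2 block
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # conv3 block
        )
        self.fc = nn.Linear(256 * 28 * 28, feature_dim)  # assumes 224x224 input

    def forward(self, image):
        feature_maps = self.convs(image)         # each layer refines the previous feature map
        return self.fc(feature_maps.flatten(1))  # global feature vector
```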
  • Step 2 The computer device obtains the center position of the key point of the sample two-dimensional face image.
  • In a possible implementation, the computer device may perform key point detection on the sample two-dimensional face image to obtain the positions of the key points of the sample two-dimensional face image, where the key points may refer to the facial features and the facial contour of the face.
  • a human face can include 68 key points.
  • the computer device may obtain the center position of the key point based on the obtained position of the key point. For example, when the computer device obtains the positions of 68 key points, it can calculate the center positions of the 68 key points.
  • the process of performing key point detection by the computer device can be implemented by using any key point detection technology, which is not limited in the embodiment of the present application.
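The center-position computation in step 2 reduces to averaging the detected landmark coordinates; a minimal sketch assuming the 68 key points are given as (x, y) pairs:

```python
import numpy as np

def keypoint_center(landmarks):
    """landmarks: (68, 2) array of detected 2D key point positions."""
    return landmarks.mean(axis=0)  # center position of the 68 key points
```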
  • Step 3 Based on the center position, the computer device extracts some features from the features obtained by at least one target convolution layer in the multiple convolution layers as local features of the sample two-dimensional face image.
  • The above steps 2 and 3 are the process by which the computer device obtains the local features of the sample two-dimensional face image: based on the center position of the key points of the face, the computer device intercepts some features from the overall features of the sample two-dimensional face image, obtaining local features of the facial features or the facial contour parts. In this way, when the 3D face model is subsequently generated, the facial features or facial contour in the generated 3D face model can be processed in more detail based on the acquired local features.
  • As described above, in the multiple-convolution process each convolutional layer processes its input feature map and passes the resulting feature map to the next convolutional layer for further processing.
  • When the computer device obtains the local features, it can cut out some of the features from the feature map obtained by one or several of the above-mentioned multiple convolutional layers.
  • The one or several convolutional layers are the at least one target convolutional layer.
  • For example, the at least one target convolutional layer may be the convolutional layer conv2_2 and the convolutional layer conv3_3; the at least one target convolutional layer may be set or adjusted by a relevant technician, which is not limited in the embodiments of this application.
  • Partial feature extraction is performed on the features obtained by convolutional layers at different levels, so the obtained local features include both the low-level information and the high-level information of the face. The local features are therefore richer, which is ultimately reflected in more detailed face details in the three-dimensional face model.
  • different target convolutional layers may correspond to different target sizes.
  • In a possible implementation, the computer device obtains the feature map from the target convolutional layer and, taking the center position as the center, crops a feature map of the target size corresponding to that target convolutional layer as the local feature of the sample two-dimensional face image.
  • For example, taking the center position as the center, the computer device cuts a feature map of size 64x64 from the feature map obtained from conv2_2, and cuts a feature map of size 32x32 from the feature map obtained from conv3_3.
  • the feature map with a size of 64x64 and the feature map with a size of 32x32 may reflect the features corresponding to the facial features or facial contours of the face in the sample two-dimensional face image.
  • the target convolutional layer can be regarded as the convolutional layer in the local encoder, and the local features are obtained based on the local encoder.
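A hedged sketch of the cropping described in steps 2 and 3: a fixed-size window centered on the key-point center is cut from a target convolutional layer's feature map. Bounds handling and the scaling of the center to each layer's resolution are omitted simplifications:

```python
def crop_local_feature(feature_map, center, size):
    """Crop a (size x size) window centered at `center` from a feature map.

    feature_map: (C, H, W) tensor from a target convolutional layer
    center: (x, y) key-point center, assumed already scaled to this layer's
            resolution and far enough from the border (no bounds checking)
    """
    x, y = int(center[0]), int(center[1])
    half = size // 2
    return feature_map[:, y - half:y + half, x - half:x + half]

# e.g. a 64x64 patch from conv2_2 and a 32x32 patch from conv3_3:
# local_2 = crop_local_feature(conv2_2_map, center_at_conv2_2, 64)
# local_3 = crop_local_feature(conv3_3_map, center_at_conv3_3, 32)
```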
  • As mentioned above, step 1 can be implemented by an encoder, and the global feature can be expressed in the form of a global feature vector.
  • Similarly, in step 3 the local features, which are extracted from the target convolutional layers in the encoder, can also be expressed in the form of a local feature vector.
  • The third step may then be: the computer device extracts partial feature values of the global feature vector from the output of at least one target convolutional layer among the multiple convolutional layers of the encoder, and obtains the first local feature vector of the two-dimensional face image based on the partial feature values. Correspondingly, in the following step 204, the process of calculating the three-dimensional face model parameters can be implemented based on the first decoder; for details, refer to step 204 below, which is not repeated in the embodiment of the present application.
  • It should be noted that the vector composed of the partial feature values extracted from the global feature vector differs in form from the global feature vector obtained through the multiple convolutional layers in step 1 above, yet the three-dimensional face model parameters must later be calculated based on both the global feature vector and the local feature vector. Therefore, after extracting the partial feature values, the computer device can further process them based on the second decoder, so that the obtained local feature vector has the same form as the global feature vector and the two are easier to fuse when calculating the 3D face model parameters.
  • The process of obtaining the first local feature vector of the two-dimensional face image based on the partial feature values may be implemented based on the second decoder: the computer device extracts partial feature values from the global feature vector obtained by the at least one target convolutional layer, and decodes the extracted partial feature values based on the second decoder to obtain the first local feature vector of the two-dimensional face image.
  • the first local feature vector is used to combine with the global feature vector to obtain the three-dimensional face model parameters.
  • A human face may include multiple parts, for example, eyes, nose, and mouth; when the computer device extracts partial features from the global features, it can obtain partial features corresponding to multiple regions and then integrate them to obtain the local features of the sample two-dimensional face image.
  • It should be noted that the partial features obtained from different target convolutional layers may also each include partial features corresponding to the multiple regions; if partial feature extraction is performed on multiple target convolutional layers, the computer device also needs to integrate the partial features corresponding to the multiple target convolutional layers.
  • In a possible implementation, the computer device can extract partial feature values corresponding to multiple regions from the global feature vector in the target convolutional layer, and decode the extracted partial feature values based on the second decoder to obtain first local feature vectors of multiple regions, where each region corresponds to an organ part of the human face; the computer device can then splice the first local feature vectors of the multiple regions to obtain the first local feature vector of the two-dimensional face image.
  • For example, the computer device can obtain the features of the left eye, the right eye, and the mouth in conv2_2, and can likewise obtain them in conv3_3. In this way, the computer device acquires the local features of multiple regions at multiple levels.
  • The local features of each region at each level can correspond to a second decoder, and the computer device can decode the local features extracted from each region at each level to obtain the corresponding first local feature vectors.
  • The computer device may then splice the multiple first local feature vectors corresponding to the multiple regions together to obtain the first local feature vector of the two-dimensional face image.
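A minimal sketch of this per-region, per-level decoding and splicing, with one second decoder per (level, region) pair; the decoder structure (a single fully connected layer) and its output dimension are assumptions for illustration:

```python
import torch
import torch.nn as nn

class LocalDecoder(nn.Module):
    """Second decoder for one (level, region) pair: maps a cropped feature
    patch to a first local feature vector of fixed length."""
    def __init__(self, in_features, out_features=64):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)

    def forward(self, patch):             # patch: (B, C, h, w)
        return self.fc(patch.flatten(1))  # (B, out_features)

def splice_local_features(patches, decoders):
    """Decode each region's patch with its own decoder and splice the
    results into the first local feature vector of the whole image."""
    vectors = [dec(p) for p, dec in zip(patches, decoders)]
    return torch.cat(vectors, dim=1)
```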
  • the initial model in the computer device obtains three-dimensional face model parameters based on the global feature and the local feature.
  • In a possible implementation, the three-dimensional face model parameters can be three-dimensional morphable face model (3D Morphable Model, 3DMM) parameters.
  • When the global feature and the local feature are obtained based on the encoder as described above, the initial model can decode the global feature and the local feature based on the first decoder to obtain the three-dimensional face model parameters.
  • the global feature may be a global feature vector obtained by encoding
  • the local feature may be a first local feature vector obtained by encoding and decoding.
  • The computer device may decode the global feature vector and the first local feature vector based on the first decoder to obtain the three-dimensional face model parameters. In a possible implementation, the first decoder may include a fully connected layer, and the computer device may calculate the global feature vector and the first local feature vector based on the fully connected layer to obtain the three-dimensional face model parameters.
  • the parameters of the three-dimensional face model may include face information such as texture information, expression information, and shape information of the face.
  • the parameters of the three-dimensional face model may also include other face information, for example, posture information, which is not limited in the embodiment of the present application.
  • The texture, expression, and shape of the face can be known through the parameters of the three-dimensional face model. Therefore, the computer device may perform the following step 205 to process the parameters of the three-dimensional face model to obtain the three-dimensional face model.
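A minimal sketch of the first decoder as described: a fully connected layer computing the 3DMM parameters from the concatenated global and first local feature vectors. The feature dimensions and the split into shape, expression, and texture coefficient counts are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ParameterDecoder(nn.Module):
    """First decoder: one fully connected layer mapping the concatenated
    global feature vector and first local feature vector to 3DMM parameters
    (shape, expression, and texture coefficients; counts are illustrative)."""
    def __init__(self, global_dim=256, local_dim=384,
                 n_shape=199, n_expr=29, n_tex=199):
        super().__init__()
        self.fc = nn.Linear(global_dim + local_dim, n_shape + n_expr + n_tex)

    def forward(self, global_vec, local_vec):
        return self.fc(torch.cat([global_vec, local_vec], dim=1))
```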
  • the initial model in the computer device is based on the three-dimensional face model parameters, and outputs the three-dimensional face model corresponding to the sample two-dimensional face image.
  • The parameters of the three-dimensional face model include multiple types of face information, and the initial model can generate a three-dimensional face model based on these multiple types of face information, so that the face information of the generated three-dimensional face model is the same as the face information indicated by the three-dimensional face model parameters.
  • For example, the texture of the three-dimensional face model should be the same as the texture information included in the parameters, the expression of the face in the three-dimensional face model should be the same as the expression corresponding to the expression information included in the parameters, and likewise for the shape of the face; the details are not repeated here.
  • the three-dimensional face model may be a combination of multiple face models
  • the computer device may obtain the three-dimensional face model according to the parameters of the three-dimensional face model and the multiple initial face models.
  • Specifically, the computer device may calculate coefficients for the multiple face models from the three-dimensional face model parameters, apply the corresponding coefficients to the multiple initial face models to obtain multiple face models, and join the multiple face models to obtain the three-dimensional face model.
  • In a possible implementation, the three-dimensional face model may be determined based on the average face model and the three-dimensional face model parameters: the computer device may obtain the coefficients of multiple principal component parts from the three-dimensional face model parameters, use each coefficient as the weight of its principal component part, perform a weighted summation of the multiple principal component parts, and then sum the average face model with the weighted-sum result to obtain the final three-dimensional face model.
  • each principal component part may only refer to the shape of a human face, or may refer to texture, etc., which is not limited in the embodiment of the present application.
  • the process of generating a three-dimensional face model based on the three-dimensional face model parameters can also be implemented in other ways, and the embodiment of the present application does not limit the specific implementation method.
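The average-face reconstruction described above amounts to adding a coefficient-weighted sum of principal component parts to the mean model; a minimal sketch, using the shape component as the example:

```python
import numpy as np

def reconstruct_shape(mean_shape, principal_components, coefficients):
    """Weighted sum of principal component parts added to the average face.

    mean_shape:           (3N,) stacked vertices of the average face model
    principal_components: (3N, K) one column per principal component part
    coefficients:         (K,) weights taken from the 3DMM parameters
    """
    return mean_shape + principal_components @ coefficients
```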
  • The above steps 202 to 205 constitute the process of calling the initial model, inputting the multiple sample two-dimensional face images into the initial model, extracting the global features and local features of each sample two-dimensional face image in the initial model, obtaining the three-dimensional face model parameters based on the global features and local features, and outputting, based on the three-dimensional face model parameters, the three-dimensional face model corresponding to each sample two-dimensional face image. That is, the initial model processes each sample two-dimensional face image to obtain a corresponding three-dimensional face model.
  • the initial model in the computer device projects the three-dimensional face model to obtain a two-dimensional face image corresponding to the three-dimensional face model.
  • After the projection, the similarity between the three-dimensional face model and the input sample two-dimensional face image can be determined, so as to judge how good the current face model generation effect is and to measure the face model generation capability of the initial model.
  • Based on this, the model parameters of the initial model can be adjusted, and the adjustment can be stopped when the face model generation capability of the adjusted initial model meets the conditions, which completes the model training process.
  • Specifically, the 3D face model can be rendered as a 2D face image, and then the similarity between the rendered 2D face image and the input sample two-dimensional face image can be compared.
  • In a possible implementation, the rendering process may be: the initial model obtains shooting information of the sample two-dimensional face image based on the global feature, where the shooting information is used to indicate at least one of the shooting posture, the lighting, or the shooting background when the sample two-dimensional face image was shot; the initial model then projects the three-dimensional face model based on the shooting information to obtain the two-dimensional face image corresponding to the three-dimensional face model.
  • the content reflected by the shooting information needs to be derived from the entire sample two-dimensional face image, so the initial model can perform the shooting information acquisition step based on global features.
  • After the computer device obtains the global feature, it can derive the shooting information of the sample two-dimensional face image from the global feature; that is, it can know the posture in which the sample two-dimensional face image was taken, under what kind of lighting conditions it was captured, or what the shooting background of the sample two-dimensional face image is.
  • In this way, projecting the 3D face model with the same shooting posture, the same lighting, or the same shooting background can improve the comparability between the projected 2D face image and the input sample 2D face image, and can also make the obtained similarity more accurate.
  • the computer device may use orthogonal projection to project the three-dimensional face model, and may also use perspective projection to project the three-dimensional face model.
  • The process of orthogonal projection may be: according to the shooting information, the computer device rotates the three-dimensional face model to the shooting posture indicated in the shooting information, then uses orthogonal projection to project the three-dimensional face model to two dimensions, and calculates the pixel value of each pixel in the two-dimensional face image according to the normal vectors, texture information, and illumination model of the three-dimensional face model.
  • The pixel value may be a red-green-blue (RGB) color value.
  • The illumination model can adopt a spherical harmonics illumination model or a Phong reflection model; of course, other illumination models may also be used, which is not limited in the embodiment of the present application.
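A hedged sketch of this projection step: rotate the mesh to the estimated pose, drop the depth axis (orthogonal projection), and shade each vertex from its normal and texture with a first-order spherical-harmonics illumination model. Rasterization, visibility, and per-channel lighting are omitted simplifications:

```python
import numpy as np

def project_orthogonal(vertices, normals, albedo, rotation, sh_coeffs):
    """vertices, normals: (N, 3); albedo: (N, 3) RGB texture per vertex;
    rotation: (3, 3) from the shooting posture; sh_coeffs: (4,) lighting."""
    rotated = vertices @ rotation.T
    xy = rotated[:, :2]                    # orthogonal projection: discard depth
    n = normals @ rotation.T
    basis = np.concatenate([np.ones((len(n), 1)), n], axis=1)  # [1, nx, ny, nz]
    shading = basis @ sh_coeffs            # irradiance per vertex
    rgb = albedo * shading[:, None]        # shaded RGB pixel values
    return xy, rgb
```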
  • the initial model in the computer device acquires the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image.
  • After the two-dimensional face image is obtained, it can be compared with the input sample two-dimensional face image to determine whether the three-dimensional face model obtained after the initial model processes the sample two-dimensional face image can restore the features of the face in the sample two-dimensional face image.
  • the computer device can compare two face images from multiple angles to obtain the similarity of the two face images at multiple angles. Specifically, it is possible to focus not only on the low-level information of the face, for example, the shape, expression, texture, etc. of the face, but also the high-level semantic information of the face, for example, whether the identities of the faces in the two images are consistent.
  • the process of obtaining the similarity of the initial model in step 207 can be implemented through the following steps 1 to 4:
  • Step 1 The initial model in the computer device obtains the first similarity based on the positions of the key points of the two-dimensional face image corresponding to the three-dimensional face model and the key points corresponding to the sample two-dimensional face image.
  • the initial model can focus on the underlying information of the image, and the initial model can determine whether the key points of the faces in the two images are the same, so as to judge the similarity of the two images.
  • the first similarity may be determined based on a first loss function, and the initial model may be based on the first loss function, the two-dimensional face image corresponding to the three-dimensional face model, and the sample two-dimensional face Image, obtain the first similarity.
  • the first loss function may be a landmark loss (Landmark Loss) function.
  • Of course, the first loss function can also be another loss function; this is only an example for description, and the embodiment of the present application does not limit this.
  • In a possible implementation, the first similarity may be expressed as the L2 distance; that is, the first similarity may be the L2 loss, also called the Mean Squared Error (MSE). In other words, the initial model can calculate the differences between the positions of the corresponding key points of the two images and compute the expected value of the squares of those differences.
  • Of course, the first similarity may also be expressed in other ways, for example the L1 distance, which is not limited in the embodiment of the present application.
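A minimal sketch of the first similarity as an L2 (MSE) landmark loss, assuming both images' key points are given as (K, 2) tensors:

```python
import torch

def landmark_loss(pred_landmarks, gt_landmarks):
    """First similarity: mean squared error between corresponding key-point
    positions of the rendered image and the input sample image."""
    return torch.mean((pred_landmarks - gt_landmarks) ** 2)
```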
  • Step 2 The initial model in the computer device obtains the second similarity based on the pixel values of the pixels of the two-dimensional face image corresponding to the three-dimensional face model and the pixel values of the corresponding pixels of the sample two-dimensional face image.
  • the initial model can focus on the underlying information of the image, and the initial model can determine the difference in the pixel values of the pixels in the two images. If the difference is large, the similarity of the two images is low. If it is smaller, the similarity of the two images is higher.
  • the second similarity may be determined based on a second loss function, and the initial model may be based on the second loss function, the two-dimensional face image corresponding to the three-dimensional face model, and the sample two-dimensional face Image to obtain the second degree of similarity.
  • the second loss function may be a photometric loss (Photometric Loss) function.
  • the second loss function may also be other loss functions. This is only an example for illustration, and the embodiment of the present application does not limit this.
  • In a possible implementation, the second similarity may be expressed as the L2,1 distance; that is, the initial model may calculate the L2,1 distance between the pixel values of the corresponding pixels of the two images.
  • Of course, the second similarity can also be expressed in other ways, for example the L2 distance or the L1 distance, which is not limited in the embodiment of the present application.
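A minimal sketch of the second similarity as an L2,1 photometric loss: the L2 norm of the RGB difference is taken per pixel, then averaged (L1) over pixels. The optional face-region mask is an assumption for illustration:

```python
import torch

def photometric_loss(rendered, target, mask=None):
    """rendered, target: (B, 3, H, W); mask: optional (B, H, W) face region."""
    diff = torch.norm(rendered - target, dim=1)  # per-pixel L2 over RGB channels
    if mask is not None:
        return (diff * mask).sum() / mask.sum().clamp(min=1)
    return diff.mean()                           # L1 average over pixels
```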
  • Step 3: The initial model in the computer device matches the two-dimensional face image corresponding to the three-dimensional face model with the sample two-dimensional face image to obtain a third similarity, and the third similarity is used to indicate whether the identity of the face in the two-dimensional face image is the same as the identity of the face in the sample two-dimensional face image.
  • In this implementation, the initial model can focus on the high-level semantic information of the two images: it can determine whether the identities of the faces in the two images are the same and use this as a measure of the accuracy of the initial model's face reconstruction. This ensures that, after the three-dimensional face model is generated, the generated face is consistent with the identity of the face in the input two-dimensional face image; that is, performing face recognition on the two images correctly identifies the same identity, and the user's identity is not lost through the face reconstruction process.
  • In a possible implementation, the third similarity may be determined based on a face recognition model; that is, the third similarity is obtained by performing face recognition, through the face recognition model, on the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image.
  • the initial model may be based on a face recognition model, and face recognition is performed on the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image to obtain the third similarity.
  • Specifically, the initial model may call a face recognition model and input the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image into the face recognition model; the face recognition model performs face recognition on the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image, and outputs the third similarity.
  • the face recognition model may be a trained model, and the initial model may use the face recognition model to recognize the identity of the face in the image.
  • In a possible implementation, the process of acquiring the third similarity by the face recognition model may be implemented based on a third loss function; that is, the initial model may obtain the third similarity based on the third loss function, the two-dimensional face image corresponding to the three-dimensional face model, and the sample two-dimensional face image.
  • In a possible implementation, the third loss function may be a perceptual loss (Perceptual Loss) function.
  • the third loss function may also be other loss functions, which is only an example for description, and the embodiment of the present application does not limit this.
  • the face recognition model may be a VGG-Face network
  • the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image are input into the VGG-Face network
  • The multiple convolutional layers of the VGG-Face network can perform feature extraction on the two-dimensional face image and the sample two-dimensional face image respectively to obtain two feature vectors; the Euclidean distance between the two feature vectors can then be calculated and taken as the third similarity.
  • After the convolutional layers of the VGG-Face network perform multiple rounds of feature extraction, the FC7 layer can output the two feature vectors.
  • Using the VGG-Face network as the face recognition model has the advantage that the VGG-Face network is insensitive to lighting: it can separate the lighting color from the skin color, thereby learning a more natural skin color and more realistic lighting.
  • In addition, the facial structure of the generated 3D face model can be made more similar to the input 2D face image. Combining these two points, the method provided in this application is relatively robust to two-dimensional face images with different resolutions, different lighting conditions, and different backgrounds.
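A minimal sketch of the third similarity, assuming `face_net` is a frozen face recognition embedding network (VGG-Face features in the description above) and using the Euclidean distance between embeddings:

```python
import torch

def identity_similarity_loss(face_net, rendered, target):
    """Third similarity: distance between face-recognition embeddings of the
    rendered image and the input sample image."""
    with torch.no_grad():
        target_emb = face_net(target)    # sample image embedding (held fixed)
    rendered_emb = face_net(rendered)    # gradients flow back to the renderer
    return torch.norm(rendered_emb - target_emb, dim=1).mean()
```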
  • the computer device can also preprocess the two-dimensional face image. For example, it can perform face detection on the two-dimensional face image.
  • When the two-dimensional face image includes multiple faces, it can be cropped into multiple face images corresponding to the multiple faces, so that the above step of generating a three-dimensional face model is performed for each face image.
  • In a possible implementation, the initial model may obtain the first similarity, the second similarity, and the third similarity based on the first loss function, the second loss function, the third loss function, and the two-dimensional face image corresponding to the three-dimensional face model together with the sample two-dimensional face image.
  • It should be noted that, of the above steps 1 to 3, the computer device need not perform all of them; based on the settings in step 4, it can determine from which angles the initial model needs to measure the similarity of the two images, and then perform the corresponding steps among steps 1 to 3.
  • The execution order of steps 1 to 3 is arbitrary; that is, steps 1 to 3 can be arranged in any order, or the computer device can execute them at the same time. The embodiment of the present application does not limit the execution order of the three steps.
  • Step 4 The initial model in the computer device is based on at least one of the first similarity and the second similarity, and the third similarity, to obtain the two-dimensional face image corresponding to the three-dimensional face model and the The similarity of the sample two-dimensional face image.
  • this step four may include three situations:
  • In the first case, the initial model in the computer device obtains the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image based on the first similarity and the third similarity.
  • the initial model in the computer device may perform a weighted summation of the first similarity and the third similarity to obtain the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image. Similarity.
  • the embodiment of the present application does not limit the weights of multiple similarities.
  • In the second case, the initial model in the computer device obtains the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image based on the second similarity and the third similarity.
  • Specifically, the initial model in the computer device may perform a weighted summation of the second similarity and the third similarity to obtain the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image.
  • the embodiment of the present application does not limit the weights of multiple similarities.
  • In the third case, the initial model in the computer device obtains the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image based on the first similarity, the second similarity, and the third similarity.
  • Specifically, the initial model in the computer device may perform a weighted summation of the first similarity, the second similarity, and the third similarity to obtain the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image.
  • the embodiment of the present application does not limit the weights of multiple similarities.
  • In this implementation, the initial model takes into account both the low-level information and the high-level semantic information of the images, so that the analysis of the two images is more comprehensive and accurate. This ensures that the generated three-dimensional face model can accurately restore the low-level and high-level information of the input two-dimensional face image, with a high degree of restoration: the result is more similar to the original input image and more realistic.
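All three cases reduce to a weighted summation; a minimal sketch in which the weights are unspecified hyperparameters (the embodiment does not fix their values), and a weight of zero drops the corresponding similarity:

```python
def combined_similarity(l_landmark, l_photometric, l_identity,
                        weights=(1.0, 1.0, 1.0)):
    """Weighted summation of the first, second, and third similarities."""
    w1, w2, w3 = weights
    return w1 * l_landmark + w2 * l_photometric + w3 * l_identity
```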
  • the foregoing only provides three cases.
  • the initial model may also consider other angles to obtain the similarity of the two images.
  • For example, the initial model may also obtain a fourth similarity after obtaining the first local feature vector.
  • As noted above, the local feature of the sample two-dimensional face image is a first local feature vector, and the first local feature vector is determined based on partial feature values extracted from the global feature.
  • Specifically, the initial model may obtain a second local feature vector based on the first local feature vector, where the feature values of the second local feature vector have the same distribution as the partial feature values extracted from the global feature.
  • the second local feature vector is the reconstructed local feature vector.
  • the initial model may obtain the fourth similarity based on the distance between the second local feature vector and the corresponding partial feature value extracted from the global feature.
  • the fourth similarity degree may be determined based on a fourth loss function, and the initial model may be based on the fourth loss function, the second local feature vector, and corresponding partial features extracted from the global feature Value to obtain the fourth degree of similarity.
  • In a possible implementation, the fourth loss function may be a patch reconstruction loss (Patch Reconstruction Loss) function.
  • the fourth loss function may also be another loss function, which is not limited in the embodiment of the application.
  • In a possible implementation, the distance between the second local feature vector and the partial feature values extracted from the global feature may be the L1 distance; that is, the fourth similarity may be expressed as the L1 loss, also called the mean absolute deviation (MAE). In other words, the initial model can calculate the average of the absolute values of the deviations between the second local feature vector and the corresponding feature values.
  • The smaller the L1 loss, the greater the similarity between the reconstructed second local feature vector and the extracted partial feature values, which also shows that the local encoder has captured the local information well.
  • the fourth similarity degree may also be expressed in other ways, for example, the L2 distance, which is not limited in the embodiment of the present application.
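A minimal sketch of the fourth similarity as an L1 patch reconstruction loss between the reconstructed second local feature vector and the partial feature values extracted from the global feature:

```python
import torch

def patch_reconstruction_loss(second_local_vec, extracted_values):
    """Fourth similarity: mean absolute deviation between the reconstructed
    local feature vector and the originally extracted feature values."""
    return torch.mean(torch.abs(second_local_vec - extracted_values))
```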
  • Correspondingly, the initial model may obtain the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image based on at least one of the first similarity and the second similarity, together with the third similarity and the fourth similarity; that is, the fourth similarity can also be taken into account.
  • Specifically, the initial model can perform a weighted summation of at least one of the first similarity and the second similarity, the third similarity, and the fourth similarity to obtain the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image.
  • The embodiment of the present application does not limit which implementation manner is specifically adopted.
  • the computer device adjusts the model parameters of the initial model based on the similarity, and stops when the target condition is met, to obtain a face model generation model.
  • the model parameters can be adjusted based on the similarity.
  • the above steps 203 to 205 are an iterative process.
  • In each iteration, the computer device can execute steps 206 to 208 and adjust the model parameters of the initial model based on the similarity, until the target condition is met, at which point training of the face model generation model is complete.
  • The target condition may be that the similarity converges, or that the number of iterations reaches a target number; that is, the model parameters are adjusted after each iteration until the similarity converges after a certain iteration, or the iteration count reaches the target number after a certain iteration.
  • The target condition may also be another preset condition. It should be noted that the target condition may be preset by relevant technicians, which is not limited in the embodiment of the present application.
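Putting steps 201 to 208 together, a hedged sketch of the unsupervised training loop; `model` bundling the encoder, decoders, and renderer, the hypothetical `detect_landmarks` helper, and the convergence tolerance are all illustrative assumptions, and the loss sketches above are reused:

```python
def train(model, face_net, batches, optimizer, max_iters=100_000, tol=1e-4):
    prev_loss = float("inf")
    for step, batch in zip(range(max_iters), batches):
        rendered, landmarks = model(batch)            # steps 203-206
        loss = combined_similarity(                   # step 207
            landmark_loss(landmarks, detect_landmarks(batch)),  # hypothetical detector
            photometric_loss(rendered, batch),
            identity_similarity_loss(face_net, rendered, batch),
        )
        optimizer.zero_grad()
        loss.backward()                               # step 208: adjust parameters
        optimizer.step()
        if abs(prev_loss - loss.item()) < tol:        # target condition: convergence
            break
        prev_loss = loss.item()
    return model                                      # the face model generation model
```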
  • the face model generation model can include three modules.
  • The first module is an encoder, which is responsible for encoding the input image into a feature vector (corresponding to step 203 above);
  • The second module is a decoder, which is responsible for decoding the feature vector into 3DMM (3D Morphable Model, a morphable model for the synthesis of 3D faces) parameters, posture parameters, and lighting parameters (corresponding to the acquisition process of the shooting information shown in steps 204 and 206 above);
  • The third module is the face recognition network, which is responsible for judging whether the original image and the rendered image show the same person (corresponding to the third-similarity acquisition process shown in step 207 above).
  • the input picture passes through a global encoder based on the VGG-Face structure to obtain a global feature vector.
  • the local encoder will pay attention to the eye and mouth features of the conv2_2 and conv3_3 layers in VGG-Face, and use them to encode the local feature vector.
  • These local feature vectors of different levels and regions will be connected and sent to the decoder together with the global feature vector.
  • the posture and lighting are global information
  • the posture and lighting parameters are obtained from the global feature vector through a fully connected layer decoding.
  • the 3DMM parameters such as face shape, expression and texture are obtained by decoding the global and local feature vectors together, so that both global information and local details can be preserved.
  • After the above processing, the fitted 3DMM parameters can reconstruct a 3D face model, and the pose and lighting parameters are then used to re-render the 3D face model into a 2D picture.
  • the rendering process is the process of simulating the lighting conditions of the original input picture, the camera angle and the internal parameters to take a picture of the 3D face model.
  • This rendered 2D output image will be compared with the input image, and the network weights of the encoder and decoder will be continuously updated based on the feedback information of these comparison results.
  • In the embodiment of the present application, the initial model is trained on the sample two-dimensional face images to obtain the face model generation model.
  • Because the initial model extracts both the global features and the local features of the sample two-dimensional face image, the face details of the generated three-dimensional face model are more pronounced, and the generation effect of the face model generation model is better.
  • Further, the two-dimensional face image obtained by projecting the three-dimensional face model and the input sample two-dimensional face image may be compared in terms of both low-level information and high-level semantic information to adjust the model parameters.
  • In this way, the generated three-dimensional face model can accurately restore the input original image in both low-level information and high-level semantic information; the degree of restoration is high, and the three-dimensional face model is more realistic.
  • the above embodiment shown in Fig. 2 describes the training process of the face model generation model in detail.
  • After the training, a three-dimensional face model can be obtained through the process of generating a three-dimensional face model based on the trained face model generation model.
  • the process of generating a three-dimensional face model based on the face model generation model will be described in detail below through the embodiment shown in Figure 4.
  • FIG. 4 is a flowchart of a computer application method for generating a three-dimensional face model according to an embodiment of the present application.
  • the computer application method for generating a three-dimensional face model can be applied to a computer device. See FIG. 4, The method can include the following steps:
  • the computer device acquires a two-dimensional face image.
  • the computer device can obtain the two-dimensional face image in a variety of ways. For example, when the user wants to generate a three-dimensional face model, the computer device can capture an image of the user or of another person based on its image capture function to obtain a two-dimensional face image. For another example, the computer device may download the two-dimensional face image from a target address according to a first operation instruction. For another example, the computer device can select an image from locally stored images as the two-dimensional face image according to a second operation instruction.
  • the specific method used for the acquisition process may be determined based on the application scenario, which is not limited in the embodiment of the present application.
  • in one possible implementation, step 401 may also be: when a face model generation instruction is received, the computer device acquires a two-dimensional face image.
  • the face model generation instruction can be triggered by a face model generation operation.
  • when the computer device detects the face model generation operation, it can obtain the face model generation instruction triggered by that operation and execute step 401 according to the instruction.
  • of course, the face model generation instruction may also be sent to the computer device by another computer device, which is not limited in the embodiment of the present application.
  • the computer device calls the face model generation model.
  • the face model generation model is used to extract the global features and local features of the two-dimensional face image, obtain three-dimensional face model parameters based on the global and local features, and generate, based on the three-dimensional face model parameters, the three-dimensional face model corresponding to the two-dimensional face image.
  • the face model generation model can be trained based on the model training process shown in Figure 2 above.
  • the trained face model generation model can be called to generate a three-dimensional face model.
  • the computer device inputs the two-dimensional face image into the face model generation model, and the face model generation model extracts global features and local features of the two-dimensional face image.
  • This step 403 is the same as the above step 203.
  • the face model generation model can process the input two-dimensional face image.
  • the face model generation model may first extract the features of the two-dimensional face image, and generate a three-dimensional face model based on the features of the two-dimensional face image.
  • the face model generation model can obtain global features and local features of the two-dimensional face image, where the global features refer to all features obtained by feature extraction of the two-dimensional face image.
  • the local feature refers to the feature extracted from the local area of the two-dimensional face image.
  • for example, the global features may reflect all areas of the two-dimensional face image,
  • while the local features may reflect a local area of the two-dimensional face image, for example, the facial features of the face in the two-dimensional face image.
  • the local area may be eyes and nose, or eyes and mouth, of course, it may also be other areas, which is not limited in the embodiment of the present application.
  • in this feature extraction process, both global features and local features are taken into account. In this way, while gaining an overall grasp of the two-dimensional face image, the face details can be further optimized, so that the 3D face model obtained by integrating global and local features is better.
  • similarly, the process of extracting the global and local features of the two-dimensional face image in step 403 can be implemented through steps one to three:
  • Step 1: Based on multiple convolutional layers, the computer device can perform feature extraction on the two-dimensional face image to obtain the global features of the two-dimensional face image.
  • Step 2: The computer device obtains the center position of the key points of the two-dimensional face image.
  • Step 3: Based on the center position, the computer device extracts partial features from the features obtained by at least one target convolutional layer among the multiple convolutional layers as the local features of the two-dimensional face image.
  • Steps 1 to 3 are the same as those shown in step 203.
  • in one possible implementation, for each target convolutional layer, the computer device takes the feature map obtained from that layer and, centered on the center position, crops out the feature map of the target size corresponding to that layer as a local feature of the two-dimensional face image.
  • the extraction of the global features can be: the face model generation model in the computer device encodes the two-dimensional face image based on the multiple convolutional layers of the encoder to obtain the global feature vector of the two-dimensional face image.
  • correspondingly, the extraction of the local features can be: the face model generation model in the computer device extracts partial feature values of the global feature vector from the global feature vector obtained by at least one target convolutional layer among the multiple convolutional layers of the encoder, and based on the partial feature values, obtains the first local feature vector of the two-dimensional face image.
  • similarly, a second decoder may also be provided after the local encoder: the face model generation model in the computer device extracts the partial feature values in the global feature vector obtained by the at least one target convolutional layer, and, based on the second decoder, decodes the extracted partial feature values to obtain the first local feature vector of the two-dimensional face image.
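The landmark-centred cropping in steps two and three can be pictured with a short sketch. The helpers below assume 68 detected landmarks and a known downsampling stride per target layer; both names and the absence of border clamping are simplifications for illustration, not details taken from the patent.

```python
import torch

def landmark_center(landmarks, stride):
    """Mean of the 68 detected key points (a (68, 2) tensor of x, y),
    mapped into the frame of a target layer with downsampling `stride`."""
    cx = landmarks[:, 0].mean()
    cy = landmarks[:, 1].mean()
    return int(cy / stride), int(cx / stride)

def crop_local_feature(feature_map, center, size):
    """Cut a size x size window, centred on the key-point centre, out of
    the feature map of one target convolutional layer (e.g. 64 for
    conv2_2 and 32 for conv3_3, matching the sketch above)."""
    cy, cx = center
    half = size // 2
    return feature_map[..., cy - half:cy + half, cx - half:cx + half]
```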
  • the content shown in step 403 is the same as that shown in step 203 above; some details of step 203 that are not repeated in step 403 also apply here. Since step 403 and step 203 are the same, the embodiment of the present application does not repeat them.
  • the face model generation model in the computer device obtains three-dimensional face model parameters based on the global feature and the local feature.
  • This step 404 is the same as the above step 204.
  • the face model generation model can calculate the three-dimensional face model parameters based on global features and local features.
  • similarly, in one possible implementation, the computer device can decode the global feature vector and the first local feature vector based on the first decoder to obtain the three-dimensional face model parameters, which is not repeated here.
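As described for step 204, this "first decoder" can be as small as one fully connected layer over the concatenated vectors. The sketch below makes that concrete; the dimensions (a 512-dimensional global vector, a 256-dimensional local vector, 257 output parameters) are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn

# One fully connected layer standing in for the "first decoder": it maps
# the concatenated global and first local feature vectors to the 3DMM
# parameters. All three dimensions here are assumptions.
first_decoder = nn.Linear(512 + 256, 257)

def decode_3dmm(global_vec, first_local_vec):
    return first_decoder(torch.cat([global_vec, first_local_vec], dim=1))
```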
  • the face model generation model in the computer device outputs a three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters.
  • This step 405 is the same as the above step 205.
  • after the three-dimensional face model parameters are obtained, the face model generation model can compute a three-dimensional face model based on those parameters; that is, the face model generation model generates the three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters and then outputs the generated 3D face model.
  • the generation process may adopt any of the methods shown in step 205, which is not described in detail in the embodiment of the present application.
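One of the options described in step 205 is to add a coefficient-weighted sum of principal components to an average face model. A minimal sketch of that option follows; the array shapes are assumptions, and which quantities (shape, texture, expression) share this form is left open, as in the source.

```python
import numpy as np

def reconstruct_shape(mean_shape, basis, coeffs):
    """Average face plus a coefficient-weighted sum of principal components.
    Assumed shapes: mean_shape (3N,), basis (3N, K), coeffs (K,)."""
    return mean_shape + basis @ coeffs  # stacked x, y, z vertex coordinates
```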
  • steps 403 to 405 constitute the process of inputting the two-dimensional face image into the face model generation model and outputting the three-dimensional face model corresponding to the two-dimensional face image.
  • in this process, both global features and local features are considered, and the three-dimensional face model is obtained by combining the two.
  • the three-dimensional face model obtained in this way shows more distinct facial details; the face details are processed more finely and the degree of restoration is high, so that the three-dimensional face model is more realistic.
  • in one possible implementation, the computer device can also preprocess the two-dimensional face image; for example, it can perform face detection on the two-dimensional face image.
  • when the two-dimensional face image includes multiple faces,
  • the two-dimensional face image can be cropped into multiple face images corresponding to the multiple faces, so that the above step of generating a three-dimensional face model is performed for each face image (a sketch of this wrapper follows below).
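A hedged sketch of that preprocessing wrapper is shown below; the detector and generator are passed in as callables because the source does not commit to a specific face detection technique, and the bounding-box format is assumed.

```python
def generate_models_for_all_faces(image, detect_faces, generate_3d_model):
    """Hypothetical wrapper: detect every face, crop it, and run the face
    model generation model once per crop. Both callables are assumed."""
    models = []
    for (x, y, w, h) in detect_faces(image):  # bounding boxes, assumed format
        face_crop = image[y:y + h, x:x + w]
        models.append(generate_3d_model(face_crop))
    return models
```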
  • in the embodiment of the present application, a face model generation model is used to process a two-dimensional face image and generate a three-dimensional face model.
  • during generation, both global features and local features are extracted, and the two are combined to obtain the three-dimensional face model.
  • the three-dimensional face model obtained in this way shows more distinct facial details; the face details are processed more finely and the degree of restoration is high, so that the three-dimensional face model is more realistic.
  • FIG. 5 is a schematic structural diagram of a face model generation apparatus provided by an embodiment of the present application.
  • the apparatus may include:
  • the obtaining module 501 is used to obtain a two-dimensional face image
  • the calling module 502 is used to call the face model generation model, and the face model generation model is used to extract global features and local features of the two-dimensional face image, and based on the global features and local features, to obtain three-dimensional face model parameters, Generating a three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters;
  • the generation module 503 is used to input the two-dimensional face image into the face model generation model and output the three-dimensional face model corresponding to the two-dimensional face image.
  • the generation module 503 is used to: perform feature extraction on the two-dimensional face image based on multiple convolutional layers to obtain its global features; obtain the center position of the key points of the two-dimensional face image; and, based on the center position, extract partial features from the features obtained by at least one target convolutional layer among the multiple convolutional layers as the local features of the two-dimensional face image.
  • in one possible implementation, the generation module 503 is configured to, for each target convolutional layer, crop from the feature map obtained by that layer, centered on the center position, the feature map of the target size corresponding to that layer as a local feature of the two-dimensional face image.
  • the generation module 503 is further used to: encode the two-dimensional face image based on the multiple convolutional layers of the encoder to obtain the global feature vector of the two-dimensional face image;
  • extract partial feature values of the global feature vector from the global feature vector obtained by at least one target convolutional layer among the multiple convolutional layers of the encoder, and, based on the partial feature values, acquire the first local feature vector of the two-dimensional face image;
  • and decode the global feature vector and the first local feature vector based on the first decoder to obtain the three-dimensional face model parameters.
  • in one possible implementation, the generation module 503 is configured to extract the partial feature values in the global feature vector obtained by the at least one target convolutional layer, and to decode the extracted partial feature values based on the second decoder to obtain the first local feature vector of the two-dimensional face image.
  • the acquiring module 501 is also used to acquire multiple sample two-dimensional face images
  • the calling module 502 is also used to call the initial model, input the multiple sample two-dimensional face images into the initial model, and, for each sample two-dimensional face image, have the initial model extract the global features and local features of the sample two-dimensional face image, obtain three-dimensional face model parameters based on the global and local features, and output, based on the three-dimensional face model parameters, the three-dimensional face model corresponding to the sample two-dimensional face image;
  • the device also includes:
  • the projection module is used to project the three-dimensional face model to obtain the two-dimensional face image corresponding to the three-dimensional face model;
  • the acquiring module 501 is also used to acquire the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the two-dimensional face image of the sample;
  • the adjustment module is used to adjust the model parameters of the initial model based on the similarity, and stop until the target condition is met, to obtain a face model generation model.
  • the projection module is also used to: acquire, based on the global features, the shooting information of the sample two-dimensional face image, where the shooting information is used to indicate at least one of the shooting pose, the illumination, or the shooting background when the sample two-dimensional face image was taken; and, based on the shooting information, project the three-dimensional face model to obtain the two-dimensional face image corresponding to the three-dimensional face model (a projection sketch follows below).
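As one concrete possibility, the projection can be orthographic, as the training description mentions. The sketch below covers only the geometric part; the rotation, translation, and scale stand in for the estimated shooting pose, and the lighting step is left as a comment because the source allows either a spherical-harmonics or a Phong-style model.

```python
import numpy as np

def orthographic_project(vertices, rotation, translation, scale):
    """Geometric part of the re-rendering: rotate the reconstructed mesh by
    the estimated shooting pose, then drop the depth axis. Colouring each
    pixel from normals, texture, and a lighting model (spherical harmonics
    or Phong, per the training description) would follow this step."""
    posed = vertices @ rotation.T + translation  # (N, 3) vertices assumed
    return scale * posed[:, :2]                  # keep x, y -> image plane
```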
  • the acquisition module 501 is also used to: acquire a first similarity based on the positions of the key points of the two-dimensional face image corresponding to the three-dimensional face model and the corresponding key points of the sample two-dimensional face image; and acquire a second similarity based on the pixel values of the pixels of the two-dimensional face image corresponding to the three-dimensional face model and the pixel values of the corresponding pixels of the sample two-dimensional face image;
  • the two-dimensional face image corresponding to the three-dimensional face model is also matched against the sample two-dimensional face image to obtain a third similarity, where the third similarity is used to indicate whether the identity of the face in the two-dimensional face image and the identity of the face in the sample two-dimensional face image are the same;
  • based on at least one of the first similarity and the second similarity, together with the third similarity, the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image is acquired.
  • in one possible implementation, the acquisition module 501 is further configured to perform face recognition on the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image based on the face recognition model to obtain the third similarity.
  • in one possible implementation, the acquisition module 501 is further configured to acquire the first similarity, the second similarity, and the third similarity based, respectively, on the first loss function, the second loss function, and the third loss function, together with the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image.
  • in one possible implementation, the local feature of the sample two-dimensional face image is a first local feature vector,
  • and the first local feature vector is determined based on partial feature values extracted from the global features;
  • correspondingly, the acquisition module 501 is also used to acquire a second local feature vector based on the first local feature vector, where the feature values of the second local feature vector have the same distribution as the partial feature values extracted from the global features;
  • the acquisition module 501 is also used to: acquire a fourth similarity based on the distance between the second local feature vector and the corresponding partial feature values extracted from the global features;
  • and, based on at least one of the first similarity and the second similarity, together with the third similarity and the fourth similarity, acquire the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image.
  • the obtaining module 501 is further configured to obtain the fourth similarity based on the fourth loss function, the second local feature vector, and the corresponding partial feature values extracted from the global feature.
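A minimal sketch of that fourth-similarity term, read as the L1 "patch reconstruction" distance described in the training embodiment, is given below; the function name is an assumption for illustration.

```python
import torch

def patch_reconstruction_loss(second_local_vec, extracted_values):
    """L1 distance (mean absolute error) between the rebuilt second local
    feature vector and the values originally taken from the global features."""
    return torch.abs(second_local_vec - extracted_values).mean()
```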
  • the device provided by the embodiment of the present application processes a two-dimensional face image through a face model generation model to generate a three-dimensional face model.
  • during generation, both global features and local features are extracted, and the two are integrated to obtain the three-dimensional face model.
  • the three-dimensional face model obtained in this way shows more distinct facial details; the face details are processed more finely and the degree of restoration is high, so that the three-dimensional face model is more realistic.
  • it should be noted that when the face model generation apparatus provided in the above embodiment generates a three-dimensional face model, the division into the above functional modules is merely illustrative; in practical applications, the above functions can be assigned to different functional modules as required,
  • that is, the internal structure of the computer device is divided into different functional modules to complete all or part of the functions described above.
  • the face model generation apparatus provided in the above-mentioned embodiment and the face model generation method embodiment belong to the same concept. For the specific implementation process, please refer to the method embodiment, which will not be repeated here.
  • the foregoing computer equipment may be provided as the terminal shown in FIG. 6 below, or may be provided as the server shown in FIG. 7 below, which is not limited in the embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • the terminal 600 can be: a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop computer, or a desktop computer.
  • the terminal 600 may also be called a user equipment, a portable terminal, a laptop terminal, a desktop terminal, or other names.
  • the terminal 600 includes: a processor 601 and a memory 602.
  • the processor 601 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on.
  • the processor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array).
  • the processor 601 may also include a main processor and a coprocessor.
  • the main processor is a processor used to process data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor used to process data in the standby state.
  • in some embodiments, the processor 601 may be integrated with a GPU (Graphics Processing Unit), and the GPU is responsible for rendering and drawing the content that needs to be displayed on the display screen.
  • the processor 601 may further include an AI (Artificial Intelligence, artificial intelligence) processor, and the AI processor is used to process computing operations related to machine learning.
  • the memory 602 may include one or more computer-readable storage media, and the computer-readable storage media may be non-transitory.
  • the memory 602 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices.
  • the non-transitory computer-readable storage medium in the memory 602 is used to store at least one computer-readable instruction, and the at least one computer-readable instruction is to be executed by the processor 601 to implement the computer application method for three-dimensional face model generation or the face model generation model training method provided by the method embodiments of this application.
  • the terminal 600 may optionally further include: a peripheral device interface 603 and at least one peripheral device.
  • the processor 601, the memory 602, and the peripheral device interface 603 may be connected by a bus or signal line.
  • Each peripheral device can be connected to the peripheral device interface 603 through a bus, a signal line or a circuit board.
  • the peripheral device includes: at least one of a radio frequency circuit 604, a display screen 605, a camera 606, an audio circuit 607, a positioning component 608, and a power supply 609.
  • the peripheral device interface 603 may be used to connect at least one peripheral device related to I/O (Input/Output, input/output) to the processor 601 and the memory 602.
  • in some embodiments, the processor 601, the memory 602, and the peripheral device interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral device interface 603 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the radio frequency circuit 604 is used to receive and transmit RF (Radio Frequency, radio frequency) signals, also called electromagnetic signals.
  • the radio frequency circuit 604 communicates with a communication network and other communication devices through electromagnetic signals.
  • the radio frequency circuit 604 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals.
  • the radio frequency circuit 604 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a user identity module card, and so on.
  • the radio frequency circuit 604 can communicate with other terminals through at least one wireless communication protocol.
  • the wireless communication protocol includes but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks.
  • the radio frequency circuit 604 may also include NFC (Near Field Communication, Near Field Communication) related circuits, which is not limited in this application.
  • the display screen 605 is used to display UI (User Interface, user interface).
  • the UI may include graphics, text, icons, videos, and any combination thereof.
  • when the display screen 605 is a touch display screen, the display screen 605 also has the ability to collect touch signals on or above the surface of the display screen 605.
  • the touch signal can be input to the processor 601 as a control signal for processing.
  • the display screen 605 may also be used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards.
  • in some embodiments, there may be one display screen 605, disposed on the front panel of the terminal 600; in other embodiments, there may be at least two display screens 605, disposed on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display screen 605 may be a flexible display screen disposed on a curved or folding surface of the terminal 600. The display screen 605 can even be set in a non-rectangular irregular shape, that is, a special-shaped screen.
  • the display screen 605 can be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
  • the camera assembly 606 is used to capture images or videos.
  • the camera assembly 606 includes a front camera and a rear camera.
  • the front camera is set on the front panel of the terminal, and the rear camera is set on the back of the terminal.
  • the camera assembly 606 may also include a flash.
  • the flash can be a single-color temperature flash or a dual-color temperature flash. Dual color temperature flash refers to a combination of warm light flash and cold light flash, which can be used for light compensation under different color temperatures.
  • the audio circuit 607 may include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals and input them to the processor 601 for processing, or input to the radio frequency circuit 604 to implement voice communication. For the purpose of stereo collection or noise reduction, there may be multiple microphones, which are respectively set in different parts of the terminal 600.
  • the microphone can also be an array microphone or an omnidirectional acquisition microphone.
  • the speaker is used to convert the electrical signal from the processor 601 or the radio frequency circuit 604 into sound waves.
  • the speaker can be a traditional thin-film speaker or a piezoelectric ceramic speaker.
  • when the speaker is a piezoelectric ceramic speaker, it can convert electrical signals not only into sound waves audible to humans, but also into sound waves inaudible to humans for purposes such as distance measurement.
  • the audio circuit 607 may also include a headphone jack.
  • the positioning component 608 is used to locate the current geographic position of the terminal 600 to implement navigation or LBS (Location Based Service).
  • the positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
  • the power supply 609 is used to supply power to various components in the terminal 600.
  • the power source 609 may be alternating current, direct current, disposable batteries or rechargeable batteries.
  • the rechargeable battery may support wired charging or wireless charging.
  • the rechargeable battery can also be used to support fast charging technology.
  • the terminal 600 further includes one or more sensors 610.
  • the one or more sensors 610 include but are not limited to: an acceleration sensor 611, a gyroscope sensor 612, a pressure sensor 613, a fingerprint sensor 614, an optical sensor 615, and a proximity sensor 616.
  • the acceleration sensor 611 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established by the terminal 600.
  • the acceleration sensor 611 can be used to detect the components of gravitational acceleration on three coordinate axes.
  • the processor 601 can control the display screen 605 to display the user interface in a horizontal view or a vertical view according to the gravity acceleration signal collected by the acceleration sensor 611.
  • the acceleration sensor 611 can also be used for the collection of game or user motion data.
  • the gyroscope sensor 612 can detect the body direction and rotation angle of the terminal 600, and the gyroscope sensor 612 can cooperate with the acceleration sensor 611 to collect the user's 3D actions on the terminal 600.
  • the processor 601 can implement the following functions according to the data collected by the gyroscope sensor 612: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
  • the pressure sensor 613 may be disposed on the side frame of the terminal 600 and/or the lower layer of the display screen 605. When the pressure sensor 613 is disposed on the side frame of the terminal 600, it can detect the user's grip signal on the terminal 600, and the processor 601 performs left/right-hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 613.
  • when the pressure sensor 613 is disposed on the lower layer of the display screen 605, the processor 601 controls the operability controls on the UI according to the user's pressure operations on the display screen 605.
  • the operability controls include at least one of button controls, scroll bar controls, icon controls, and menu controls.
  • the fingerprint sensor 614 is used to collect the user's fingerprint.
  • the processor 601 identifies the user's identity based on the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the user's identity based on the collected fingerprint.
  • when the user's identity is identified as a trusted identity, the processor 601 authorizes the user to perform related sensitive operations.
  • the sensitive operations include unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings.
  • the fingerprint sensor 614 can be set on the front, back or side of the terminal 600. When the terminal 600 is provided with a physical button or a manufacturer's logo, the fingerprint sensor 614 may be integrated with the physical button or the manufacturer's Logo.
  • the optical sensor 615 is used to collect the ambient light intensity.
  • the processor 601 may control the display brightness of the display screen 605 according to the ambient light intensity collected by the optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of the display screen 605 is increased; when the ambient light intensity is low, the display brightness of the display screen 605 is decreased.
  • the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
  • the proximity sensor 616, also called a distance sensor, is usually disposed on the front panel of the terminal 600.
  • the proximity sensor 616 is used to collect the distance between the user and the front of the terminal 600.
  • when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually decreases, the processor 601 controls the display screen 605 to switch from the bright-screen state to the off-screen state; when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually increases, the processor 601 controls the display screen 605 to switch from the off-screen state to the bright-screen state.
  • FIG. 7 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the server 700 may vary considerably due to different configurations or performance, and may include one or more processors (central processing units, CPU) 701 and one or more memories 702, where at least one computer-readable instruction is stored in the memory 702, and the at least one computer-readable instruction is loaded and executed by the processor 701 to implement the methods provided by the foregoing method embodiments.
  • of course, in addition to implementing the computer application method for three-dimensional face model generation or the face model generation model training method, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and the server may also include other components for implementing device functions, which are not described here.
  • in an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory including computer-readable instructions, which may be executed by a processor to complete the computer application method for three-dimensional face model generation or the face model generation model training method in the foregoing embodiments.
  • for example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
  • a person of ordinary skill in the art can understand that all or part of the steps of the above embodiments can be completed by hardware, or by computer-readable instructions instructing related hardware; the computer-readable instructions can be stored in a computer-readable storage medium,
  • and the aforementioned storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)

Abstract

A computer application method for three-dimensional face model generation includes: acquiring a two-dimensional face image; calling a face model generation model; and inputting the two-dimensional face image into the face model generation model, extracting global features and local features of the two-dimensional face image through the face model generation model, acquiring three-dimensional face model parameters based on the global features and the local features, and outputting, based on the three-dimensional face model parameters, a three-dimensional face model corresponding to the two-dimensional face image.

Description

用于三维人脸模型生成的计算机应用方法、 装置、 计算机设备及
存储介质 本申请要求于 2019 年 02 月 26 日提交中国专利局, 申请号为 201910140602.X, 申请名称为 “三维人脸模型生成方法、 装置、 计算机设备及存储介质” 的中国专利申请的优先权, 其全部内容通过引用结合在本申请中。
技术领域
本申请涉及计算机技术领域, 特别涉及一种用于三维人脸模型生成的计 算机应用方法、 装置、 计算机设备及存储介质。 背景技术
随着计算机技术的发展, 基于图像生成三维人脸模型的技术已经在很多 领域得到了应用, 例如, 该技术已经广泛应用于人脸识别、 公安、 医疗、 游 戏或影视娱乐等领域。
目前,三维人脸模型生成方法通常是通过提取二维人脸图像的全局特征, 根据该全局特征, 计算得到三维人脸模型参数, 从而可以根据三维人脸模型 参数计算得到三维人脸模型。
上述三维人脸模型生成方法中仅提取了二维人脸图像的全局特征, 基于 全局特征计算得到的三维人脸模型并未关注人脸细节, 并不能很好还原二维 人脸图像中人脸的细节。 发明内容
一种用于三维人脸模型生成的计算机应用方法, 所述方法包括: 获取二维人脸图像;
调用人脸模型生成模型; 及
将所述二维人脸图像输入所述人脸模型生成模型中, 通过所述人脸模型 生成模型提取所述二维人脸图像的全局特征和局部特征, 基于所述全局特征 和局部特征, 获取三维人脸模型参数, 基于所述三维人脸模型参数, 输出所 述二维人脸图像对应的三维人脸模型。 一种用于三维人脸模型生成的计算机应用装置, 所述装置包括: 获取模块, 用于获取二维人脸图像;
调用模块, 用于调用人脸模型生成模型; 及
生成模块, 用于将所述二维人脸图像输入所述人脸模型生成模型中, 通 过所述人脸模型生成模型提取所述二维人脸图像的全局特征和局部特征, 基 于所述全局特征和局部特征, 获取三维人脸模型参数, 基于所述三维人脸模 型参数, 输出所述二维人脸图像对应的三维人脸模型。 一种计算机设备, 包括存储器和一个或多个处理器, 存储器中储存有计 算机可读指令, 计算机可读指令被处理器执行时, 使得一个或多个处理器执 行以下步骤:
获取二维人脸图像;
调用人脸模型生成模型; 及
将所述二维人脸图像输入所述人脸模型生成模型中, 通过所述人脸模型 生成模型提取所述二维人脸图像的全局特征和局部特征, 基于所述全局特征 和局部特征, 获取三维人脸模型参数, 基于所述三维人脸模型参数, 输出所 述二维人脸图像对应的三维人脸模型。 一个或多个存储有计算机可读指令的非易失性计算机可读存储介质, 计 算机可读指令被一个或多个处理器执行时, 使得一个或多个处理器执行以下 步骤:
获取二维人脸图像;
调用人脸模型生成模型; 及
将所述二维人脸图像输入所述人脸模型生成模型中, 通过所述人脸模型 生成模型提取所述二维人脸图像的全局特征和局部特征, 基于所述全局特征 和局部特征, 获取三维人脸模型参数, 基于所述三维人脸模型参数, 输出所 述二维人脸图像对应的三维人脸模型。 附图说明 为了更清楚地说明本申请实施例中的技术方案, 下面将对实施例描述中 所需要使用的附图作简单地介绍, 显而易见地, 下面描述中的附图仅仅是本 申请的一些实施例, 对于本领域普通技术人员来讲, 在不付出创造性劳动的 前提下, 还可以根据这些附图获得其他的附图。
图 1 是本申请实施例提供的一种用于三维人脸模型生成的计算机应用方 法的实施环境;
图 2是本申请实施例提供的一种人脸模型生成模型训练方法的流程图; 图 3是本申请实施例提供的一种人脸模型生成模型的结构示意图; 图 4是本申请实施例提供的一种用于三维人脸模型生成的计算机应用方 法的流程图;
图 5是本申请实施例提供的一种用于三维人脸模型生成的计算机应用装 置的结构示意图;
图 6是本申请实施例提供的一种终端的结构示意图;
图 7是本申请实施例提供的一种服务器的结构示意图。 具体实施方式
为使本申请的目的、 技术方案和优点更加清楚, 下面将结合附图对本申 请实施方式作进一步地详细描述。
图 1 是本申请实施例提供的一种用于三维人脸模型生成的计算机应用方 法的实施环境, 该实施环境可以包括至少一个计算机设备, 参见图 1, 仅以 该实施环境包括多个计算机设备为例进行说明。 其中, 该多个计算机设备可 以通过有线连接方式实现数据交互, 也可以通过无线网络连接方式实现数据 交互, 本申请实施例对此不作限定。
在本申请实施例中, 计算机设备 1 01 可以基于二维人脸图像, 生成三维 人脸模型, 得到该二维人脸图像对应的三维人脸模型, 该过程是三维人脸重 建过程。 在一种可能实现方式中, 该计算机设备 1 01 中可以存储有人脸模型 生成模型, 该计算机设备 1 01 可以基于存储的该人脸模型生成模型对二维人 脸图像进行处理, 实现三维人脸模型生成过程。 在另一种可能实现方式中, 该计算机设备 1 01 也可以在有人脸模型生成需求时调用其它计算机设备上的 人脸模型生成模型进行人脸模型生成的过程, 本申请实施例对此不作限定, 下述均以该计算机设备 1 01 存储有人脸模型生成模型为例进行说明。
在一种可能实现方式中, 该人脸模型生成模型可以在该计算机设备 1 01 上训练得到, 也可以在其他计算机设备上训练得到, 例如, 该其他计算机设 备可以为计算机设备 102。 计算机设备 1 02还可以将该训练好的人脸模型生 成模型封装后发送至该计算机设备 1 01, 从而计算机设备 1 01 可以接收并存 储该训练好的人脸模型生成模型。 本申请实施例对该人脸模型生成模型的训 练设备不作限定。
在一种可能实现方式中, 该计算机设备 101 可以获取二维人脸图像, 并调用存储的人脸模型生成模型对该二维人脸图像进行处理, 得到三维人脸模型。 在另一种可能实现方式中, 该计算机设备 101 可以接收其他计算机设备发送的人脸模型生成请求, 该人脸模型生成请求中携带有二维人脸图像, 该计算机设备 101 可以实现上述三维人脸模型生成步骤, 并将生成的三维人脸模型发送至该其他计算机设备。 本申请实施例对具体采用哪种实现方式不作限定。
具体地, 该计算机设备 1 01 和计算机设备 1 02均可以被提供为终端, 也 可以被提供为服务器, 本申请实施例对此不作限定。 图 2是本申请实施例提供的一种人脸模型生成模型训练方法的流程图, 该人脸模型生成模型训练方法可以应用于计算机设备, 该计算机设备可以为 上述计算机设备 1 01 , 也可以为上述计算机设备 1 02 , 本申请实施例对此不作 限定。 该计算机设备可以为终端, 也可以为服务器, 本申请实施例对此也不 作限定。 参见图 2, 该人脸模型生成模型训练方法可以包括以下步骤:
201、 计算机设备获取多个样本二维人脸图像。
在本申请实施例中, 计算机设备可以基于样本二维人脸图像, 对初始模 型进行训练, 得到人脸模型生成模型, 从而后续可以基于该训练好的人脸模 型生成模型, 生成三维人脸模型。
在一种可能实现方式中, 该多个样本二维人脸图像可以存储于该计算机 设备中, 在该计算机设备需要进行人脸模型生成模型训练时, 可以从该本地 存储中获取该多个样本二维人脸图像。 在另一种可能实现方式中, 该多个样 本二维人脸图像也可以存储于其他计算机设备中, 在该计算机设备需要进行 人脸模型生成模型训练时, 从该其他计算机设备处获取该多个样本二维人脸 图像。 例如, 该计算机设备可以从图像数据库, 获取多个二维人脸图像作为 样本图像。 本申请实施例对该样本二维人脸图像的获取方式不作限定。
需要说明的是, 该人脸模型生成模型的训练过程可以采用无监督学习的 方式, 计算机设备可以获取多个样本二维人脸图像, 基于样本二维人脸图像 即可完成模型训练过程, 该样本二维人脸图像可以无需携带有标签信息。 该 标签信息一般是指人工合成的三维人脸模型参数, 为伪真实数据。
202、 计算机设备调用初始模型, 将该多个样本二维人脸图像输入该初始 模型中。
计算机设备获取到样本二维人脸图像后, 可以基于该样本二维人脸图像 对初始模型进行训练, 因而, 计算机设备可以调用初始模型, 将样本二维人 脸图像输入初始模型中, 该初始模型中的模型参数为初始值, 计算机设备可 以根据该初始模型对样本二维人脸图像进行处理的情况, 对该初始模型的模 型参数进行调整, 以使得调整后的初始模型基于二维人脸图像能够得到与二 维人脸图像更相像、 人脸细节更好的三维人脸模型。
其中, 该初始模型可以存储于该计算机设备中, 也可以存储于其他计算 机设备中, 该计算机设备从其他计算机设备处获取该初始模型, 本申请实施 例对此不作限定。
203、 对于每个样本二维人脸图像, 计算机设备中的初始模型提取该样本 二维人脸图像的全局特征和局部特征。
计算机设备将多个样本二维人脸图像输入初始模型后, 该初始模型可以 对输入的多个样本二维人脸图像进行处理。 具体地, 初始模型可以基于每个 样本二维人脸图像进行三维人脸模型生成步骤, 得到三维人脸模型。
该三维人脸模型生成步骤中, 对于每个样本二维人脸图像, 初始模型可 以先提取样本二维人脸图像的特征, 基于该样本二维人脸图像的特征, 生成 三维人脸模型, 这样得到的三维人脸模型与样本二维人脸图像才可以具有相 同的特征, 二者更相似。
具体地, 该初始模型可以获取该样本二维人脸图像的全局特征和局部特 征, 其中, 该全局特征是指对样本二维人脸图像进行特征提取得到的全部特 征。局部特征是指对样本二维人脸图像的局部区域进行特征提取得到的特征。 例如, 该全局特征可以体现该样本二维人脸图像的全部区域, 局部特征可以 体现该样本二维人脸图像的局部区域, 例如, 该样本二维人脸图像中人脸的 五官。 又例如, 该局部区域可以为眼睛和鼻子, 或眼睛和嘴巴, 当然, 也可 以为其他区域, 本申请实施例对此不作限定。 该特征提取过程中既考虑到了 全局特征, 又考虑到了局部特征, 这样在对样本二维人脸图像有了整体把握 的同时, 还能对人脸细节进行进一步优化, 从而使得计算机设备综合全局特 征和局部特征得到的三维人脸模型效果更好。
下面针对每个样本二维人脸图像的全局特征和局部特征的提取过程进行 详细说明, 具体可以通过下述步骤一至步骤三实现:
步骤一、 计算机设备可以基于多个卷积层, 对该样本二维人脸图像进行 特征提取, 得到该样本二维人脸图像的全局特征。
计算机设备可以通过对样本二维人脸图像进行多次卷积, 来提取该样本二维人脸图像的全局特征。 在一个具体的可能实施例中, 该初始模型可以采用人脸视觉几何组 (Visual Geometry Group-Face, VGG-Face) 网络, 初始模型可以使用该 VGG-Face 网络中的多个卷积层对样本二维人脸图像进行特征提取。 在另一个具体的可能实施例中, 该初始模型还可以采用其他人脸识别网络实现, 例如, 可以采用 FaceNet, 该 FaceNet 是一种人脸识别网络。 本申请实施例对该初始模型具体采用哪种人脸识别网络不作限定。
在一种可能实现方式中, 该步骤一可以通过编码器实现, 该全局特征可 以采用全局特征向量的形式表示。 该步骤一可以为: 计算机设备可以基于编 码器的多个卷积层, 对该样本二维人脸图像进行编码, 得到该样本二维人脸 图像的全局特征向量。
进一步地, 该全局特征向量也可以为特征图的形式, 例如, 可以为矩阵 的形式, 当然, 也可以为其他形式, 例如数组的形式, 本申请实施例对此不 作限定。 例如, 上一个卷积层均可以对样本二维人脸图像进行处理得到一个 特征图, 并将该特征图输入下一个卷积层, 由该下一个卷积层继续对输入的 特征图进行处理, 得到一个特征图。
步骤二、 计算机设备获取该样本二维人脸图像的关键点的中心位置。 在该步骤二中, 计算机设备可以对样本二维人脸图像进行关键点检测, 得到该样本二维人脸图像的关键点的位置, 其中, 该关键点可以是指该人脸 的五官和脸部轮廓等部位。 例如, 人脸可以包括 68个关键点。 计算机设备可 以基于得到的关键点的位置, 获取该关键点的中心位置。 例如, 计算机设备 在得到 68个关键点的位置时, 可以计算该 68个关键点的中心位置。 其中, 该计算机设备进行关键点检测的过程可以采用任一种关键点检测技术实现, 本申请实施例对此不作限定。
步骤三、 计算机设备基于该中心位置, 从该多个卷积层中至少一个目标 卷积层得到的特征中, 提取部分特征作为该样本二维人脸图像的局部特征。
该步骤二和步骤三为计算机设备获取样本二维人脸图像的局部特征的过 程, 在该过程中, 计算机设备先基于人脸的关键点的中心位置, 从样本二维 人脸图像的全部特征中截取部分特征, 可以得到人脸的五官或脸部轮廓部位 的局部特征, 从而在后续生成三维人脸模型时, 基于获取到的局部特征, 可 以使得生成的三维人脸模型中人脸的五官或脸部轮廓处理的更细致。
如上述步骤一中所示, 卷积层可以对样本二维人脸图像进行处理得到一个特征图, 并将该特征图输入下一个卷积层, 由该下一个卷积层继续对输入的特征图进行处理, 得到一个特征图。 计算机设备获取局部特征时, 即可从上述多个卷积层中的某一个或某几个卷积层得到的特征图中截取部分特征。 该某一个或某几个卷积层即为至少一个目标卷积层。 例如, 以该初始模型采用 VGG-Face 网络为例, 该至少一个目标卷积层可以为卷积层 conv2_2 和卷积层 conv3_3, 该至少一个目标卷积层可以由相关技术人员进行设置或调整, 本申请实施例对此不作限定。 这样对不同层次的卷积层得到的特征进行部分特征的提取, 得到的局部特征也包括了人脸的底层信息和高层信息, 局部特征更丰富, 最终体现在三维人脸模型中人脸细节也更细致。
具体地, 不同的目标卷积层可能对应于不同的目标尺寸, 对于每个目标卷积层, 计算机设备从该目标卷积层得到的特征图中, 以该中心位置为中心, 截取该目标卷积层对应的目标尺寸的特征图作为该样本二维人脸图像的局部特征。 例如, 计算机设备以该中心位置为中心, 从 conv2_2 得到的特征图中截取大小为 64x64 的特征图, 从 conv3_3 得到的特征图中截取大小为 32x32 的特征图。 该大小为 64x64 的特征图和该大小为 32x32 的特征图可以体现该样本二维人脸图像中人脸的五官或脸部轮廓等部位对应的特征。 在该局部特征的提取过程中, 可以将该目标卷积层看做局部编码器中的卷积层, 基于局部编码器获取得到局部特征。
在一种可能实现方式中, 步骤一可以通过编码器实现, 全局特征可以采 用全局特征向量的形式表示, 该步骤三也可以从上述编码器中的目标卷积层 中提取局部特征, 该局部特征也可以采用局部特征向量的形式表示。 该步骤 三可以为: 计算机设备从该编码器的多个卷积层中至少一个目标卷积层得到 的全局特征向量中, 提取该全局特征向量的部分特征值, 基于该部分特征值, 获取该二维人脸图像的第一局部特征向量。 相应地, 下述步骤 204 中, 计算 三维人脸模型参数的过程可以基于第一解码器实现, 具体可以参见下述步骤 204, 本申请实施例在此不多做赘述。
在一个具体的可能实施例中, 从该全局特征向量中提取的部分特征值组 成的向量与上述步骤一中经过多个卷积层得到的全局特征向量的形式不同, 后续还需要基于该全局特征向量和局部特征向量计算三维人脸模型参数, 因 而, 在提取到部分特征值后, 计算机设备还可以基于第二解码器, 对该部分 特征值进行进一步处理,使得得到的局部特征向量与全局特征向量形式相同, 更容易融合以计算三维人脸模型参数。
在该实施例中, 计算机设备基于该部分特征值, 获取该二维人脸图像的 第一局部特征向量的过程可以基于第二解码器实现, 则上述获取局部特征的 过程可以为: 计算机设备提取该至少一个目标卷积层得到的全局特征向量中 的部分特征值; 计算机设备基于第二解码器, 对提取到的部分特征值进行解 码, 得到该二维人脸图像的第一局部特征向量。 该第一局部特征向量即用于 与全局特征向量结合以获取三维人脸模型参数。
在一种可能实现方式中, 人脸可以包括多个部位, 例如, 眼睛、 鼻子、 嘴巴等, 计算机设备在提取该全局特征中的部分特征时, 可以获取到多个区 域对应的部分特征, 然后可以将该多个区域对应的部分特征进行整合, 得到 该样本二维人脸图像的局部特征。 对不同的目标卷积层得到的部分特征中还 可以均包括该多个区域对应的部分特征, 如果对多个目标卷积层进行了部分 特征提取,计算机设备也需要对该多个目标卷积层对应的部分特征进行整合。
在该实现方式中, 对于每个目标卷积层, 计算机设备可以提取该目标卷 积层中全局特征向量中多个区域对应的部分特征值, 基于第二解码器, 对提 取到的部分特征值进行解码, 得到多个区域的第一局部特征向量, 每个区域 对应于人脸的一个器官部位; 计算机设备可以对该多个区域的第一局部特征 向量进行拼接, 得到该二维人脸图像的第一局部特征向量。
例如, 以目标卷积层为 con2_2和 con3_3, 多个区域为左眼、 右眼和嘴 巴为例, 计算机设备在 con2_2 中可以获取到左眼、 右眼和嘴巴的特征, 在 con3_3中也可以获取到左眼、 右眼和嘴巴的特征, 也即是, 计算机设备获取 到了多个层次中多个区域的局部特征。 在一个具体示例中, 每个层次每个区 域的局部特征均可以对应一个第二解码器, 计算机设备可以对每个层次每个 区域提取到的局部特征进行解码, 得到每个层次每个区域对应的第一局部特 征向量。 计算机设备可以将该多个区域对应的多个第一局部特征向量拼接在 一起, 得到该二维人脸图像的第一局部特征向量。
204、 计算机设备中的初始模型基于该全局特征和局部特征, 获取三维人 脸模型参数。
对于每个样本二维人脸图像, 初始模型在获取到全局特征和局部特征后, 可以综合二者, 计算三维人脸模型参数, 在一种可能实现方式中, 该三维人脸模型参数可以为三维可变人脸模型 (three dimensional morphable model, 3DMM) 的参数。
在一种可能实现方式中, 初始模型在得到全局特征和局部特征时基于编 码器实现, 在该步骤 204 中, 初始模型可以基于第一解码器对全局特征和局 部特征进行解码, 得到三维人脸模型参数。 在上述步骤 203 中可以得知, 该 全局特征可以为编码得到的全局特征向量, 该局部特征可以为编码并解码得 到的第一局部特征向量, 则该步骤 204中, 计算机设备可以基于第一解码器, 对该全局特征向量和该第一局部特征向量进行解码,得到三维人脸模型参数。
在一个具体的可能实施例中, 该第一解码器中可以包括一层全连接层, 计算机设备可以基于该全连接层, 对该全局特征向量和该第一局部特征向量 进行计算, 得到三维人脸模型参数。
需要说明的是, 该三维人脸模型参数可以包括人脸的纹理信息、 表情信 息、 形状信息等人脸信息。 当然, 该三维人脸模型参数还可以包括其他人脸 信息, 例如, 姿态信息等, 本申请实施例对此不作限定。 通过该三维人脸模 型参数即可以获知该人脸在纹理、 表情、 形状上的情况, 因而, 计算机设备 可以下述步骤 205, 对该三维人脸模型参数进行处理, 得到三维人脸模型。
205、 计算机设备中的初始模型基于该三维人脸模型参数, 输出该样本二 维人脸图像对应的三维人脸模型。
该三维人脸模型参数即包括了多种人脸信息, 初始模型可以根据该多种 人脸信息生成三维人脸模型, 使得生成的三维人脸模型的人脸信息与该三维 人脸模型参数所指示的人脸信息相同。 例如, 该三维人脸模型的纹理信息应 该与三维人脸模型参数所包括的纹理信息相同, 三维人脸模型中人脸的表情 应该与三维人脸模型参数所包括的表情信息对应的表情相同,人脸形状同理, 在此不多做赘述。
在一种可能实现方式中, 该三维人脸模型可以为多个脸部模型的组合, 计算机设备可以根据该三维人脸模型参数和多个初始脸部模型, 得到三维人 脸模型。 具体地, 该计算机设备可以根据该三维人脸模型参数, 计算得到该 多个脸部模型的系数, 从而对该多个初始脸部模型和对应的系数进行计算, 得到多个脸部模型, 将该多个脸部模型拼接可以得到该三维人脸模型。
在另一种可能实现方式中, 该三维人脸模型可以基于平均人脸模型和该 三维人脸模型参数确定, 计算机设备可以基于该三维人脸模型参数, 获取多 个主成分部分的系数, 然后将系数作为该主成分部分的权重, 对该多个主成 分部分进行加权求和, 从而在平均人脸模型与该加权求和结果进行求和, 得 到最终的三维人脸模型。 其中, 每个主成分部分可以只是人脸形状, 也可以 是指纹理等, 本申请实施例对此不作限定。 基于该三维人脸模型参数生成三 维人脸模型的过程还可以通过其他方式实现, 本申请实施例对具体采用哪种 实现方式不作限定。
需要说明的是, 上述步骤 202至步骤 205为调用初始模型, 将该多个样 本二维人脸图像输入该初始模型中, 对于每个样本二维人脸图像, 由该初始 模型提取该样本二维人脸图像的全局特征和局部特征; 基于该全局特征和局 部特征, 获取三维人脸模型参数, 基于该三维人脸模型参数, 输出该样本二 维人脸图像对应的三维人脸模型的过程, 初始模型对每个样本二维人脸图像 进行处理, 可以得到对应的三维人脸模型。
206、 计算机设备中的初始模型对该三维人脸模型进行投影, 得到该三维 人脸模型对应的二维人脸图像。 初始模型生成三维人脸模型后, 可以确定该三维人脸模型与输入的样本 二维人脸图像的相似度, 从而确定本次人脸模型生成的效果是好还是坏, 以 衡量该初始模型的人脸模型生成功能, 在该初始模型的人脸模型生成功能不 好时, 可以对初始模型的模型参数进行调整, 一直到调整后的初始模型的人 脸模型生成功能满足条件时可以停止调整, 也即完成了模型训练过程。
初始模型在确定三维人脸模型与样本二维人脸图像的相似度时, 可以将 该三维人脸模型渲染为二维人脸图像, 再去比较渲染得到的二维人脸图像与 输入的样本二维人脸图像的相似度。 其中, 该渲染过程可以为: 初始模型基 于该全局特征, 获取该样本二维人脸图像的拍摄信息, 该拍摄信息用于指示 拍摄该样本二维人脸图像时的拍摄姿势、 光照或拍摄背景中至少一种; 初始 模型基于该拍摄信息, 对该三维人脸模型进行投影, 得到该三维人脸模型对 应的二维人脸图像。
其中, 该拍摄信息体现出来的内容是需要从该样本二维人脸图像的整体 得出的, 因而初始模型可以基于全局特征进行拍摄信息获取步骤。 计算机设 备在获取到全局特征后, 可以根据该全局特征分析得到该样本二维人脸图像 的拍摄信息, 也即是, 可以获知拍摄者拍摄该样本二维人脸图像的姿势, 或 者可以获知该样本二维人脸图像是在什么样的光照条件下拍摄得到的, 或者 可以获知该样本二维人脸图像的拍摄背景是什么样的。 这样在投影时基于同 样的拍摄姿势、 同样的光照情况或同样的拍摄背景下对三维人脸模型进行投 影, 则可以提高投影得到的二维人脸图像和输入的样本二维人脸图像之间的 可比性, 也可以使得获取到的相似度更准确。
具体地, 计算机设备可以采用正交投影的方式对三维人脸模型进行投影, 也可以采用透视投影的方式对三维人脸模型进行投影, 当然, 还可以采用其他投影方式, 本申请实施例对采用的投影方式不作限定。 例如, 以采用正交投影为例, 该投影过程可以为: 计算机设备按照该拍摄信息, 将该三维人脸模型人脸按照拍摄信息中的拍摄姿势进行旋转, 然后计算机设备采用正交投影, 把三维人脸模型投影到二维, 并根据三维人脸模型的法向量、 纹理信息和光照模型计算得到二维人脸图像中每个像素点的像素值, 具体地, 该像素值可以为红绿蓝色彩模式 (Red Green Blue, RGB) 的值。 其中, 该光照模型可以采用球谐光照模型, 也可以采用 Phong 反射模型 (Phong reflection model), 当然, 还可以采用其他光照模型, 本申请实施例对此不作限定。
207、计算机设备中的初始模型获取该三维人脸模型对应的二维人脸图像 和该样本二维人脸图像的相似度。
计算机设备投影得到二维人脸图像后, 可以对比该二维人脸图像和输入 的样本二维人脸图像, 以确定初始模型对样本二维人脸图像进行处理后得到 的三维人脸图像是否能够还原该样本二维人脸图像中人脸的特征。
计算机设备可以从多个角度对比两个人脸图像, 以得到该两个人脸图像 在多个角度上的相似度。 具体地, 可以既关注人脸底层的信息, 例如, 人脸 的形状、 表情、 纹理等, 也关注人脸高层的语义信息, 例如, 两个图像中人 脸的身份是否一致。 在一种可能实现方式中, 该步骤 207 中初始模型获取相 似度的过程可以通过下述步骤一至步骤四实现:
步骤一、 计算机设备中的初始模型基于该三维人脸模型对应的二维人脸 图像的关键点与该样本二维人脸图像对应的关键点的位置,获取第一相似度。
在该步骤一中, 初始模型可以关注图像的底层信息, 初始模型可以确定两个图像中人脸的关键点位置是否一致, 以此来判断两个图像的相似度。 在一种可能实现方式中, 该第一相似度可以基于第一损失函数确定, 该初始模型可以基于第一损失函数、 该三维人脸模型对应的二维人脸图像和该样本二维人脸图像, 获取第一相似度。 例如, 该第一损失函数可以为关键点损失 (Landmark Loss) 函数。 当然, 该第一损失函数还可以为其他损失函数, 在此仅为一种示例说明, 本申请实施例对此不作限定。
在一个具体的可能实施例中, 该第一相似度可以采用 L2 距离的表达方式, 也即是该第一相似度可以为 L2 损失, 该 L2 损失又称均方误差 (Mean Squared Error, MSE), 也即是, 初始模型可以计算两个图像的关键点的位置之间的差值, 并计算差值的平方值的期望值。 该 L2 损失越小, 则说明两个图像的关键点位置的相似度越大, 两个图像的关键点越一致。 当然, 上述仅为一种示例性说明, 该第一相似度还可以采用其他表达方式, 例如, L1 距离, 本申请实施例对此不作限定。
步骤二、 计算机设备中的初始模型基于该三维人脸模型对应的二维人脸 图像的像素点的像素值与该样本二维人脸图像对应像素点的像素值, 获取第 二相似度。 在该步骤二中, 初始模型可以关注图像的底层信息, 初始模型可以确定 两个图像中的像素点的像素值的差异, 如果相差很大, 则两个图像的相似度 较低, 如果相差很小, 则两个图像的相似度较高。
在一种可能实现方式中, 该第二相似度可以基于第二损失函数确定, 该初始模型可以基于第二损失函数、 该三维人脸模型对应的二维人脸图像和该样本二维人脸图像, 获取第二相似度。 例如, 该第二损失函数可以为光度损失 (Photometric Loss) 函数。 当然, 该第二损失函数还可以为其他损失函数, 在此仅为一种示例说明, 本申请实施例对此不作限定。
在一个具体的可能实施例中, 该第二相似度可以采用 L21 距离的表达方式, 也即是, 初始模型可以计算两个图像对应像素点的像素值之间的 L21 距离。 当然, 该第二相似度也可以采用其他表达方式, 例如, L2 距离, 或 L1 距离等, 本申请实施例对此不作限定。
步骤三、 计算机设备中的初始模型对该三维人脸模型对应的二维人脸图 像和该样本二维人脸图像进行匹配, 得到第三相似度, 该第三相似度用于指 示该二维人脸图像中人脸的身份和该样本二维人脸图像中人脸的身份是否相 同。
在该步骤三中, 初始模型可以关注两个图像的高层语义信息, 初始模型 可以确定两个图像中人脸的身份是否一致, 并以此作为该初始模型的人脸重 建的准确性, 这样可以保证生成三维人脸模型后, 生成的人脸与输入的二维 人脸图像中人脸的身份一致, 也即是, 通过对两个图像进行人脸识别, 均可 以正确识别该人脸的身份, 且并未因人脸重建过程导致无法识别出该用户的 身份。
在一种可能实现方式中, 该第三相似度可以基于人脸识别模型确定, 也 即是, 该第三相似度基于人脸识别模型对该三维人脸模型对应的二维人脸图 像和该样本二维人脸图像进行人脸识别得到。 在该实现方式中, 初始模型可 以基于人脸识别模型, 对该三维人脸模型对应的二维人脸图像和该样本二维 人脸图像进行人脸识别, 得到第三相似度。
具体地, 初始模型可以调用人脸识别模型, 将该三维人脸模型对应的二 维人脸图像和该样本二维人脸图像输入该人脸识别模型, 由该人脸识别模型 对该三维人脸模型对应的二维人脸图像和该样本二维人脸图像进行人脸识 别, 输出第三相似度。 其中, 该人脸识别模型可以为训练好的模型, 初始模 型可以使用该人脸识别模型识别图像中人脸的身份。
在一个具体的可能实施例中, 该人脸识别模型获取第三相似度的过程可以基于第三损失函数实现, 也即是, 该初始模型可以基于第三损失函数、 该三维人脸模型对应的二维人脸图像和该样本二维人脸图像, 获取第三相似度。 例如, 该第三损失函数可以为感知损失 (Perceptual Loss) 函数。 当然, 该第三损失函数还可以为其他损失函数, 在此仅为一种示例说明, 本申请实施例对此不作限定。
例如, 该人脸识别模型可以为 VGG-Face 网络, 将该三维人脸模型对应的二维人脸图像和该样本二维人脸图像输入该 VGG-Face 网络中, 该 VGG-Face 网络中的多个卷积层可以分别对该二维人脸图像和该样本二维人脸图像进行特征提取, 得到两个特征向量, 进而可以计算该两个特征向量的欧氏距离, 将该欧氏距离作为第三相似度。 其中, 该 VGG-Face 网络中卷积层进行多次特征提取, FC7 层可以输出该两个特征向量。
需要说明的是,使用 VGG-Face网络作为人脸识别模型, 由于该 VGG-Face 网络对光照不敏感, 可以使得光照颜色和肤色分离, 从而学习到更加自然的 肤色和更加真实的光照。 且形状上通过光影变化和人脸识别信息比对, 可以 让生成的三维人脸模型的面部结构与输入的二维人脸图像更加相似。 综合这 两点, 本申请提供的方法对不同分辨率、 不同光照条件不同背景下的二维人 脸图像都比较鲁棒。
进一步地, 上述方法中获取单张图片比较容易, 这也使得该方法更具有 可推广性。 在一种可能实现方式中, 该方法中, 计算机设备还可以对二维人 脸图像进行预处理, 例如, 可以对二维人脸图像进行人脸检测, 当该二维人 脸图像中包括多个人脸时, 可以将该二维人脸图像裁剪为多个人脸对应的多 个人脸图像, 从而针对每个人脸图像, 执行上述生成三维人脸模型的步骤。
在一种可能实现方式中, 上述步骤一至步骤三中, 初始模型可以分别基 于第一损失函数、 第二损失函数、 第三损失函数, 以及该三维人脸模型对应 的二维人脸图像和该样本二维人脸图像, 获取第一相似度、 第二相似度以及 第三相似度。 上述内容已示出, 本申请实施例在此不多做赘述。
需要说明的是, 对于该步骤一至步骤三, 计算机设备可以无需全部执行 该步骤一至步骤三, 可以根据步骤四中的设置, 初始模型需要基于哪几个角 度的相似度, 确定两个图像的相似度, 则执行上述步骤一至步骤三中的相应 的步骤即可。 且该步骤一至步骤三的执行顺序可以任意, 也即是, 该步骤一 至步骤三可以按照任意顺序进行排列, 也可以由计算机设备同时执行该步骤 一至步骤三, 本申请实施例对该步骤一至步骤三的执行顺序不作限定。
步骤四、 计算机设备中的初始模型基于该第一相似度和该第二相似度中 至少一种相似度, 以及该第三相似度, 获取该三维人脸模型对应的二维人脸 图像和该样本二维人脸图像的相似度。
初始模型在获取到多个角度的相似度时, 可以综合考虑该多个相似度, 获取两个图像的相似度。 具体地, 该步骤四中可以包括三种情况:
在情况一中, 计算机设备中的初始模型基于该第一相似度和该第三相似 度, 获取该三维人脸模型对应的二维人脸图像和该样本二维人脸图像的相似 度。 具体地, 计算机设备中的初始模型可以对所述第一相似度和该第三相似 度进行加权求和, 得到该三维人脸模型对应的二维人脸图像和该样本二维人 脸图像的相似度。 本申请实施例对多个相似度的权重不作限定。
在情况二中, 计算机设备中的初始模型基于该第二相似度和该第三相似 度, 获取该三维人脸模型对应的二维人脸图像和该样本二维人脸图像的相似 度。 具体地, 计算机设备中的初始模型可以对该第二相似度和该第三相似度 进行加权求和, 得到该三维人脸模型对应的二维人脸图像和该样本二维人脸 图像的相似度。 本申请实施例对多个相似度的权重不作限定。
在情况三中, 计算机设备中的初始模型基于该第一相似度、 该第二相似 度和该第三相似度, 获取该三维人脸模型对应的二维人脸图像和该样本二维 人脸图像的相似度。 具体地, 计算机设备中的初始模型可以对所述第一相似 度、 该第二相似度和该第三相似度进行加权求和, 得到该三维人脸模型对应 的二维人脸图像和该样本二维人脸图像的相似度。 本申请实施例对多个相似 度的权重不作限定。
在上述三种情况中, 初始模型既考虑到了图像的底层信息, 也考虑到了 图像高层的语义信息, 这样对该两个图像的分析更全面, 更准确, 从而可以 保证生成的三维人脸模型可以准确还原输入的二维人脸图像的底层和高层信 息, 还原度高, 与原输入图像更相似, 更真实。 上述仅提供了三种情况, 该初始模型还可以考虑其他角度获取两个图像 的相似度, 在一种可能实现方式中, 上述步骤 203 中, 该初始模型还可以在 得到第一局部特征向量后, 还可以基于该第一局部特征向量进行重建输入的 局部特征, 来对比重建的局部特征与直接从全局特征中提取到的局部特征是 否一致, 获取第四相似度, 以该第四相似度来训练局部编码器更好的抓住底 层局部信息, 使得人脸细节体现的更明显。
具体地, 该样本二维人脸图像的局部特征为第一局部特征向量, 该第一 局部特征向量基于从全局特征中提取到的部分特征值确定。 初始模型可以基 于该第一局部特征向量, 获取第二局部特征向量, 该第二局部特征向量的特 征值和该从全局特征中提取到的部分特征值的分布情况相同。 其中, 该第二 局部特征向量即为重建得到的局部特征向量。 初始模型可以基于该第二局部 特征向量和从该全局特征中提取到的对应的部分特征值之间的距离, 获取第 四相似度。
在一个具体的可能实施例中, 该第四相似度可以基于第四损失函数确定, 初始模型可以基于第四损失函数、 该第二局部特征向量和从该全局特征中提取到的对应的部分特征值, 获取第四相似度。 例如, 该第四损失函数可以为 Patch Reconstruction Loss (图像斑块重建损失) 函数。 当然, 该第四损失函数还可以为其他损失函数, 本申请实施例对此不作限定。
其中, 上述第二局部特征向量和从全局特征中提取到的部分特征值之间的距离可以为 L1 距离, 也即是, 第四相似度可以采用 L1 距离的表达方式, 也即是该第四相似度可以为 L1 损失, 该 L1 损失又称平均绝对误差 (Mean Absolute Deviation, MAE), 也即是, 初始模型可以计算第二局部特征向量和对应的特征值之间的偏差的绝对值的平均值。 该 L1 损失越小, 则说明重建的第二局部特征向量和提取到的部分特征值之间的相似度越大, 也说明局部编码器更好地抓住了局部信息。 当然, 上述仅为一种示例性说明, 该第四相似度还可以采用其他表达方式, 例如, L2 距离, 本申请实施例对此不作限定。
相应地, 上述步骤四中, 初始模型可以基于该第一相似度和该第二相似 度中至少一种相似度、 该第三相似度和该第四相似度, 获取该三维人脸模型 对应的二维人脸图像和该样本二维人脸图像的相似度。 也即是, 在上述步骤 四中的三种情况中, 初始模型还可以考虑第四相似度, 具体可以对第一相似 度和该第二相似度中至少一种相似度、 第三相似度和第四相似度进行加权求 和, 得到该三维人脸模型对应的二维人脸图像和该样本二维人脸图像的相似 度, 本申请实施例对具体采用哪种实现方式不作限定。
208、 计算机设备基于该相似度, 对该初始模型的模型参数进行调整, 直 至符合目标条件时停止, 得到人脸模型生成模型。
得到两个图像之间的相似度后, 则可以基于相似度对模型参数进行调整, 上述步骤 203至步骤 205为一次迭代过程, 在每次迭代过程后, 该计算机设 备可以执行步骤 206至步骤 208, 基于相似度对训练的初始模型的模型参数 进行调整, 直到符合目标条件时, 人脸模型生成模型训练完成。
其中, 该目标条件可以为相似度收敛, 也可以为迭代次数达到目标次数, 也即是上述每次迭代过程后对模型参数进行调整, 直到某次迭代后相似度收 敛, 或者某次迭代后迭代次数达到目标次数时, 人脸模型生成模型训练完成。 当然, 该目标条件还可以为其他预设条件, 需要说明的是, 该目标条件可以 由相关技术人员预先设置, 本申请实施例对此不作限定。
下面通过一个具体示例对上述人脸模型生成模型的训练过程进行说明, 参见图 3, 该人脸模型生成模型中可以包括三个模块, 第一个模块是编码器 (encoder), 负责把输入图片编码成特征向量 (对应于上述步骤 203); 第二个模块是解码器 (decoder), 负责把特征向量解码成 3DMM (Morphable Model For The Synthesis Of 3D Faces, 3D人脸形变模型)、 姿势以及光照参数 (对应于上述步骤 204 和步骤 206 中所示的拍摄信息的获取过程); 第三个模块是人脸识别网络, 负责判断原图和渲染图是否为同一个人 (对应于上述步骤 207 中所示的第三相似度的获取过程)。
输入图片通过基于 VGG-Face结构的全局编码器, 得到全局特征向量。 随 后,局部编码器会关注 VGG-Face中 conv2_2和 conv3_3层的眼睛和嘴巴的特 征, 并利用它们编码出局部特征向量。 这些不同层次和不同区域的局部特征 向量会接起来, 跟全局特征向量一起送到解码器。 由于姿势和光照是全局信 息, 所以由全局特征向量通过一层全连接层解码得到姿势和光照参数。 而脸 部形状、 表情和纹理等 3DMM参数则由全局和局部特征向量共同解码得到, 这 样既可以保留全局信息, 也可以保留局部细节。 然后, 拟合的 3DMM参数可以 重建一个 3D人脸模型, 再利用姿势和光照参数将 3D人脸模型重新渲染成一 张 2D图片,该渲染过程是模拟原始输入图片的光照条件和相机拍照角度以及 内参对 3D人脸模型进行拍照的过程。 这张渲染的 2D输出图片, 会跟输入图 片做比较, 并通过这些比较结果的反馈信息, 不断地更新编码器和解码器的 网络权重。
本申请实施例通过样本二维人脸图像对初始模型进行训练, 得到人脸模 型生成模型, 在训练过程中, 初始模型提取了样本二维人脸图像的全局特征 和局部特征, 综合二者生成的三维人脸模型的人脸细节体现的更明显, 人脸 模型生成模型的生成效果更好。
进一步地, 本申请实施例中还可以根据三维人脸模型投影得到的二维人 脸图像与输入的样本二维人脸图像进行了底层信息和高层的语义信息的对 比, 以此来调整模型参数, 使得生成的三维人脸模型在底层信息和高层的语 音信息上均能准确还原输入的原始图像, 还原度高, 三维人脸模型更真实。 上述图 2 所示实施例中对人脸模型生成模型的训练过程进行了详细说 明, 在计算机设备需要生成三维人脸模型时即可基于上述训练好的人脸模型 生成模型生成三维人脸模型过程, 得到三维人脸模型。 下面通过图 4所示实 施例对基于人脸模型生成模型生成三维人脸模型过程进行详细说明。
图 4是本申请实施例提供的一种用于三维人脸模型生成的计算机应用方 法的流程图, 该用于三维人脸模型生成的计算机应用方法可以应用于计算机 设备上, 参见图 4, 该方法可以包括以下步骤:
401、 计算机设备获取二维人脸图像。
计算机设备可以通过多种方式获取该二维人脸图像, 例如, 在用户想要 生成三维人脸模型时, 可以基于该计算机设备的图像采集功能, 对自己或其 他人进行图像采集, 得到二维人脸图像。 又例如, 该计算机设备可以根据第 一操作指令, 从目标地址下载该二维人脸图像。 又例如, 该计算机设备可以 根据第二操作指令,从本地存储的图像中选择一个图像作为该二维人脸图像。 具体该获取过程采用哪种方式可以基于应用场景确定, 本申请实施例对此不 作限定。
在一种可能实现方式中, 该步骤 401 还可以为: 当接收到人脸模型生成 指令时, 计算机设备获取二维人脸图像。 该人脸模型生成指令可以由人脸模 型生成操作触发, 在计算机设备检测到人脸模型生成操作时, 可以获取该人 脸模型生成操作触发的人脸模型生成指令, 并根据该人脸模型生成指令, 执 行该步骤 401。 当然, 该人脸模型生成指令还可以为其他计算机设备发送至 该计算机设备, 本申请实施例对此不作限定。
402、 计算机设备调用人脸模型生成模型。
其中, 该人脸模型生成模型用于提取该二维人脸图像的全局特征和局部 特征, 基于该全局特征和局部特征, 获取三维人脸模型参数, 基于该三维人 脸模型参数, 生成该二维人脸图像对应的三维人脸模型。
该人脸模型生成模型可以基于上述图 2所示的模型训练过程训练得到。 在计算机设备有人脸模型生成需求时, 可以调用该训练好的人脸模型生成模 型生成三维人脸模型。
403、 计算机设备将该二维人脸图像输入该人脸模型生成模型中, 由该人 脸模型生成模型提取该二维人脸图像的全局特征和局部特征。
该步骤 403与上述步骤 203同理, 计算机设备将二维人脸图像输入人脸 模型生成模型后,该人脸模型生成模型可以对输入的二维人脸图像进行处理。 该三维人脸模型生成步骤中, 人脸模型生成模型可以先提取二维人脸图像的 特征, 基于该二维人脸图像的特征生成三维人脸模型。
具体地, 该人脸模型生成模型可以获取该二维人脸图像的全局特征和局 部特征, 其中, 该全局特征是指对二维人脸图像进行特征提取得到的全部特 征。 局部特征是指对二维人脸图像的局部区域进行特征提取得到的特征。 例 如, 该全局特征可以体现该二维人脸图像的全部区域, 局部特征可以体现该 二维人脸图像的局部区域, 例如, 该二维人脸图像中人脸的五官。 又例如, 该局部区域可以为眼睛和鼻子, 或眼睛和嘴巴, 当然, 也可以为其他区域, 本申请实施例对此不作限定。 该特征提取过程中既考虑到了全局特征, 又考 虑到了局部特征, 这样在对二维人脸图像有了整体把握的同时, 还能对人脸 细节进行进一步优化, 从而综合全局特征和局部特征得到的三维人脸模型效 果更好。
同理地, 该步骤 403 中二维人脸图像的全局特征和局部特征的提取过程 也可以通过步骤一至步骤三实现:
步骤一、 计算机设备可以基于多个卷积层, 对该二维人脸图像进行特征 提取, 得到该二维人脸图像的全局特征。
步骤二、 计算机设备获取该二维人脸图像的关键点的中心位置。
步骤三、 计算机设备基于该中心位置, 从该多个卷积层中至少一个目标 卷积层得到的特征中, 提取部分特征作为该二维人脸图像的局部特征。
该步骤一至步骤三均与上述步骤 203 中所示内容同理, 在一种可能实现 方式中, 对于每个目标卷积层, 计算机设备从该目标卷积层得到的特征图中, 以该中心位置为中心, 截取该目标卷积层对应的目标尺寸的特征图作为该二 维人脸图像的局部特征。
与步骤 203 中所示内容同理地, 全局特征的提取过程可以为: 计算机设 备中的人脸模型生成模型基于编码器的多个卷积层, 对该二维人脸图像进行 编码, 得到该二维人脸图像的全局特征向量。 相应地, 局部特征的提取过程 可以为: 计算机设备中的人脸模型生成模型从该编码器的多个卷积层中至少 一个目标卷积层得到的全局特征向量中,提取该全局特征向量的部分特征值, 基于该部分特征值, 获取该二维人脸图像的第一局部特征向量。
同理地, 在局部编码器后也可以设置有第二解码器, 计算机设备中的人 脸模型生成模型可以提取该至少一个目标卷积层得到的全局特征向量中的部 分特征值; 基于第二解码器, 对提取到的部分特征值进行解码, 得到该二维 人脸图像的第一局部特征向量。
需要说明的是, 该步骤 403中所示内容均与上述步骤 203 中所示内容同 理, 该步骤 203还有一些内容在该步骤 403中并未示出, 但均可以应用于步 骤 403中, 由于该步骤 403和步骤 203同理, 本申请实施例在此不多做赘述。
404、 计算机设备中的人脸模型生成模型基于该全局特征和局部特征, 获 取三维人脸模型参数。
该步骤 404与上述步骤 204同理, 人脸模型生成模型可以基于全局特征 和局部特征, 计算得到三维人脸模型参数, 同理地, 在一种可能实现方式中, 计算机设备可以基于第一解码器, 对该全局特征向量和该第一局部特征向量 进行解码, 得到三维人脸模型参数。 本申请实施例在此不多做赘述。
405、 计算机设备中的人脸模型生成模型基于该三维人脸模型参数, 输出 该二维人脸图像对应的三维人脸模型。
该步骤 405与上述步骤 205同理, 得到三维人脸模型参数后, 人脸模型 生成模型还可以基于该三维人脸模型参数计算得到三维人脸模型, 也即是, 人脸模型生成模型基于该三维人脸模型参数生成二维人脸图像对应的三维人 脸模型, 从而输出该生成的三维人脸模型。 同理地, 该生成过程可以采用步 骤 205中所示的任一种方式, 本申请实施例在此不多赘述。
需要说明的是, 该步骤 403至步骤 405为将该二维人脸图像输入该人脸 模型生成模型中, 输出该二维人脸图像对应的三维人脸模型的过程, 在该过 程中既关注了全局特征, 又关注了局部特征, 从而综合二者获取三维人脸模 型, 这样得到的三维人脸模型相比于只根据局部特征得到的三维人脸模型, 人脸细节体现的更明显, 人脸细节处理的更精细, 还原度高, 从而三维人脸 模型更真实。
在一种可能实现方式中, 该方法中, 计算机设备还可以对二维人脸图像 进行预处理, 例如, 可以对二维人脸图像进行人脸检测, 当该二维人脸图像 中包括多个人脸时, 可以将该二维人脸图像裁剪为多个人脸对应的多个人脸 图像, 从而针对每个人脸图像, 执行上述生成三维人脸模型的步骤。
本申请实施例通过人脸模型生成模型, 对二维人脸图像进行处理, 生成 三维人脸模型, 在生成过程中既提取了全局特征, 又提取了局部特征, 从而 综合二者获取三维人脸模型, 这样得到的三维人脸模型相比于只根据局部特 征得到的三维人脸模型, 人脸细节体现的更明显, 人脸细节处理的更精细, 还原度高, 从而三维人脸模型更真实。
上述所有可选技术方案, 可以采用任意结合形成本申请的可选实施例, 在此不再一一赘述。 图 5是本申请实施例提供的一种人脸模型生成装置的结构示意图, 参见 图 5, 该装置可以包括:
获取模块 501, 用于获取二维人脸图像;
调用模块 502, 用于调用人脸模型生成模型, 该人脸模型生成模型用于 提取该二维人脸图像的全局特征和局部特征, 基于该全局特征和局部特征, 获取三维人脸模型参数, 基于该三维人脸模型参数, 生成该二维人脸图像对 应的三维人脸模型;
生成模块 503, 用于将该二维人脸图像输入该人脸模型生成模型中, 输 出该二维人脸图像对应的三维人脸模型。
在一种可能实现方式中, 该生成模块 503用于:
基于多个卷积层, 对该二维人脸图像进行特征提取, 得到该二维人脸图 像的全局特征;
获取该二维人脸图像的关键点的中心位置;
基于该中心位置,从该多个卷积层中至少一个目标卷积层得到的特征中, 提取部分特征作为该二维人脸图像的局部特征。
在一种可能实现方式中, 该生成模块 503 用于对于每个目标卷积层, 从 该目标卷积层得到的特征图中, 以该中心位置为中心, 截取该目标卷积层对 应的目标尺寸的特征图作为该二维人脸图像的局部特征。
在一种可能实现方式中, 该生成模块 503用于:
基于编码器的多个卷积层, 对该二维人脸图像进行编码, 得到该二维人 脸图像的全局特征向量;
相应地, 该生成模块 503还用于从该编码器的多个卷积层中至少一个目 标卷积层得到的全局特征向量中, 提取该全局特征向量的部分特征值, 基于 该部分特征值, 获取该二维人脸图像的第一局部特征向量;
相应地, 该生成模块 503还用于基于第一解码器, 对该全局特征向量和 该第一局部特征向量进行解码, 得到三维人脸模型参数。
在一种可能实现方式中, 该生成模块 503 用于提取该至少一个目标卷积 层得到的全局特征向量中的部分特征值; 基于第二解码器, 对提取到的部分 特征值进行解码, 得到该二维人脸图像的第一局部特征向量。
在一种可能实现方式中, 该获取模块 501 还用于获取多个样本二维人脸 图像;
该调用模块 502, 还用于调用初始模型, 将该多个样本二维人脸图像输 入该初始模型中, 对于每个样本二维人脸图像, 由该初始模型提取该样本二 维人脸图像的全局特征和局部特征; 基于该全局特征和局部特征, 获取三维 人脸模型参数, 基于该三维人脸模型参数, 输出该样本二维人脸图像对应的 三维人脸模型;
该装置还包括:
投影模块, 用于对该三维人脸模型进行投影, 得到该三维人脸模型对应 的二维人脸图像;
该获取模块 501 还用于获取该三维人脸模型对应的二维人脸图像和该样 本二维人脸图像的相似度;
调整模块, 用于基于该相似度, 对该初始模型的模型参数进行调整, 直 至符合目标条件时停止, 得到人脸模型生成模型。
在一种可能实现方式中, 该投影模块还用于:
基于该全局特征, 获取该样本二维人脸图像的拍摄信息, 该拍摄信息用 于指示拍摄该样本二维人脸图像时的拍摄姿势、光照或拍摄背景中至少一种; 基于该拍摄信息, 对该三维人脸模型进行投影, 得到该三维人脸模型对 应的二维人脸图像。
在一种可能实现方式中, 该获取模块 501还用于:
基于该三维人脸模型对应的二维人脸图像的关键点与该样本二维人脸图 像对应的关键点的位置, 获取第一相似度;
基于该三维人脸模型对应的二维人脸图像的像素点的像素值与该样本二 维人脸图像对应像素点的像素值, 获取第二相似度;
对该三维人脸模型对应的二维人脸图像和该样本二维人脸图像进行匹 配, 得到第三相似度, 该第三相似度用于指示该二维人脸图像中人脸的身份 和该样本二维人脸图像中人脸的身份是否相同;
基于该第一相似度和该第二相似度中至少一种相似度, 以及该第三相似 度, 获取该三维人脸模型对应的二维人脸图像和该样本二维人脸图像的相似 度。
在一种可能实现方式中, 该获取模块 501 还用于基于人脸识别模型, 对 该三维人脸模型对应的二维人脸图像和该样本二维人脸图像进行人脸识别, 得到第三相似度。
在一种可能实现方式中,该获取模块 501还用于分别基于第一损失函数、 第二损失函数、 第三损失函数, 以及该三维人脸模型对应的二维人脸图像和 该样本二维人脸图像, 获取第一相似度、 第二相似度以及第三相似度。
在一种可能实现方式中, 该样本二维人脸图像的局部特征为第一局部特 征向量, 该第一局部特征向量基于从全局特征中提取到的部分特征值确定; 相应地, 该获取模块 501还用于: 基于该第一局部特征向量, 获取第二局部特征向量, 该第二局部特征向 量的特征值和该从全局特征中提取到的部分特征值的分布情况相同;
基于该第二局部特征向量和从该全局特征中提取到的对应的部分特征值 之间的距离, 获取第四相似度;
相应地, 该获取模块 501还用于:
基于该第一相似度和该第二相似度中至少一种相似度、 该第三相似度和 该第四相似度, 获取该三维人脸模型对应的二维人脸图像和该样本二维人脸 图像的相似度。
在一种可能实现方式中, 该获取模块 501 还用于基于第四损失函数、 该 第二局部特征向量和从该全局特征中提取到的对应的部分特征值, 获取第四 相似度。
本申请实施例提供的装置, 通过人脸模型生成模型, 对二维人脸图像进 行处理, 生成三维人脸模型, 在生成过程中既提取了全局特征, 又提取了局 部特征, 从而综合二者获取三维人脸模型, 这样得到的三维人脸模型相比于 只根据局部特征得到的三维人脸模型, 人脸细节体现的更明显, 人脸细节处 理的更精细, 还原度高, 从而三维人脸模型更真实。
需要说明的是: 上述实施例提供的人脸模型生成装置在生成三维人脸模 型时, 仅以上述各功能模块的划分进行举例说明, 实际应用中, 可以根据需 要而将上述功能分配由不同的功能模块完成, 即将计算机设备的内部结构划 分成不同的功能模块, 以完成以上描述的全部或者部分功能。 另外, 上述实 施例提供的人脸模型生成装置与人脸模型生成方法实施例属于同一构思, 其 具体实现过程详见方法实施例, 这里不再赘述。 上述计算机设备可以被提供为下述图 6所示的终端, 也可以被提供为下 述图 7所示的服务器, 本申请实施例对此不作限定。
图 6是本申请实施例提供的一种终端的结构示意图。该终端 600可以是: 智能手机、 平板电脑、 MP3播放器(Moving Picture Experts Group Audio Layer III, 动态影像专家压缩标准音频层面 3)、 MP4(Moving Picture Experts Group Audio Layer IV, 动态影像专家压缩标准音频层面 4) 播放器、 笔记本电脑或 台式电脑。 终端 600还可能被称为用户设备、 便携式终端、 膝上型终端、 台 式终端等其他名称。
通常, 终端 600包括有: 处理器 601和存储器 602。
处理器 601 可以包括一个或多个处理核心, 比如 4核心处理器、 8核心 处理器等。 处理器 601可以采用 DSP (Digital Signal Processing, 数字信号处 理)、 FPGA (Field— Programmable Gate Array, 现场可编程门阵列)、 PLA (Programmable Logic Array, 可编程逻辑阵列) 中的至少一种硬件形式来实 现。 处理器 601 也可以包括主处理器和协处理器, 主处理器是用于对在唤醒 状态下的数据进行处理的处理器, 也称 CPU (Central Processing Unit, 中央 处理器); 协处理器是用于对在待机状态下的数据进行处理的低功耗处理器。 在一些实施例中, 处理器 601可以在集成有 GPU (Graphics Processing Unit, 图像处理器), GPU用于负责显示屏所需要显示的内容的渲染和绘制。 一些 实施例中, 处理器 601还可以包括 AI (Artificial Intelligence, 人工智能) 处 理器, 该 AI处理器用于处理有关机器学习的计算操作。
存储器 602可以包括一个或多个计算机可读存储介质, 该计算机可读存 储介质可以是非暂态的。 存储器 602还可包括高速随机存取存储器, 以及非 易失性存储器, 比如一个或多个磁盘存储设备、 闪存存储设备。 在一些实施 例中, 存储器 602 中的非暂态的计算机可读存储介质用于存储至少一个计算 机可读指令, 该至少一个计算机可读指令用于被处理器 601 所执行以实现本 申请中方法实施例提供的用于三维人脸模型生成的计算机应用方法或人脸模 型生成模型训练方法。
在一些实施例中, 终端 600还可选包括有: 外围设备接口 603和至少一 个外围设备。 处理器 601、 存储器 602和外围设备接口 603之间可以通过总 线或信号线相连。 各个外围设备可以通过总线、 信号线或电路板与外围设备 接口 603相连。 具体地, 外围设备包括: 射频电路 604、 显示屏 605、 摄像头 606、 音频电路 607、 定位组件 608和电源 609中的至少一种。
外围设备接口 603 可被用于将 I/O (Input/Output, 输入/输出) 相关的至少一个外围设备连接到处理器 601 和存储器 602。 在一些实施例中, 处理器 601、 存储器 602 和外围设备接口 603 被集成在同一芯片或电路板上; 在一些其他实施例中, 处理器 601、 存储器 602 和外围设备接口 603 中的任意一个或两个可以在单独的芯片或电路板上实现, 本实施例对此不加以限定。

射频电路 604 用于接收和发射 RF (Radio Frequency, 射频) 信号, 也称电磁信号。 射频电路 604 通过电磁信号与通信网络以及其他通信设备进行通信。 射频电路 604 将电信号转换为电磁信号进行发送, 或者, 将接收到的电磁信号转换为电信号。 可选地, 射频电路 604 包括: 天线系统、 RF 收发器、 一个或多个放大器、 调谐器、 振荡器、 数字信号处理器、 编解码芯片组、 用户身份模块卡等等。 射频电路 604 可以通过至少一种无线通信协议来与其它终端进行通信。 该无线通信协议包括但不限于: 城域网、 各代移动通信网络 (2G、 3G、 4G及 5G)、 无线局域网和/或 WiFi (Wireless Fidelity, 无线保真) 网络。 在一些实施例中, 射频电路 604 还可以包括 NFC (Near Field Communication, 近距离无线通信) 有关的电路, 本申请对此不加以限定。
显示屏 605用于显示 UI (User Interface, 用户界面)。 该 UI可以包括图 形、 文本、 图标、 视频及其它们的任意组合。 当显示屏 605是触摸显示屏时, 显示屏 605还具有采集在显示屏 605的表面或表面上方的触摸信号的能力。 该触摸信号可以作为控制信号输入至处理器 601进行处理。此时, 显示屏 605 还可以用于提供虚拟按钮和 /或虚拟键盘, 也称软按钮和 /或软键盘。在一些实 施例中, 显示屏 605可以为一个, 设置终端 600的前面板; 在另一些实施例 中, 显示屏 605可以为至少两个, 分别设置在终端 600的不同表面或呈折叠 设计; 在再一些实施例中, 显示屏 605 可以是柔性显示屏, 设置在终端 600 的弯曲表面上或折叠面上。 甚至, 显示屏 605还可以设置成非矩形的不规则 图形, 也即异形屏。 显示屏 605可以采用 LCD(Liquid Crystal Display, 液晶 显示屏)、 OLED(Organic Light-Emitting Diode,有机发光二极管)等材质制备。
The camera component 606 is configured to capture images or videos. Optionally, the camera component 606 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to implement a background-blurring function through fusion of the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting through fusion of the main camera and the wide-angle camera, or other fusion shooting functions. In some embodiments, the camera component 606 may further include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and may be used for light compensation under different color temperatures. The audio circuit 607 may include a microphone and a speaker. The microphone is configured to collect sound waves from the user and the environment and convert the sound waves into electrical signals to be input to the processor 601 for processing, or to the radio frequency circuit 604 for voice communication. For the purpose of stereo collection or noise reduction, there may be multiple microphones, disposed at different parts of the terminal 600. The microphone may also be an array microphone or an omnidirectional microphone. The speaker is configured to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The speaker may be a conventional film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 607 may further include a headphone jack.
The positioning component 608 is configured to determine the current geographic location of the terminal 600 to implement navigation or LBS (Location Based Service). The positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 609 is configured to supply power to the components in the terminal 600. The power supply 609 may be an alternating-current power supply, a direct-current power supply, a disposable battery, or a rechargeable battery. When the power supply 609 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also support fast-charging technology.
In some embodiments, the terminal 600 further includes one or more sensors 610. The one or more sensors 610 include but are not limited to an acceleration sensor 611, a gyroscope sensor 612, a pressure sensor 613, a fingerprint sensor 614, an optical sensor 615, and a proximity sensor 616.
The acceleration sensor 611 may detect the magnitude of acceleration along the three coordinate axes of the coordinate system established with the terminal 600. For example, the acceleration sensor 611 may be configured to detect the components of gravitational acceleration along the three coordinate axes. The processor 601 may control the display screen 605 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 611. The acceleration sensor 611 may also be used for collecting motion data for games or of the user.
The gyroscope sensor 612 may detect the body direction and rotation angle of the terminal 600, and may cooperate with the acceleration sensor 611 to capture the user's 3D actions on the terminal 600. Based on the data collected by the gyroscope sensor 612, the processor 601 can implement functions such as motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation. The pressure sensor 613 may be disposed on the side frame of the terminal 600 and/or the lower layer of the display screen 605. When the pressure sensor 613 is disposed on the side frame of the terminal 600, it can detect the user's holding signal on the terminal 600, and the processor 601 performs left/right-hand recognition or shortcut operations according to the holding signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed on the lower layer of the display screen 605, the processor 601 controls the operable controls on the UI according to the user's pressure operation on the display screen 605. The operable controls include at least one of a button control, a scrollbar control, an icon control, and a menu control.
The fingerprint sensor 614 is configured to collect the user's fingerprint, and the processor 601 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the user's identity according to the collected fingerprint. When the user's identity is identified as a trusted identity, the processor 601 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 614 may be disposed on the front, back, or side of the terminal 600. When a physical button or a manufacturer logo is disposed on the terminal 600, the fingerprint sensor 614 may be integrated with the physical button or the manufacturer logo.
The optical sensor 615 is configured to collect ambient light intensity. In one embodiment, the processor 601 may control the display brightness of the display screen 605 according to the ambient light intensity collected by the optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of the display screen 605 is increased; when the ambient light intensity is low, the display brightness of the display screen 605 is decreased. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera component 606 according to the ambient light intensity collected by the optical sensor 615.
The proximity sensor 616, also referred to as a distance sensor, is generally disposed on the front panel of the terminal 600 and is configured to measure the distance between the user and the front of the terminal 600. In one embodiment, when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually decreases, the processor 601 controls the display screen 605 to switch from the screen-on state to the screen-off state; when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually increases, the processor 601 controls the display screen 605 to switch from the screen-off state to the screen-on state.
A person skilled in the art can understand that the structure shown in FIG. 6 does not constitute a limitation on the terminal 600, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement. FIG. 7 is a schematic structural diagram of a server according to an embodiment of this application. The server 700 may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 701 and one or more memories 702, where the memory 702 stores at least one computer-readable instruction, the at least one computer-readable instruction being loaded and executed by the processor 701 to implement the computer application method for three-dimensional face model generation or the face model generation model training method provided in the foregoing method embodiments. Certainly, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may further include other components for implementing device functions, which are not described in detail here. In an exemplary embodiment, a computer-readable storage medium is further provided, for example, a memory including computer-readable instructions, the computer-readable instructions being executable by a processor to complete the computer application method for three-dimensional face model generation or the face model generation model training method in the foregoing embodiments. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like. A person of ordinary skill in the art can understand that all or some of the steps of the foregoing embodiments may be implemented by hardware, or by computer-readable instructions instructing relevant hardware; the computer-readable instructions may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing descriptions are merely preferred embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims

1. A computer application method for three-dimensional face model generation, comprising:
acquiring a two-dimensional face image;
invoking a face model generation model; and
inputting the two-dimensional face image into the face model generation model, extracting global features and local features of the two-dimensional face image by using the face model generation model, acquiring three-dimensional face model parameters based on the global features and the local features, and outputting a three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters.
2. The method according to claim 1, wherein the extracting global features and local features of the two-dimensional face image comprises:
performing feature extraction on the two-dimensional face image based on a plurality of convolutional layers to obtain the global features of the two-dimensional face image;
acquiring a center position of keypoints of the two-dimensional face image; and
extracting, based on the center position, partial features from features obtained by at least one target convolutional layer among the plurality of convolutional layers as the local features of the two-dimensional face image.
3. The method according to claim 2, wherein the extracting, based on the center position, partial features from features obtained by at least one target convolutional layer among the plurality of convolutional layers as the local features of the two-dimensional face image comprises:
for each target convolutional layer, cropping, from the feature map obtained by the target convolutional layer and centered on the center position, a feature map of a target size corresponding to the target convolutional layer as the local features of the two-dimensional face image.
4. The method according to claim 2, wherein the performing feature extraction on the two-dimensional face image based on a plurality of convolutional layers to obtain the global features of the two-dimensional face image comprises: encoding the two-dimensional face image based on a plurality of convolutional layers of an encoder to obtain a global feature vector of the two-dimensional face image;
the extracting, based on the center position, partial features from features obtained by at least one target convolutional layer among the plurality of convolutional layers as the local features of the two-dimensional face image comprises: extracting partial feature values of the global feature vector from the global feature vector obtained by at least one target convolutional layer among the plurality of convolutional layers of the encoder, and acquiring a first local feature vector of the two-dimensional face image based on the partial feature values; and
the acquiring three-dimensional face model parameters based on the global features and the local features comprises: decoding the global feature vector and the first local feature vector based on a first decoder to obtain the three-dimensional face model parameters.
5. The method according to claim 4, wherein the extracting partial feature values of the global feature vector from the global feature vector obtained by at least one target convolutional layer among the plurality of convolutional layers of the encoder, and acquiring a first local feature vector of the two-dimensional face image based on the partial feature values comprises:
extracting the partial feature values from the global feature vector obtained by the at least one target convolutional layer; and
decoding the extracted partial feature values based on a second decoder to obtain the first local feature vector of the two-dimensional face image.
6. The method according to claim 1, wherein the training of the face model generation model comprises:
acquiring a plurality of sample two-dimensional face images;
invoking an initial model, and inputting the plurality of sample two-dimensional face images into the initial model; for each sample two-dimensional face image, extracting, by the initial model, global features and local features of the sample two-dimensional face image, acquiring three-dimensional face model parameters based on the global features and the local features, and outputting a three-dimensional face model corresponding to the sample two-dimensional face image based on the three-dimensional face model parameters;
projecting the three-dimensional face model to obtain a two-dimensional face image corresponding to the three-dimensional face model;
acquiring a similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image; and
adjusting model parameters of the initial model based on the similarity, and stopping when a target condition is met, to obtain the face model generation model.
7. The method according to claim 6, wherein the extracting global features and local features of the sample two-dimensional face image comprises:
performing feature extraction on the sample two-dimensional face image based on a plurality of convolutional layers to obtain the global features of the sample two-dimensional face image;
acquiring a center position of keypoints of the sample two-dimensional face image; and
extracting, based on the center position, partial features from features obtained by at least one target convolutional layer among the plurality of convolutional layers as the local features of the sample two-dimensional face image.
8. The method according to claim 7, wherein the extracting, based on the center position, partial features from features obtained by at least one target convolutional layer among the plurality of convolutional layers as the local features of the sample two-dimensional face image comprises:
for each target convolutional layer, cropping, from the feature map obtained by the target convolutional layer and centered on the center position, a feature map of a target size corresponding to the target convolutional layer as the local features of the sample two-dimensional face image.
9. The method according to claim 7, wherein the performing feature extraction on the sample two-dimensional face image based on a plurality of convolutional layers to obtain the global features of the sample two-dimensional face image comprises:
encoding the sample two-dimensional face image based on a plurality of convolutional layers of an encoder to obtain a global feature vector of the sample two-dimensional face image;
the extracting, based on the center position, partial features from features obtained by at least one target convolutional layer among the plurality of convolutional layers as the local features of the sample two-dimensional face image comprises: extracting partial feature values of the global feature vector from the global feature vector obtained by at least one target convolutional layer among the plurality of convolutional layers of the encoder, and acquiring a first local feature vector of the sample two-dimensional face image based on the partial feature values; and
the acquiring three-dimensional face model parameters based on the global features and the local features comprises: decoding the global feature vector and the first local feature vector based on a first decoder to obtain the three-dimensional face model parameters.
10. The method according to claim 9, wherein the extracting partial feature values of the global feature vector from the global feature vector obtained by at least one target convolutional layer among the plurality of convolutional layers of the encoder, and acquiring a first local feature vector of the sample two-dimensional face image based on the partial feature values comprises:
extracting the partial feature values from the global feature vector obtained by the at least one target convolutional layer; and
decoding the extracted partial feature values based on a second decoder to obtain the first local feature vector of the sample two-dimensional face image.
11. The method according to claim 6, wherein the projecting the three-dimensional face model to obtain a two-dimensional face image corresponding to the three-dimensional face model comprises:
acquiring shooting information of the sample two-dimensional face image based on the global features, the shooting information indicating at least one of a shooting pose, lighting, or a shooting background when the sample two-dimensional face image was shot; and
projecting the three-dimensional face model based on the shooting information to obtain the two-dimensional face image corresponding to the three-dimensional face model.
12. The method according to claim 6, wherein the acquiring a similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image comprises:
acquiring a first similarity based on the positions of the keypoints of the two-dimensional face image corresponding to the three-dimensional face model and the corresponding keypoints of the sample two-dimensional face image;
acquiring a second similarity based on the pixel values of the pixels of the two-dimensional face image corresponding to the three-dimensional face model and the pixel values of the corresponding pixels of the sample two-dimensional face image;
matching the two-dimensional face image corresponding to the three-dimensional face model against the sample two-dimensional face image to obtain a third similarity, the third similarity indicating whether the identity of the face in the two-dimensional face image and the identity of the face in the sample two-dimensional face image are the same; and
acquiring the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image based on at least one of the first similarity and the second similarity, together with the third similarity.
13. The method according to claim 12, wherein the matching the two-dimensional face image corresponding to the three-dimensional face model against the sample two-dimensional face image to obtain a third similarity comprises:
performing face recognition on the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image based on a face recognition model, to obtain the third similarity.
14. The method according to claim 12, wherein the acquiring of the first similarity, the second similarity, and the third similarity comprises:
acquiring the first similarity, the second similarity, and the third similarity based on a first loss function, a second loss function, and a third loss function, respectively, together with the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image.
15. The method according to claim 12, wherein the local feature of the sample two-dimensional face image is a first local feature vector, the first local feature vector being determined based on partial feature values extracted from the global features; and the method further comprises:
acquiring a second local feature vector based on the first local feature vector, the feature values of the second local feature vector having the same distribution as the partial feature values extracted from the global features; and
acquiring a fourth similarity based on the distance between the second local feature vector and the corresponding partial feature values extracted from the global features.
16. The method according to claim 15, wherein the acquiring the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image based on at least one of the first similarity and the second similarity, together with the third similarity comprises:
acquiring the similarity between the two-dimensional face image corresponding to the three-dimensional face model and the sample two-dimensional face image based on at least one of the first similarity and the second similarity, the third similarity, and the fourth similarity.
17. The method according to claim 15, wherein the acquiring a fourth similarity based on the distance between the second local feature vector and the corresponding partial feature values extracted from the global features comprises:
acquiring the fourth similarity based on a fourth loss function, the second local feature vector, and the corresponding partial feature values extracted from the global features.
18. A computer application apparatus for generating a three-dimensional face model, the apparatus comprising:
an acquisition module, configured to acquire a two-dimensional face image;
an invoking module, configured to invoke a face model generation model; and
a generation module, configured to input the two-dimensional face image into the face model generation model, extract global features and local features of the two-dimensional face image by using the face model generation model, acquire three-dimensional face model parameters based on the global features and the local features, and output a three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters.
19. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
acquiring a two-dimensional face image;
invoking a face model generation model; and
inputting the two-dimensional face image into the face model generation model, extracting global features and local features of the two-dimensional face image by using the face model generation model, acquiring three-dimensional face model parameters based on the global features and the local features, and outputting a three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters.
20. One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
acquiring a two-dimensional face image;
invoking a face model generation model; and
inputting the two-dimensional face image into the face model generation model, extracting global features and local features of the two-dimensional face image by using the face model generation model, acquiring three-dimensional face model parameters based on the global features and the local features, and outputting a three-dimensional face model corresponding to the two-dimensional face image based on the three-dimensional face model parameters.
PCT/CN2020/076650 2019-02-26 2020-02-25 Computer application method and apparatus for three-dimensional face model generation, computer device, and storage medium WO2020173442A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20763064.1A EP3933783A4 (en) 2019-02-26 2020-02-25 COMPUTER APPLICATION METHOD AND APPARATUS FOR GENERATING A THREE-DIMENSIONAL FACE MODEL, COMPUTER DEVICE AND INFORMATION HOLDER
US17/337,909 US11636613B2 (en) 2019-02-26 2021-06-03 Computer application method and apparatus for generating three-dimensional face model, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910140602.XA 2019-02-26 2019-02-26 Three-dimensional face model generation method, apparatus, computer device, and storage medium
CN201910140602.X 2019-02-26

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/337,909 Continuation US11636613B2 (en) 2019-02-26 2021-06-03 Computer application method and apparatus for generating three-dimensional face model, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2020173442A1 true WO2020173442A1 (zh) 2020-09-03

Family

ID=67077241

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/076650 WO2020173442A1 (zh) 2019-02-26 2020-02-25 Computer application method and apparatus for three-dimensional face model generation, computer device, and storage medium

Country Status (5)

Country Link
US (1) US11636613B2 (zh)
EP (1) EP3933783A4 (zh)
CN (1) CN109978989B (zh)
TW (1) TWI788630B (zh)
WO (1) WO2020173442A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949576A (zh) * 2021-03-29 2021-06-11 北京京东方技术开发有限公司 Pose estimation method, apparatus, device, and storage medium
CN112991494A (zh) * 2021-01-28 2021-06-18 腾讯科技(深圳)有限公司 Image generation method, apparatus, computer device, and computer-readable storage medium

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978989B (zh) 2019-02-26 2023-08-01 腾讯科技(深圳)有限公司 Three-dimensional face model generation method, apparatus, computer device, and storage medium
CN110458924B (zh) * 2019-07-23 2021-03-12 腾讯科技(深圳)有限公司 Three-dimensional face model building method and apparatus, and electronic device
CN110363175A (zh) * 2019-07-23 2019-10-22 厦门美图之家科技有限公司 Image processing method and apparatus, and electronic device
CN110532746B (zh) * 2019-07-24 2021-07-23 创新先进技术有限公司 Face verification method and apparatus, server, and readable storage medium
US10853631B2 (en) 2019-07-24 2020-12-01 Advanced New Technologies Co., Ltd. Face verification method and apparatus, server and readable storage medium
KR20210030147A (ko) * 2019-09-09 2021-03-17 삼성전자주식회사 3D rendering method and apparatus
CN110675413B (zh) * 2019-09-27 2020-11-13 腾讯科技(深圳)有限公司 Three-dimensional face model construction method, apparatus, computer device, and storage medium
CN110796075B (zh) * 2019-10-28 2024-02-02 深圳前海微众银行股份有限公司 Face diversity data acquisition method, apparatus, device, and readable storage medium
KR20210061839A (ko) * 2019-11-20 2021-05-28 삼성전자주식회사 Electronic device and control method therefor
CN113129425A (zh) * 2019-12-31 2021-07-16 Tcl集团股份有限公司 Three-dimensional face image reconstruction method, storage medium, and terminal device
CN111210510B (zh) * 2020-01-16 2021-08-06 腾讯科技(深圳)有限公司 Three-dimensional face model generation method, apparatus, computer device, and storage medium
CN111294512A (zh) * 2020-02-10 2020-06-16 深圳市铂岩科技有限公司 Image processing method, apparatus, storage medium, and camera apparatus
CN111598111B (zh) * 2020-05-18 2024-01-05 商汤集团有限公司 Three-dimensional model generation method, apparatus, computer device, and storage medium
CN111738968A (zh) * 2020-06-09 2020-10-02 北京三快在线科技有限公司 Training method and apparatus for an image generation model, and image generation method and apparatus
US11386633B2 (en) * 2020-06-13 2022-07-12 Qualcomm Incorporated Image augmentation for analytics
CN111738217B (zh) * 2020-07-24 2020-11-13 支付宝(杭州)信息技术有限公司 Method and apparatus for generating face adversarial patches
CN112102461B (zh) * 2020-11-03 2021-04-09 北京智源人工智能研究院 Face rendering method, apparatus, electronic device, and storage medium
CN112529999A (zh) * 2020-11-03 2021-03-19 百果园技术(新加坡)有限公司 Training method, apparatus, device, and storage medium for a parameter estimation model
CN112419485B (zh) * 2020-11-25 2023-11-24 北京市商汤科技开发有限公司 Face reconstruction method, apparatus, computer device, and storage medium
WO2022111001A1 (zh) * 2020-11-25 2022-06-02 上海商汤智能科技有限公司 Face image processing method and apparatus, electronic device, and storage medium
CN112614213B (zh) * 2020-12-14 2024-01-23 杭州网易云音乐科技有限公司 Facial expression determination method, expression parameter determination model, medium, and device
CN112785494B (zh) * 2021-01-26 2023-06-16 网易(杭州)网络有限公司 Three-dimensional model construction method, apparatus, electronic device, and storage medium
US11759296B2 (en) * 2021-08-03 2023-09-19 Ningbo Shenlai Medical Technology Co., Ltd. Method for generating a digital data set representing a target tooth arrangement
CN114220142B (zh) * 2021-11-24 2022-08-23 慧之安信息技术股份有限公司 Face feature recognition method based on a deep learning algorithm
CN114972634A (zh) * 2022-05-06 2022-08-30 清华大学 Multi-view three-dimensional deformable face reconstruction method based on feature voxel fusion
CN115035313B (zh) * 2022-06-15 2023-01-03 云南这里信息技术有限公司 Black-necked crane identification method, apparatus, device, and storage medium
CN115147524B (zh) * 2022-09-02 2023-01-17 荣耀终端有限公司 3D animation generation method and electronic device
CN115937638B (zh) * 2022-12-30 2023-07-25 北京瑞莱智慧科技有限公司 Model training method, image processing method, related apparatus, and storage medium


Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6735566B1 (en) * 1998-10-09 2004-05-11 Mitsubishi Electric Research Laboratories, Inc. Generating realistic facial animation from speech
SI21200A (sl) * 2002-03-27 2003-10-31 Jože Balič CNC control unit for controlling machining centres with learning capability
CN102592309B (zh) * 2011-12-26 2014-05-07 北京工业大学 Nonlinear three-dimensional face modeling method
CN108510573B (zh) * 2018-04-03 2021-07-30 南京大学 Multi-view face three-dimensional model reconstruction method based on deep learning
CN109035388B (zh) * 2018-06-28 2023-12-05 合肥的卢深视科技有限公司 Three-dimensional face model reconstruction method and apparatus
CN109255831B (zh) * 2018-09-21 2020-06-12 南京大学 Single-view face three-dimensional reconstruction and texture generation method based on multi-task learning

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101303772A (zh) * 2008-06-20 2008-11-12 浙江大学 Nonlinear three-dimensional face modeling method based on a single image
CN101561874A (zh) * 2008-07-17 2009-10-21 清华大学 Face image recognition method
US20140355843A1 (en) * 2011-12-21 2014-12-04 Feipeng Da 3d face recognition method based on intermediate frequency information in geometric image
CN104574432A (zh) * 2015-02-15 2015-04-29 四川川大智胜软件股份有限公司 Three-dimensional face reconstruction method and system for automatic multi-view face selfie images
CN107680158A (zh) * 2017-11-01 2018-02-09 长沙学院 Three-dimensional face reconstruction method based on a convolutional neural network model
CN109978989A (zh) * 2019-02-26 2019-07-05 腾讯科技(深圳)有限公司 Three-dimensional face model generation method, apparatus, computer device, and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112991494A (zh) * 2021-01-28 2021-06-18 腾讯科技(深圳)有限公司 Image generation method, apparatus, computer device, and computer-readable storage medium
CN112991494B (zh) * 2021-01-28 2023-09-15 腾讯科技(深圳)有限公司 Image generation method, apparatus, computer device, and computer-readable storage medium
CN112949576A (zh) * 2021-03-29 2021-06-11 北京京东方技术开发有限公司 Pose estimation method, apparatus, device, and storage medium
CN112949576B (zh) * 2021-03-29 2024-04-23 北京京东方技术开发有限公司 Pose estimation method, apparatus, device, and storage medium

Also Published As

Publication number Publication date
US20210286977A1 (en) 2021-09-16
TWI788630B (zh) 2023-01-01
EP3933783A4 (en) 2022-08-24
CN109978989A (zh) 2019-07-05
CN109978989B (zh) 2023-08-01
US11636613B2 (en) 2023-04-25
TW202032503A (zh) 2020-09-01
EP3933783A1 (en) 2022-01-05

Similar Documents

Publication Publication Date Title
TWI788630B (zh) Three-dimensional face model generation method, apparatus, computer device, and storage medium
US11205282B2 (en) Relocalization method and apparatus in camera pose tracking process and storage medium
US11517099B2 (en) Method for processing images, electronic device, and storage medium
US11436779B2 (en) Image processing method, electronic device, and storage medium
CN108305236B (zh) Image enhancement processing method and apparatus
CN109308727B (zh) Virtual avatar model generation method, apparatus, and storage medium
CN112907725B (zh) Image generation, image processing model training, and image processing method and apparatus
CN109815150B (zh) Application testing method, apparatus, electronic device, and storage medium
CN111028144B (zh) Video face swapping method and apparatus, and storage medium
CN111723803B (zh) Image processing method, apparatus, device, and storage medium
CN109558837A (zh) Face keypoint detection method, apparatus, and storage medium
CN112287852A (zh) Face image processing method, display method, apparatus, and device
CN111144365A (zh) Liveness detection method, apparatus, computer device, and storage medium
CN111613213B (zh) Audio classification method, apparatus, device, and storage medium
CN112581358A (zh) Image processing model training method, image processing method, and apparatus
CN110796083B (zh) Image display method, apparatus, terminal, and storage medium
CN112308103B (zh) Method and apparatus for generating training samples
CN112257594A (zh) Multimedia data display method, apparatus, computer device, and storage medium
CN112967261B (zh) Image fusion method, apparatus, device, and storage medium
CN111982293B (zh) Body temperature measurement method, apparatus, electronic device, and storage medium
CN111988664B (zh) Video processing method, apparatus, computer device, and computer-readable storage medium
CN111310701B (zh) Gesture recognition method, apparatus, device, and storage medium
CN114155336A (zh) Virtual object display method, apparatus, electronic device, and storage medium
CN114093020A (zh) Motion capture method, apparatus, electronic device, and storage medium
CN111797754A (zh) Image detection method, apparatus, electronic device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20763064

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020763064

Country of ref document: EP

Effective date: 20210927