WO2020156148A1 - Training method for SMPL parameter prediction model, computer device, and storage medium - Google Patents

Training method for SMPL parameter prediction model, computer device, and storage medium

Info

Publication number
WO2020156148A1
WO2020156148A1 (application PCT/CN2020/072023)
Authority
WO
WIPO (PCT)
Prior art keywords: model, prediction, dimensional, human body, contour
Prior art date
Application number
PCT/CN2020/072023
Other languages
English (en)
French (fr)
Inventor
孙爽
李琛
戴宇荣
贾佳亚
沈小勇
Original Assignee
Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Priority to EP20748016.1A priority Critical patent/EP3920146A4/en
Publication of WO2020156148A1 publication Critical patent/WO2020156148A1/zh
Priority to US17/231,952 priority patent/US20210232924A1/en

Classifications

    • G06T7/74: Image analysis; determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T7/70: Image analysis; determining position or orientation of objects or cameras
    • G06T17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06F18/22: Pattern recognition; matching criteria, e.g. proximity measures
    • G06F18/254: Pattern recognition; fusion techniques of classification results, e.g. of results related to the same input data
    • G06N3/08: Neural networks; learning methods
    • G06N3/045: Neural network architectures; combinations of networks
    • G06V10/809: Image or video recognition; fusion of classification results, e.g. where the classifiers operate on the same input data
    • G06V10/82: Image or video recognition using neural networks
    • G06V40/11: Hand-related biometrics; hand pose recognition
    • G06T2207/20044: Skeletonization; medial axis transform
    • G06T2207/20081: Training; learning
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/30196: Human being; person

Definitions

  • the embodiments of the present application relate to the field of computer vision, and in particular, to a training method, computer equipment, and storage medium of an SMPL parameter prediction model.
  • Three-dimensional human body reconstruction is one of the important topics in computer vision research, and it has important application value in the fields of virtual reality (VR), human body animation, and games.
  • SMPL is short for Skinned Multi-Person Linear.
  • In the related art, a human body information extraction model is first used to extract human body information such as two-dimensional joint points, three-dimensional joint points, and two-dimensional body contours from a picture; the extracted information is then input into a parameter prediction model for SMPL parameter prediction, and the predicted SMPL parameters are input into the SMPL model for three-dimensional human body reconstruction.
  • a training method, computer equipment, and storage medium of an SMPL parameter prediction model are provided.
  • the technical solution is as follows:
  • an embodiment of the present application provides a method for training an SMPL parameter prediction model, which is executed by a computer device, and is characterized in that the method includes:
  • inputting the sample picture into a posture parameter prediction model to obtain posture prediction parameters, where the posture prediction parameters are parameters used to indicate the posture of the human body among the SMPL prediction parameters;
  • inputting the sample picture into a morphological parameter prediction model to obtain morphological prediction parameters, where the morphological prediction parameters are parameters used to indicate the human body shape among the SMPL prediction parameters;
  • an embodiment of the present application provides a three-dimensional human body reconstruction method, which is executed by a computer device, and the method includes:
  • inputting the target picture into a posture parameter prediction model to obtain posture prediction parameters, where the posture prediction parameters are parameters used to indicate the posture of the human body among the SMPL prediction parameters;
  • inputting the target picture into a morphological parameter prediction model to obtain morphological prediction parameters, where the morphological prediction parameters are parameters used to indicate the shape of the human body among the SMPL prediction parameters;
  • a three-dimensional model of the target human body is constructed through the SMPL model.
  • an embodiment of the present application provides an SMPL parameter prediction model training device, the device includes:
  • the first acquisition module is used to acquire a sample picture, the sample picture contains a human body image
  • the first prediction module is configured to input the sample picture into the posture parameter prediction model to obtain posture prediction parameters, where the posture prediction parameters are parameters used to indicate the posture of the human body in the SMPL prediction parameters;
  • the second prediction module is configured to input the sample picture into a morphological parameter prediction model to obtain a morphological prediction parameter, where the morphological prediction parameter is a parameter used to indicate a human body shape among the SMPL prediction parameters;
  • the loss calculation module is used to calculate the model prediction loss according to the SMPL prediction parameters and the annotation information of the sample pictures;
  • the training module is used to reversely train the posture parameter prediction model and the morphological parameter prediction model according to the model prediction loss.
  • an embodiment of the present application provides a three-dimensional human body reconstruction device, and the device includes:
  • the second acquisition module is configured to acquire a target picture, and the target picture contains a human body image
  • the third prediction module is used to input the target picture into a pose parameter prediction model to obtain a pose prediction parameter, where the pose prediction parameter is a parameter used to indicate the posture of the human body in the SMPL prediction parameter;
  • the fourth prediction module is used to input the target picture into a morphological parameter prediction model to obtain morphological prediction parameters, where the morphological prediction parameters are parameters used to indicate human body shape in the SMPL prediction parameters;
  • the second construction module is used to construct a three-dimensional model of the target human body through the SMPL model according to the posture prediction parameters and the morphological prediction parameters.
  • an embodiment of the present application provides a computer device, including a processor and a memory, where computer-readable instructions are stored in the memory.
  • when the computer-readable instructions are executed by the processor, the processor performs the steps of the SMPL parameter prediction model training method or the three-dimensional human body reconstruction method.
  • the embodiments of the present application provide a non-volatile computer-readable storage medium that stores computer-readable instructions.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors execute the SMPL parameter prediction model training method or the three-dimensional human body reconstruction method described in the above aspect.
  • FIG. 1 shows a method flow chart of a training method of an SMPL parameter prediction model provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of the principle of the training method of the SMPL parameter prediction model provided by an embodiment of the present application
  • Fig. 3 shows a method flow chart of a training method of an SMPL parameter prediction model provided by another embodiment of the present application
  • FIG. 4 shows a flowchart of a method for calculating the prediction loss process of the first model in an embodiment
  • FIG. 5 shows a flowchart of a method for calculating the prediction loss process of the second model in an embodiment
  • FIG. 6 shows a schematic diagram of the principle of calculating the prediction loss process of the second model in an embodiment
  • FIG. 7 shows a flowchart of a method for calculating the prediction loss process of the third model in an embodiment
  • FIG. 8 shows a schematic diagram of the principle of calculating the prediction loss process of the third model in an embodiment
  • FIG. 9 shows a schematic diagram of an application scenario provided by an embodiment of the present application in an embodiment
  • FIG. 10 shows a method flowchart of a three-dimensional human body reconstruction method provided by an embodiment of the present application
  • Figures 11 and 12 show the three-dimensional human body reconstruction results obtained when the solution provided by this application and the HMR solution are tested on a public data set;
  • FIG. 13 shows a block diagram of a training device for an SMPL parameter prediction model provided by an embodiment of the present application
  • Fig. 14 shows a block diagram of a three-dimensional human body reconstruction device provided by an embodiment of the present application.
  • FIG. 15 shows a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • SMPL model: a parametric human body model, which is driven by SMPL parameters.
  • SMPL parameters include the shape parameter β and the pose parameter θ.
  • The shape parameter β contains 10 parameters characterizing attributes such as the body's height, build (fatness or thinness), and head-to-body ratio; the pose parameter θ contains 72 parameters corresponding to 24 joint points (the parameter corresponding to each joint point is represented by a three-dimensional rotation vector, so it contains a total of 24 × 3 parameters).
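The per-joint "three-dimensional rotation vector" is an axis-angle representation; Rodrigues' formula converts it to a rotation matrix. The patent includes no code, so the following NumPy sketch is purely illustrative:

```python
import numpy as np

def rodrigues(rot_vec):
    """Convert a 3-D axis-angle rotation vector to a 3x3 rotation matrix."""
    angle = np.linalg.norm(rot_vec)
    if angle < 1e-8:                       # near-zero rotation -> identity
        return np.eye(3)
    axis = rot_vec / angle
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])  # cross-product (skew) matrix
    # Rodrigues' formula: R = I + sin(a) K + (1 - cos(a)) K^2
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

# 72-dimensional pose parameter = 24 joints x 3-D rotation vectors
theta = np.zeros(72)
rotations = [rodrigues(r) for r in theta.reshape(24, 3)]
```

A zero vector yields the identity (the "zero pose" mentioned below), and a vector of magnitude π/2 along z rotates the x axis onto the y axis.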
  • the three-dimensional human body model can be defined as: M(β, θ) = W(T_P(β, θ), J(β), θ, W̄), with T_P(β, θ) = T̄ + B_S(β) + B_P(θ).
  • M(β, θ) is the three-dimensional human body model of any human body, and the surface of the three-dimensional human body model contains n = 6890 model vertices; β is the shape parameter; θ is the pose parameter; Φ denotes the fixed parameters learned from three-dimensional body scan data; T̄ is the set of model vertex parameters of the average human body model under the average shape and standard pose (zero pose) (each vertex is represented by three-dimensional coordinates, so it contains 3n parameters); B_S is the shape blend function, used to adjust the average human body model according to the shape parameter; B_P is the pose blend function, used to adjust the posture of the average human body model according to the pose parameter; J is a function used to calculate the positions of the human joint points; and W̄ is a standard blend skinning function.
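The template-plus-offsets part of this definition, T_P(β, θ) = T̄ + B_S(β) + B_P(θ), can be sketched with stand-in blend-shape bases. The bases below are random placeholders (the real ones are among the learned parameters Φ), the 207-dimensional pose feature is an assumption borrowed from SMPL's 23 × 9 rotation features, and the skinning step W is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6890                                   # model vertices on the body surface
T_bar = rng.normal(size=(n, 3))            # mean template (stand-in values)
S = rng.normal(size=(n, 3, 10)) * 1e-3     # hypothetical shape blend basis
P = rng.normal(size=(n, 3, 207)) * 1e-3    # hypothetical pose blend basis

def t_posed(beta, pose_feature):
    """T_P(beta, theta) = T_bar + B_S(beta) + B_P(theta)."""
    B_s = S @ beta                         # (n, 3) shape-dependent offsets
    B_p = P @ pose_feature                 # (n, 3) pose-dependent offsets
    return T_bar + B_s + B_p

# zero shape and zero pose feature recover the average template itself
verts = t_posed(np.zeros(10), np.zeros(207))
```

With β = 0 and a zero pose feature the output equals T̄, matching the "average shape, zero pose" description above.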
  • Projection function: a function used to project coordinate points in a three-dimensional space to a two-dimensional space.
  • the projection function in the embodiment of the present application is used to project the vertices of the three-dimensional human body model into the two-dimensional image space.
  • the projection function adopts a weak perspective projection, and the projection parameters corresponding to the projection function are c = (s, t_x, t_y), where s is the scaling parameter and (t_x, t_y) are the translation parameters; correspondingly, projecting a coordinate point (x, y, z) in the three-dimensional space to the two-dimensional space can be expressed as: (x', y') = (s·x + t_x, s·y + t_y).
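A weak perspective projection of this kind takes only a few lines; since the published text garbles the formula, the exact expression used here is a reconstruction:

```python
import numpy as np

def weak_perspective(points3d, s, tx, ty):
    """Project (x, y, z) points to 2-D as (s*x + tx, s*y + ty).

    One common form of weak perspective projection; the patent does not
    spell out the exact expression, so this form is an assumption."""
    xy = points3d[:, :2]                  # drop depth (orthographic step)
    return s * xy + np.array([tx, ty])    # uniform scale, then translate

pts = np.array([[1.0, 2.0, 5.0]])
print(weak_perspective(pts, 2.0, 0.5, -0.5))  # [[2.5 3.5]]
```

Note the depth z is discarded before scaling: in weak perspective, all points share a single scale, unlike full perspective where scale varies with depth.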
  • Posture parameter prediction model: a neural network model used to predict the posture of the human body in a picture.
  • the model input of the model is a picture
  • the model output is a 72-dimensional posture parameter.
  • Morphological parameter prediction model: a neural network model used to predict the shape of the human body in a picture.
  • the model input of the model is a picture, and the model output is a 10-dimensional morphological parameter.
  • Labeling information: in the field of machine learning, the information used to indicate the key parameters in a training sample is called labeling information, and it can be generated by manual labeling.
  • the annotation information in the embodiment of the present application is used to indicate key parameters in the human body image, and the annotation information may include at least one of SMPL parameters, two-dimensional joint point coordinates, three-dimensional joint point coordinates, and two-dimensional body contours.
  • Loss function: a non-negative real-valued function used to estimate the difference between the predicted value of the model and the ground truth; the smaller the loss of the model, the better the robustness of the model.
  • the loss function in the embodiment of the present application is used to estimate the difference between the prediction parameters output by the posture parameter prediction model and the morphological parameter prediction model and the pre-labeled information.
  • the parameters that affect the reconstructed three-dimensional human body include morphological parameters and posture parameters; therefore, the key to three-dimensional human body reconstruction based on a single picture is to accurately predict the morphological parameters and posture parameters.
  • in the embodiments of the present application, two neural network models for predicting pose parameters and morphological parameters are designed separately (both using pictures as model input), avoiding the separate training of a human body information extraction model; at the same time, corresponding loss functions are designed from the perspectives of human body shape and human posture, and the two neural network models are trained based on the designed loss functions and annotation information, which improves the prediction accuracy of the neural network models and thereby the accuracy of the posture and shape of the reconstructed three-dimensional human body.
  • Illustrative embodiments are used for description below.
  • FIG. 1 shows a method flowchart of the SMPL parameter prediction model training method provided by an embodiment of the present application.
  • the training method is applied to a server as an example for description.
  • the method may include the following steps:
  • Step 101 Obtain a sample picture, and the sample picture contains a human body image.
  • the server performs model training based on several sample picture sets. Therefore, during the training process, the server obtains sample pictures from the sample picture set, and the sample picture set contains several pre-labeled sample pictures.
  • each sample picture corresponds to at least one type of annotation information, and the types of annotation information corresponding to the sample pictures in different sample picture sets are different.
  • the label information of the sample pictures in the sample picture set A includes two-dimensional joint point coordinates and two-dimensional body contours;
  • the label information of the sample pictures in the sample picture set B includes two-dimensional joint point coordinates and three-dimensional joint point coordinates;
  • the labeling information of the sample pictures in the sample picture set C includes two-dimensional joint point coordinates and SMPL parameters.
  • the server inputs the sample pictures into the pose parameter prediction model and the morphological parameter prediction model respectively.
  • before inputting the sample picture into the posture parameter prediction model and the morphological parameter prediction model, the server needs to preprocess the sample picture so that the input sample picture meets the model input requirements.
  • preprocessing methods include cropping and size scaling. For example, the size of the sample picture after preprocessing is 224 ⁇ 224.
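The cropping and size-scaling step can be sketched as follows; the bounding box and the nearest-neighbour resampling are illustrative choices, not specified by the patent:

```python
import numpy as np

def preprocess(img, box, out_size=224):
    """Crop `img` (H, W, 3) to `box` = (x0, y0, x1, y1), then resize the
    crop to out_size x out_size with nearest-neighbour sampling."""
    x0, y0, x1, y1 = box
    crop = img[y0:y1, x0:x1]
    h, w = crop.shape[:2]
    # nearest-neighbour index maps from output pixels to crop pixels
    ys = np.arange(out_size) * h // out_size
    xs = np.arange(out_size) * w // out_size
    return crop[ys][:, xs]

img = np.zeros((480, 640, 3), dtype=np.uint8)   # dummy input frame
sample = preprocess(img, (100, 50, 400, 450))   # 224 x 224 x 3 model input
```

In practice a library resize (e.g. bilinear) would be used; the point is only that every sample reaches the networks at the fixed 224 × 224 input size.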
  • Step 102 Input the sample picture into the posture parameter prediction model to obtain posture prediction parameters.
  • the posture prediction parameters are parameters used to indicate the posture of the human body in the SMPL prediction parameters.
  • the pose prediction parameters output by the pose parameter prediction model are 72-dimensional parameters, which are used to indicate the rotation vectors of the 24 joint points of the human body.
  • the backbone network structure of the posture parameter prediction model is a residual neural network (Residual Neural Network, ResNet), such as ResNet-50.
  • the embodiment of the present application does not limit the specific structure of the posture parameter prediction model.
  • the parameter settings of each network layer in the posture parameter prediction model are shown in Table 1.
  • Step 103 Input the sample picture into the morphological parameter prediction model to obtain the morphological prediction parameter.
  • the morphological prediction parameter is a parameter used to indicate the human body shape in the SMPL prediction parameter.
  • the morphological prediction parameters output by the morphological parameter prediction model are 10-dimensional parameters, which are used to indicate 10 parameters such as the height of the human body, the proportion of the head and the body, and so on.
  • the morphological parameter prediction model is constructed based on a simplified VGG (Visual Geometry Group) network, and the embodiment of the present application does not limit the specific structure of the morphological parameter prediction model.
  • Step 104 Construct a three-dimensional human body model through the SMPL model according to the posture prediction parameters and the shape prediction parameters.
  • the server brings the posture prediction parameters and the morphology prediction parameters into the SMPL model to construct a three-dimensional human body model, so as to subsequently evaluate the parameter prediction effect of the model based on the three-dimensional human body model.
  • the three-dimensional human body model contains the vertex coordinates of 6,890 model vertices.
  • Step 105 Calculate the model prediction loss according to the SMPL prediction parameters and/or the human body three-dimensional model, combined with the annotation information of the sample picture.
  • the server calculates the predicted loss by using a pre-built loss function based on the predicted result and the label information of the sample picture.
  • the loss function includes at least one sub-loss function, and different sub-loss functions are used to calculate the model prediction loss according to different types of annotation information.
  • the server determines to use the corresponding sub-loss function to calculate the model prediction loss according to the label information of the sample pictures.
  • the server calculates the model prediction loss based on the annotation information and SMPL prediction parameters, and/or the server calculates the model prediction loss based on the annotation information and the three-dimensional human body model.
  • Step 106 Reverse training the posture parameter prediction model and the morphological parameter prediction model according to the model prediction loss.
  • according to the model prediction loss, the server uses the gradient descent algorithm to reversely train the posture parameter prediction model and the morphological parameter prediction model (that is, to optimize the parameters in the models), and stops the reverse training when the gradient is less than a threshold.
  • the embodiment of the present application does not limit the specific manner of the reverse training model.
  • the learning rate used in the model training process is 1e-4, and the batch size (batch_size) is 96.
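The stopping rule described above (train until the gradient falls below a threshold) can be illustrated with plain gradient descent on a stand-in loss; in the patent's setting the gradient would come from back-propagating the model prediction loss through both networks:

```python
import numpy as np

def train(grad_fn, params, lr=1e-4, grad_threshold=1e-3, max_steps=100_000):
    """Gradient descent that stops once the gradient norm drops below a
    threshold, mirroring the stopping rule described in the text.
    `grad_fn` stands in for back-propagation through both models."""
    for _ in range(max_steps):
        g = grad_fn(params)
        if np.linalg.norm(g) < grad_threshold:   # gradient small -> converged
            break
        params = params - lr * g                 # descend at the stated lr
    return params

# toy quadratic loss L(w) = ||w||^2 / 2, whose gradient is simply w
w = train(lambda w: w, np.array([1.0, -2.0]))
```

The learning rate 1e-4 matches the value stated above; batching (size 96) is omitted since the toy loss has no data.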
  • the posture prediction parameters and the morphological prediction parameters among the SMPL prediction parameters are obtained, a three-dimensional human body model is constructed based on the posture prediction parameters and the morphological prediction parameters, the model prediction loss is calculated, based on the annotation information of the sample picture, according to at least one of the SMPL prediction parameters and the three-dimensional human body model, and the posture parameter prediction model and the morphological parameter prediction model are then reversely trained according to the model prediction loss.
  • when the models are trained using the method provided in the embodiments of this application, the sample picture is directly used as the model input for model training, and there is no need to separately train a model that extracts the human body information in the picture, thereby reducing the complexity of model training and improving the efficiency of model training; at the same time, calculating the model prediction loss based on the annotation information, the prediction parameters, and the constructed three-dimensional human body model improves the prediction accuracy of the models.
  • the loss function predefined by the server includes four sub-loss functions, which are SMPL parameter loss function, joint point position loss function, human body contour loss function, and regular loss function.
  • the SMPL parameter loss function is used to measure the difference between the posture prediction parameters and the shape prediction parameters and the labeled SMPL parameters
  • the joint point position loss function is used to measure the difference between the predicted joint point positions and the marked joint point positions
  • the human body contour loss function is used to measure the difference between the human contour of the reconstructed three-dimensional human model and the human contour in the sample picture.
  • the process of the server training model is shown in Figure 2.
  • the server inputs the sample picture 21 into the posture parameter prediction model 22 to obtain the projection parameters 221 and posture prediction parameters 222 output by the posture parameter prediction model 22.
  • the server inputs the sample picture 21 into the morphological parameter prediction model 23 to obtain the shape prediction parameters 231 output by the morphological parameter prediction model 23.
  • according to the prediction parameters, the server constructs the three-dimensional human body model 24 through the SMPL model. The process of calculating the model prediction loss based on the prediction parameters, the three-dimensional human body model, and the annotation information is described below through illustrative embodiments.
  • FIG. 3 shows a method flowchart of the SMPL parameter prediction model training method provided by another embodiment of the present application.
  • the training method is applied to a server as an example for description.
  • the method may include the following steps:
  • Step 301 Obtain a sample picture, which contains a human body image.
  • Step 302 Input the sample picture into the pose parameter prediction model to obtain pose prediction parameters and projection parameters.
  • the projection parameter is related to the shooting angle of the sample picture.
  • the server realizes the pose parameter and projection parameter prediction through the pose parameter prediction model.
  • the output parameters of the posture parameter prediction model are 75-dimensional, including the 72-dimensional posture prediction parameters θ and the 3-dimensional projection parameters c = (s, t_x, t_y).
  • Step 303 Input the sample picture into the morphological parameter prediction model to obtain morphological prediction parameters.
  • Step 304 Construct a three-dimensional human body model through the SMPL model according to the posture prediction parameters and the shape prediction parameters.
  • for the implementation of the above steps 303 and 304, reference may be made to steps 103 and 104, which are not repeated in this embodiment.
  • Step 305 Calculate the prediction loss of the first model according to the SMPL prediction parameters and the SMPL labeling parameters in the labeling information.
  • when the labeling information of the sample picture contains SMPL labeling parameters (including posture labeling parameters and morphological labeling parameters), the server calculates the first model prediction loss through the SMPL parameter loss function according to the SMPL prediction parameters (including posture prediction parameters and morphological prediction parameters) and the SMPL labeling parameters.
  • in a possible implementation, as shown in FIG. 4, this step includes the following steps.
  • Step 305A Calculate the first Euclidean distance between the posture annotation parameter and the posture prediction parameter.
  • the server calculates the first Euclidean distance (the Euclidean distance between 72-dimensional vectors) between the posture annotation parameter and the posture prediction parameter, and then evaluates the accuracy of the posture parameter prediction based on the first Euclidean distance.
  • the smaller the first Euclidean distance, the higher the accuracy of the posture parameter prediction.
  • the first Euclidean distance can be expressed as d_1 = ||θ − θ*||_2, where θ is the posture prediction parameter and θ* is the posture labeling parameter.
  • Step 305B Calculate the second Euclidean distance between the shape labeling parameter and the shape prediction parameter.
  • the server calculates the second Euclidean distance (the Euclidean distance between the 10-dimensional vectors) between the morphological labeling parameter and the morphological prediction parameter, and then evaluates the morphology according to the second Euclidean distance The accuracy of parameter prediction. Among them, the smaller the second Euclidean distance, the higher the accuracy of morphological parameter prediction.
  • the second Euclidean distance can be expressed as d_2 = ||β − β*||_2, where β is the shape prediction parameter and β* is the shape labeling parameter.
  • Step 305C Determine the first model prediction loss according to the first Euclidean distance and the second Euclidean distance.
  • according to the first Euclidean distance and the second Euclidean distance, the server calculates the first model prediction loss (that is, the SMPL parameter prediction loss).
  • the first model prediction loss L_smpl can be expressed as a weighted combination of the two distances: L_smpl = λ_p · (d_1 + d_2), where λ_p is the parameter loss weight.
  • optionally, λ_p is 60.
  • the server calculates the first model prediction loss according to the pose prediction parameter 222 and the shape prediction parameter 231
  • Step 306 Calculate the prediction loss of the second model according to the joint point predicted coordinates of the joint points in the human body three-dimensional model and the joint point label coordinates of the joint points in the label information.
  • when the annotation information of the sample picture includes joint point labeling coordinates (including two-dimensional joint point labeling coordinates and/or three-dimensional joint point labeling coordinates), the server first determines the joint point predicted coordinates of the joint points in the three-dimensional human body model, and then calculates the second model prediction loss through the joint point position loss function according to the joint point predicted coordinates and the joint point labeling coordinates.
  • this step includes the following steps.
  • Step 306A Calculate the third Euclidean distance between the three-dimensional joint point predicted coordinates of the joint points in the three-dimensional human body model and the three-dimensional joint point labeled coordinates.
  • the server selects 14 of the 24 joint points as target joint points, and calculates the third Euclidean distance between the three-dimensional joint point predicted coordinates of the 14 target joint points and their three-dimensional joint point labeling coordinates.
  • in a possible implementation, the server determines the three-dimensional joint point predicted coordinates of the joint points according to the vertex coordinates of the model vertices around the joint points in the three-dimensional human body model.
  • the three-dimensional joint point predicted coordinate of a joint point is the average of the vertex coordinates of the model vertices around that joint point.
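Averaging the surrounding vertex coordinates is a one-liner; `ring_idx` below is a hypothetical list of vertex indices around one joint (real SMPL implementations use a learned joint regressor instead):

```python
import numpy as np

def joint_position(vertices, ring_idx):
    """Predicted 3-D joint coordinate: mean of the vertex coordinates of
    the model vertices around the joint (`ring_idx` is illustrative)."""
    return vertices[ring_idx].mean(axis=0)

# three stand-in vertices around one joint of the 6890-vertex mesh
verts = np.array([[0.0, 0.0, 0.0],
                  [2.0, 0.0, 0.0],
                  [0.0, 2.0, 2.0]])
j = joint_position(verts, [0, 1, 2])
```

Repeating this for each of the 24 joints yields the three-dimensional joint point map (element 25 in FIG. 2).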
  • the server generates a three-dimensional joint point map 25 according to the human three-dimensional model 24, and the three-dimensional joint point map 25 contains the three-dimensional joint point predicted coordinates of each joint point.
  • the server calculates the third Euclidean distance between the predicted and labeled three-dimensional coordinates of the same joint point, which can be expressed as ‖ĵ3D − j3D‖2, where ĵ3D is the predicted three-dimensional coordinate and j3D is the labeled three-dimensional coordinate of the joint point.
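The computation in step 306A above can be sketched as follows (a minimal sketch, assuming each predicted 3D joint is the mean of the model vertices around it as described above; the toy vertex-to-joint grouping and coordinates are hypothetical):

```python
import numpy as np

def regress_joints_3d(vertices, joint_vertex_ids):
    # Each predicted 3D joint is the mean of the model vertices around it.
    return np.array([vertices[ids].mean(axis=0) for ids in joint_vertex_ids])

def joint_loss_3d(pred_joints, gt_joints):
    # Third Euclidean distance: per-joint L2 distance between predicted
    # and labeled 3D coordinates, accumulated over the target joints.
    return np.linalg.norm(pred_joints - gt_joints, axis=1).sum()

# Toy example: 4 model vertices, 2 joints, each regressed from 2 vertices.
vertices = np.array([[0., 0., 0.], [2., 0., 0.], [0., 2., 0.], [0., 0., 2.]])
joint_vertex_ids = [[0, 1], [2, 3]]
pred = regress_joints_3d(vertices, joint_vertex_ids)   # [[1,0,0], [0,1,1]]
gt = np.array([[1., 0., 0.], [0., 1., 0.]])
loss = joint_loss_3d(pred, gt)                         # 0 + 1 = 1.0
```

In a real training setup the vertex-to-joint grouping would come from the SMPL joint regressor rather than a hand-written list.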
  • Step 306B Calculate the fourth Euclidean distance between the two-dimensional joint point predicted coordinates of the joint points in the three-dimensional human body model and the two-dimensional joint point labeled coordinates.
  • this step includes the following steps:
  • the server first determines the three-dimensional joint point prediction coordinates of the joint points before calculating the two-dimensional joint point prediction coordinates.
  • the process of determining the predicted coordinates of the three-dimensional joint points can refer to the above step 306A, and this step will not be repeated here.
  • the server can perform projection processing on the predicted three-dimensional joint point coordinates according to the projection parameters, that is, project the three-dimensional joint points into the two-dimensional image space to obtain the predicted two-dimensional joint point coordinates.
  • the server generates a two-dimensional joint point map 26 according to the three-dimensional joint point map 25 and projection parameters 221, and the two-dimensional joint point map 26 contains the two-dimensional joint point predicted coordinates of each joint point.
  • the server calculates the fourth Euclidean distance between the predicted and labeled two-dimensional coordinates of the same joint point, which can be expressed as ‖proj(ĵ3D) − j2D‖2, where proj is the projection operation and j2D is the labeled two-dimensional coordinate of the joint point.
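The projection and the fourth Euclidean distance can be sketched as follows (a minimal sketch; interpreting the 3-dimensional projection parameters as a weak-perspective camera with scale s and translation (tx, ty) is an assumption, as is the toy data):

```python
import numpy as np

def project(joints_3d, cam):
    # Weak-perspective projection into 2D image space (assumed form:
    # the 3 projection parameters are scale s and translation tx, ty).
    s, tx, ty = cam
    return s * joints_3d[:, :2] + np.array([tx, ty])

def joint_loss_2d(joints_3d, gt_joints_2d, cam):
    # Fourth Euclidean distance: per-joint L2 distance between the
    # projected coordinates proj(j) and the labeled 2D coordinates.
    pred_2d = project(joints_3d, cam)
    return np.linalg.norm(pred_2d - gt_joints_2d, axis=1).sum()

joints_3d = np.array([[1., 0., 0.5], [0., 1., 0.5]])
cam = (2.0, 0.1, -0.1)                       # s, tx, ty
gt_2d = np.array([[2.1, -0.1], [0.1, 2.9]])  # labeled 2D joints
loss = joint_loss_2d(joints_3d, gt_2d, cam)  # 0 + 1 = 1.0
```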
  • Step 306C Calculate the second model prediction loss according to the third Euclidean distance and the fourth Euclidean distance.
  • the server calculates the second model prediction loss (that is, the joint point position prediction loss).
  • the second model prediction loss can be expressed as a weighted sum λ3D·d3D + λ2D·d2D of the third Euclidean distance d3D and the fourth Euclidean distance d2D, where λ3D is the weight of the three-dimensional joint point position loss and λ2D is the weight of the two-dimensional joint point position loss.
  • For example, both λ3D and λ2D are 60.0.
  • the server calculates the second model prediction loss according to the three-dimensional joint point map 25 and the two-dimensional joint point map 26.
  • the server first determines the predicted three-dimensional joint point coordinates 62 according to the three-dimensional human body model 61, and then calculates the third Euclidean distance 64 according to the predicted three-dimensional coordinates 62 and the labeled three-dimensional joint point coordinates 63 in the annotation information; at the same time, the server performs projection processing on the predicted three-dimensional coordinates 62 according to the projection parameter 65 to obtain the predicted two-dimensional joint point coordinates 66 of the corresponding joint points, and calculates the fourth Euclidean distance 68 from the predicted two-dimensional coordinates 66 and the labeled two-dimensional joint point coordinates 67 in the annotation information.
  • the server determines the second model prediction loss 69 based on the third Euclidean distance 64 and the fourth Euclidean distance 68.
  • the server can also determine the second model prediction loss only according to the third Euclidean distance or only according to the fourth Euclidean distance, which is not limited in this embodiment.
  • Step 307 Calculate the prediction loss of the third model according to the predicted two-dimensional human contour of the human three-dimensional model and the marked two-dimensional human contour in the annotation information.
  • the server can further generate a predicted two-dimensional human body contour based on the constructed three-dimensional human body model, thereby determining the accuracy of the human body shape prediction by calculating the loss between the contours.
  • the contour of the human body is used to indicate the human body region in the picture, and can be represented by a black and white image, where the white region in the black and white image is the human body region.
  • the server generates a predicted two-dimensional human body contour 27 according to the three-dimensional human body model 24, and calculates the third model prediction loss according to the labeled two-dimensional human body contour 28.
  • this step may include the following steps:
  • Step 307A According to the projection parameters, project the model vertices of the three-dimensional human body model into the two-dimensional space to generate the predicted two-dimensional human body contour.
  • the server can measure the prediction accuracy of the human body posture and shape based on the difference between the two-dimensional human body contours.
  • the server projects each model vertex into a two-dimensional space through a projection function according to the projection parameters, thereby generating a two-dimensional image containing a predicted two-dimensional human contour.
  • Step 307B Calculate the first contour loss and the second contour loss according to the predicted two-dimensional human body contour and the labeled two-dimensional human body contour.
  • the first contour loss is also called the forward contour loss and is used to indicate the loss from the predicted two-dimensional human body contour to the labeled two-dimensional human body contour; the second contour loss is also called the reverse contour loss and is used to indicate the loss from the labeled two-dimensional human body contour to the predicted two-dimensional human body contour.
  • calculating the contour loss by the server may include the following steps.
  • for each contour point on the predicted two-dimensional human body contour, the terminal calculates the first shortest distance from the contour point to the labeled two-dimensional human body contour, and accumulates the first shortest distances corresponding to the contour points to obtain the first contour loss.
  • the calculation of the first contour loss includes the following steps:
  • the server automatically assigns each model vertex ν to the joint point closest to it; for example, the server divides the model vertices among 14 joint points.
  • for a contour point in the predicted two-dimensional human body contour, after calculating the first shortest distance corresponding to the contour point, the server detects the visibility of the joint point to which the model vertex corresponding to the contour point belongs. If that joint point is visible, the server sets the first weight of the contour point to 1; if it is not visible, the server sets the first weight of the contour point to 0.
  • the following rule can be used to determine the first weight of the contour point corresponding to a model vertex: ων = 1 when the joint point to which the model vertex ν belongs is visible, and ων = 0 otherwise, where ων is the first weight of the contour point corresponding to the model vertex ν.
  • the server can also first detect the visibility of the joint point to which the model vertex of a contour point belongs, and, when the joint point is invisible, skip calculating the shortest distance from that contour point to the labeled two-dimensional human body contour, thereby reducing the amount of calculation.
  • the server corrects the first shortest distance corresponding to each contour point according to the first weight, thereby accumulating the corrected first shortest distance to obtain the first contour loss.
  • calculating the first contour loss according to the first shortest distance corresponding to each contour point in the predicted two-dimensional human body contour can also include the following steps:
  • the server automatically assigns each model vertex ν to the joint point closest to it, and projects that joint point into the two-dimensional space through the projection parameters to obtain the predicted (two-dimensional) coordinates of the joint point to which the model vertex ν belongs.
  • according to the fifth Euclidean distance between the predicted and labeled coordinates of the joint point, the server determines the second weight of the contour point corresponding to each model vertex, where the second weight has a negative correlation with the fifth Euclidean distance.
  • the server determines the accuracy of the joint point prediction by calculating the fifth Euclidean distance between the predicted and labeled coordinates of the same joint point.
  • the fifth Euclidean distance is calculated between the predicted (two-dimensional) coordinates of the joint point and its labeled (two-dimensional) coordinates.
  • the server determines the second weight of the contour point corresponding to the model vertex belonging to the joint point according to the fifth Euclidean distance, where the second weight is a positive value that has a negative correlation with the fifth Euclidean distance.
  • the following formula may be used to determine the second weight of the contour point corresponding to the model vertex:
  • the server corrects the first shortest distance corresponding to each contour point according to the second weight, so as to accumulate the corrected first shortest distance to obtain the first contour loss.
  • the server calculates both the first weight and the second weight, thereby calculating the first contour loss based on the first shortest distance, the first weight, and the second weight corresponding to each contour point in the predicted two-dimensional human body contour. Correspondingly, the first contour loss can be expressed as:
  • for each contour point on the labeled two-dimensional human body contour, the terminal calculates the second shortest distance from the contour point to the predicted two-dimensional human body contour, and accumulates the second shortest distances corresponding to the contour points to obtain the second contour loss.
  • the second contour loss can be expressed as:
  • Step 307C Determine the third model prediction loss according to the first contour loss and the second contour loss.
  • the server calculates the third model prediction loss based on the first contour loss and its corresponding weight, and the second contour loss and its corresponding weight.
  • the prediction loss of the third model can be expressed as:
  • the server first generates a predicted two-dimensional human body contour 83 based on the three-dimensional human body model 81 and projection parameters 82, then calculates the first contour loss 85 and the second contour loss 86 according to the predicted two-dimensional human body contour 83 and the labeled two-dimensional human body contour 84, and finally determines the third model prediction loss 87 according to the first contour loss 85 and the second contour loss 86.
  • when calculating the first contour loss 85, the server calculates the first shortest distance 851 from a contour point on the predicted two-dimensional human body contour 83 to the labeled two-dimensional human body contour 84, determines the first weight 852 according to the visibility of the joint point to which the contour point belongs, and determines the second weight 853 according to the prediction accuracy of that joint point, so as to calculate the first contour loss 85 from the first shortest distance 851, the first weight 852, and the second weight 853; when calculating the second contour loss 86, the server calculates the second shortest distance 861 from a contour point on the labeled two-dimensional human body contour 84 to the predicted two-dimensional human body contour 83, and determines the second contour loss 86 according to the second shortest distance 861.
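The bidirectional contour loss of steps 307B and 307C can be sketched with point-set contours (a minimal sketch: the first and second weights described above are omitted for brevity, the contour-loss weights and the toy contours are assumptions):

```python
import numpy as np

def directed_contour_loss(src, dst):
    # For each contour point in src, accumulate the shortest Euclidean
    # distance to the dst contour (contours as 2D point sets).
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=2)
    return d.min(axis=1).sum()

def third_model_loss(pred_contour, gt_contour, w_fwd=1.0, w_bwd=1.0):
    # Forward loss: predicted -> labeled; reverse loss: labeled -> predicted.
    l_fwd = directed_contour_loss(pred_contour, gt_contour)
    l_bwd = directed_contour_loss(gt_contour, pred_contour)
    return w_fwd * l_fwd + w_bwd * l_bwd

pred = np.array([[0., 0.], [1., 0.]])
gt = np.array([[0., 0.], [1., 1.]])
loss = third_model_loss(pred, gt)  # forward 1.0 + reverse 1.0 = 2.0
```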
  • Step 308 Perform regularization on the shape prediction parameters to obtain the fourth model prediction loss.
  • the server performs L2 regularization on the shape prediction parameters to obtain the fourth model prediction loss λreg·‖β‖2², where λreg is the regularization loss weight. For example, λreg is 1.0.
  • the server obtains the fourth model prediction loss according to the shape prediction parameter 231.
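Step 308 can be sketched as follows, assuming the regularization is a plain L2 penalty on the 10-dimensional shape vector β:

```python
import numpy as np

def shape_regularization_loss(beta, lambda_reg=1.0):
    # Fourth model prediction loss: L2 regularization of the shape
    # prediction parameters, weighted by lambda_reg.
    return lambda_reg * float(np.sum(beta ** 2))

beta = np.array([0.5, -0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
loss = shape_regularization_loss(beta)  # 0.25 + 0.25 = 0.5
```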
  • Step 309 Reversely train the posture parameter prediction model and the morphological parameter prediction model according to the model prediction loss.
  • the server performs reverse training on the posture parameter prediction model and the morphological parameter prediction model according to the first, second, third, and fourth model prediction losses calculated in the foregoing steps.
  • for the reverse training process, reference may be made to step 106 above, which will not be repeated in this embodiment.
  • the server may perform reverse training according to a part of the model prediction loss, which is not limited in this embodiment.
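The combination of the four model prediction losses for reverse training can be sketched as follows (summing the already-weighted terms is an assumption; as noted above, a subset of the losses may also be used):

```python
def total_model_loss(l_param, l_joint, l_contour, l_reg):
    # Overall prediction loss back-propagated through both the pose
    # parameter prediction model and the shape parameter prediction
    # model; each term already carries its own weight lambda.
    return l_param + l_joint + l_contour + l_reg

loss = total_model_loss(0.4, 1.0, 2.0, 0.5)
```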
  • by introducing a human body contour constraint, the server projects the model vertices of the reconstructed three-dimensional human body model into the two-dimensional space to obtain the predicted two-dimensional human body contour, and reversely trains the model using the contour loss between the predicted and labeled two-dimensional human body contours, which helps improve the accuracy of shape parameter prediction and thus the accuracy of the reconstructed three-dimensional human body model with respect to body shape.
  • the trained model can be used to perform three-dimensional reconstruction of the human body in a single image.
  • the terminal 920 uploads a picture containing a human body image to the server 940.
  • the server 940 predicts the posture parameters of the human body in the picture through the posture parameter prediction model, and predicts the morphological parameters of the human body in the picture through the morphological parameter prediction model, thereby sending SMPL parameters including the posture parameters and the morphological parameters to the terminal 920.
  • the terminal 920 reconstructs the three-dimensional human body model through the SMPL model and displays it.
  • the terminal 920 can also perform SMPL parameter prediction locally without the server 940.
  • after the VR device collects an image containing the player's body through the camera, it predicts the player's posture parameters through the built-in posture parameter prediction model and the player's shape parameters through the shape parameter prediction model, then reconstructs the player's three-dimensional human body model according to the posture and shape parameters and displays it in the VR screen in real time, thereby increasing the player's immersion when using the VR device.
  • model obtained by the above training can also be used in other application scenarios for reconstructing a three-dimensional human body model based on a single picture (including a human body) or video (a video frame containing a continuous human body), which is not limited in this embodiment of the application.
  • FIG. 10 shows a method flow chart of a three-dimensional human body reconstruction method provided by an embodiment of the present application.
  • the method is applied to a server as an example for description.
  • the method may include the following steps:
  • Step 1001 Obtain a target picture, and the target picture contains a human body image.
  • the target picture is a single picture uploaded by the terminal, or the target picture is a video frame intercepted from a video uploaded by the terminal.
  • before inputting the target picture into the posture/shape parameter prediction models, the server needs to preprocess the target picture so that the input meets the model input requirements.
  • preprocessing methods include cropping and size scaling. For example, the size of the target picture after preprocessing is 224×224.
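The preprocessing described above can be sketched as follows (center cropping and nearest-neighbor scaling are assumptions about the exact method; only the 224×224 output size comes from the text):

```python
import numpy as np

def preprocess(image, out_size=224):
    # Center-crop to a square, then nearest-neighbor resize to
    # out_size x out_size so the picture meets the model input size.
    h, w = image.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    cropped = image[top:top + side, left:left + side]
    idx = np.arange(out_size) * side // out_size  # source row/col indices
    return cropped[idx][:, idx]

image = np.zeros((300, 400, 3), dtype=np.uint8)  # toy input picture
out = preprocess(image)                          # shape (224, 224, 3)
```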
  • Step 1002 Input the target picture into the posture parameter prediction model to obtain posture prediction parameters.
  • the posture prediction parameters are parameters used to indicate the posture of the human body in the SMPL prediction parameters.
  • the pose parameter prediction model outputs 72-dimensional pose prediction parameters. It should be noted that when the pose parameter prediction model outputs 75-dimensional parameters, the server determines 72 of the dimensions as the pose prediction parameters and the remaining 3 dimensions as the projection parameters.
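The split of a 75-dimensional model output into pose and projection parameters can be sketched as follows (placing the 72 pose dimensions before the 3 projection dimensions is an assumption):

```python
import numpy as np

def split_pose_output(output):
    # Split a 75-dim prediction into 72-dim pose parameters and
    # 3-dim projection parameters (ordering is an assumption).
    assert output.shape[-1] == 75
    return output[..., :72], output[..., 72:]

output = np.arange(75.0)          # toy 75-dim model output
pose, cam = split_pose_output(output)
```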
  • Step 1003 Input the target picture into the morphological parameter prediction model to obtain morphological prediction parameters.
  • the morphological prediction parameters are the parameters used to indicate the shape of the human body in the SMPL prediction parameters.
  • the morphological parameter prediction model outputs 10-dimensional morphological prediction parameters.
  • Step 1004 according to the posture prediction parameters and the shape prediction parameters, construct a target human body three-dimensional model through the SMPL model.
  • the server inputs the shape prediction parameters and posture prediction parameters output by the model into the SMPL model, thereby constructing a target human body 3D model containing 6890 model vertices.
  • the server sends the model data of the target human body three-dimensional model to the terminal for rendering and display by the terminal.
  • the server when the terminal has a three-dimensional human body model reconstruction function, the server sends the shape prediction parameters and posture prediction parameters output by the model to the terminal, and the terminal reconstructs and displays the three-dimensional human body model.
  • In summary, the posture prediction parameters and the shape prediction parameters are obtained, and a three-dimensional human body model is constructed from them, so that the model prediction loss can be calculated according to the annotation information of the sample picture together with at least one of the posture prediction parameters, the shape prediction parameters, and the three-dimensional human body model; the posture parameter prediction model and the shape parameter prediction model are then reversely trained based on the model prediction loss. When the model is trained using the method provided in the embodiments of this application, the sample picture is used directly as the model input, and there is no need to separately train a model that extracts human body information from the picture, which reduces the complexity of model training and improves its efficiency; at the same time, calculating the model prediction loss based on the annotation information, the prediction parameters, and the three-dimensional human body model constructed from the prediction parameters helps improve the training quality of the model.
  • the accuracy rate is used to measure the fit between the reconstructed body contour and the original body contour; the F1 score combines the precision and recall of the result; and the Procrustes-aligned mean per joint position error (PA-MPJPE) is used to indicate the prediction error of joint point positions.
  • FIG. 13 shows a block diagram of an SMPL parameter prediction model training device provided in an embodiment of the present application.
  • the device has the function of executing the above-mentioned method example, and the function can be realized by hardware, or by hardware executing corresponding software.
  • the device may include:
  • the first obtaining module 1310 is configured to obtain a sample picture, the sample picture contains a human body image
  • the first prediction module 1320 is configured to input the sample picture into a posture parameter prediction model to obtain posture prediction parameters, where the posture prediction parameters are parameters used to indicate the posture of the human body in the SMPL prediction parameters;
  • the second prediction module 1330 is configured to input the sample picture into a morphological parameter prediction model to obtain a morphological prediction parameter, where the morphological prediction parameter is a parameter used to indicate a human body shape among the SMPL prediction parameters;
  • the first construction module 1340 is configured to construct a three-dimensional human body model through the SMPL model according to the pose prediction parameters and the morphology prediction parameters;
  • the loss calculation module 1350 is configured to calculate the model prediction loss according to the SMPL prediction parameters and/or the three-dimensional human body model, combined with the annotation information of the sample picture;
  • the training module 1360 is configured to reversely train the posture parameter prediction model and the morphological parameter prediction model according to the model prediction loss.
  • the loss calculation module 1350 includes:
  • the first calculation unit is configured to calculate a first model prediction loss according to the SMPL prediction parameters and the SMPL annotation parameters in the annotation information, where the SMPL annotation parameters include pose annotation parameters and morphological annotation parameters;
  • the second calculation unit is configured to calculate the prediction loss of the second model according to the joint point predicted coordinates of the joint points in the three-dimensional human body model and the joint point label coordinates of the joint points in the label information;
  • the third calculation unit is configured to calculate the prediction loss of the third model according to the predicted two-dimensional human body contour of the human three-dimensional model and the marked two-dimensional human contour in the annotation information.
  • the first calculation unit is configured to:
  • the prediction loss of the first model is determined according to the first Euclidean distance and the second Euclidean distance.
  • the annotation coordinates of the joint points include three-dimensional joint point annotation coordinates and/or two-dimensional joint point annotation coordinates;
  • the second calculation unit is used to:
  • the second model prediction loss is calculated according to the third Euclidean distance and/or the fourth Euclidean distance.
  • the second calculation unit is further configured to:
  • the pose parameter prediction model is further used to output projection parameters according to the input sample picture, and the projection parameters are used to project points in a three-dimensional space into a two-dimensional space;
  • the second calculation unit is also used for:
  • the pose parameter prediction model is further used to output projection parameters according to the input sample picture, and the projection parameters are used to project points in a three-dimensional space into a two-dimensional space;
  • the third calculation unit is used to:
  • the third model prediction loss is determined according to the first contour loss and the second contour loss.
  • the third calculation unit is configured to:
  • the third calculation unit is configured to:
  • the first weight of the contour point corresponding to the model vertex is 1, and when the joint point to which the model vertex belongs is not visible, the first weight of the contour point corresponding to the model vertex is 0.
  • the third calculation unit is configured to
  • the second weight of the contour point corresponding to each model vertex is determined, and the second weight has a negative correlation with the fifth Euclidean distance;
  • the device further includes:
  • the regular loss module is used to perform regular processing on the morphological prediction parameters to obtain the fourth model prediction loss
  • the training module 1360 is also used for:
  • FIG. 14 shows a block diagram of a three-dimensional human body reconstruction device provided by an embodiment of the present application.
  • the device has the function of executing the above-mentioned method example, and the function can be realized by hardware, or by hardware executing corresponding software.
  • the device may include:
  • the second obtaining module 1410 is configured to obtain a target picture, and the target picture contains a human body image;
  • the third prediction module 1420 is configured to input the target picture into a posture parameter prediction model to obtain posture prediction parameters, where the posture prediction parameters are parameters used to indicate the posture of the human body in the SMPL prediction parameters;
  • the fourth prediction module 1430 is configured to input the target picture into a morphological parameter prediction model to obtain a morphological prediction parameter, where the morphological prediction parameter is a parameter used to indicate a human body shape in the SMPL prediction parameter;
  • the second construction module 1440 is configured to construct a three-dimensional model of the target human body through the SMPL model according to the posture prediction parameter and the morphology prediction parameter.
  • FIG. 15 shows a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • the computer device is used to implement the training method of the SMPL parameter prediction model provided in the above-mentioned embodiment, or the three-dimensional human body reconstruction method. Specifically:
  • the computer device 1500 includes a central processing unit (CPU) 1501, a system memory 1504 including a random access memory (RAM) 1502 and a read only memory (ROM) 1503, and a system bus 1505 connecting the system memory 1504 and the central processing unit 1501 .
  • the computer device 1500 also includes a basic input/output system (I/O system) 1506 that helps transfer information between devices in the computer, and a mass storage device 1507 for storing an operating system 1513, application programs 1514, and other program modules 1515.
  • the basic input/output system 1506 includes a display 1508 for displaying information and an input device 1509 such as a mouse and a keyboard for the user to input information.
  • the display 1508 and the input device 1509 are both connected to the central processing unit 1501 through the input and output controller 1510 connected to the system bus 1505.
  • the basic input/output system 1506 may further include an input and output controller 1510 for receiving and processing input from multiple other devices such as a keyboard, a mouse, or an electronic stylus.
  • the input and output controller 1510 also provides output to a display screen, a printer, or other types of output devices.
  • the mass storage device 1507 is connected to the central processing unit 1501 through a mass storage controller (not shown) connected to the system bus 1505.
  • the mass storage device 1507 and its associated computer-readable medium provide non-volatile storage for the computer device 1500. That is, the mass storage device 1507 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
  • the computer-readable media may include computer storage media and communication media.
  • Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, tape cartridges, magnetic tape, disk storage or other magnetic storage devices.
  • the computer device 1500 may also be operated by connecting to a remote computer on a network, such as the Internet. That is, the computer device 1500 can be connected to the network 1512 through the network interface unit 1511 connected to the system bus 1505, or the network interface unit 1511 can be used to connect to other types of networks or remote computer systems.
  • At least one instruction, at least one program, a code set, or an instruction set is stored in the memory, and is configured to be executed by one or more processors to implement the training method of the SMPL parameter prediction model or the three-dimensional human body reconstruction method provided in the foregoing embodiments.
  • the embodiment of the present application also provides a computer-readable storage medium that stores at least one instruction, at least one program, code set, or instruction set, the at least one instruction, the at least one program, the code set Or the instruction set is loaded and executed by the processor to implement the training method of the SMPL parameter prediction model provided in the foregoing embodiments, or to implement the three-dimensional human body reconstruction method provided in the foregoing embodiments.
  • the computer-readable storage medium may include: Read Only Memory (ROM), Random Access Memory (RAM), Solid State Drives (SSD, Solid State Drives), or optical discs.
  • random access memory may include resistive random access memory (ReRAM, Resistance Random Access Memory) and dynamic random access memory (DRAM, Dynamic Random Access Memory).


Abstract

A training method for an SMPL parameter prediction model, including: obtaining a sample picture; inputting the sample picture into a posture parameter prediction model to obtain posture prediction parameters; inputting the sample picture into a shape parameter prediction model to obtain shape prediction parameters; calculating a model prediction loss according to the SMPL prediction parameters in combination with annotation information of the sample picture; and reversely training the posture parameter prediction model and the shape parameter prediction model according to the model prediction loss.

Description

Training method for SMPL parameter prediction model, computer device, and storage medium
This application claims priority to Chinese Patent Application No. 201910103414X, entitled "Training method for SMPL parameter prediction model, server, and storage medium", filed with the China National Intellectual Property Administration on February 1, 2019, which is incorporated herein by reference in its entirety.
Technical Field
The embodiments of this application relate to the field of computer vision, and in particular to a training method for an SMPL parameter prediction model, a computer device, and a storage medium.
Background
Three-dimensional human body reconstruction is one of the important topics in computer vision research, with important application value in fields such as virtual reality (VR), human body animation, and games.
In the related art, a Skinned Multi-Person Linear (SMPL) model is used to perform three-dimensional reconstruction of a human body in a two-dimensional image. In one three-dimensional human body reconstruction approach, a human body information extraction model is first used to extract human body information from the two-dimensional image, such as two-dimensional joint points, three-dimensional joint points, a two-dimensional human body segmentation map, and three-dimensional voxels; the extracted human body information is then input into a parameter prediction model for SMPL parameter prediction, and the predicted SMPL parameters are input into the SMPL model for three-dimensional human body reconstruction.
However, before three-dimensional human body reconstruction can be performed in the above manner, the human body information extraction model and the parameter prediction model need to be trained separately, and the trained models then need to be jointly trained again, making the model training process complicated and time-consuming.
发明内容
根据本申请提供的各种实施例，提供一种SMPL参数预测模型的训练方法、计算机设备及存储介质。所述技术方案如下：
一方面,本申请实施例提供了一种SMPL参数预测模型的训练方法,由计算机设备执行,其特征在于,所述方法包括:
获取样本图片,所述样本图片中包含人体图像;
将所述样本图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
将所述样本图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;
根据所述SMPL预测参数,并结合所述样本图片的标注信息,计算模型预测损失;及
根据所述模型预测损失反向训练所述姿态参数预测模型和所述形态参数预测模型。
另一方面,本申请实施例提供了一种三维人体重建方法,由计算机设备执行,所述方法包括:
获取目标图片,所述目标图片中包含人体图像;
将所述目标图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
将所述目标图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;及
根据所述姿态预测参数和所述形态预测参数,通过SMPL模型构建目标人体三维模型。
另一方面,本申请实施例提供了一种SMPL参数预测模型的训练装置,所述装置包括:
第一获取模块,用于获取样本图片,所述样本图片中包含人体图像;
第一预测模块,用于将所述样本图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
第二预测模块,用于将所述样本图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;
损失计算模块,用于根据所述SMPL预测参数,并结合所述样本图片的标注信息,计算模型预测损失;及
训练模块,用于根据所述模型预测损失反向训练所述姿态参数预测模型和所述形态参数预测模型。
另一方面，本申请实施例提供了一种三维人体重建装置，所述装置包括：
第二获取模块,用于获取目标图片,所述目标图片中包含人体图像;
第三预测模块,用于将所述目标图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
第四预测模块,用于将所述目标图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;及
第二构建模块,用于根据所述姿态预测参数和所述形态预测参数,通过SMPL模型构建目标人体三维模型。
另一方面,本申请实施例提供了一种计算机设备,包括处理器和存储器,所述存储器中存储有计算机可读指令,所述计算机可读指令被所述处理器执行时,使得所述处理器执行所述SMPL参数预测模型的训练方法或三维人体重建方法的步骤。
另一方面，本申请实施例提供了一种非易失性的计算机可读存储介质，存储有计算机可读指令，所述计算机可读指令被一个或多个处理器执行时，使得所述一个或多个处理器执行上述方面所述SMPL参数预测模型的训练方法或三维人体重建方法的步骤。
本申请的一个或多个实施例的细节在下面的附图和描述中提出。本申请的其它特征、目的和优点将从说明书、附图以及权利要求书变得明显。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1示出了本申请一个实施例提供的SMPL参数预测模型的训练方法的方法流程图;
图2是本申请实施例提供的SMPL参数预测模型的训练方法的原理示意图;
图3示出了本申请另一个实施例提供的SMPL参数预测模型的训练方法的方法流程图;
图4示出了一个实施例中计算第一模型预测损失过程的方法流程图;
图5示出了一个实施例中计算第二模型预测损失过程的方法流程图;
图6示出了一个实施例中计算第二模型预测损失过程的原理示意图;
图7示出了一个实施例中计算第三模型预测损失过程的方法流程图;
图8示出了一个实施例中计算第三模型预测损失过程的原理示意图;
图9示出了本申请一个实施例提供的应用场景的场景示意图；
图10示出了本申请一个实施例提供的三维人体重建方法的方法流程图;
图11和图12是利用公开数据集对本申请提供方案以及HMR方案进行测试时得到的人体三维重建结果;
图13示出了本申请一个实施例提供的SMPL参数预测模型的训练装置的框图;
图14示出了本申请一个实施例提供的三维人体重建装置的框图;
图15示出了本申请一个实施例提供的计算机设备的结构示意图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
为了方便理解,下面对本申请实施例中涉及的名词进行说明。
SMPL模型：一种参数化的人体模型，该模型由SMPL参数驱动，SMPL参数中包括形态（shape）参数β以及姿态（pose）参数θ。其中，形态参数 $\beta \in \mathbb{R}^{10}$，包含表征人体的高矮胖瘦、头身比例等10个参数；姿态（pose）参数 $\theta \in \mathbb{R}^{72}$，包含24个关节点对应的72个参数（每个关节点对应的参数使用一个三维旋转向量表示，因此共包含24×3个参数）。
基于SMPL模型，三维人体模型可以被定义为：

$M(\beta,\theta;\phi)=W\big(\bar{T}+B_S(\beta)+B_P(\theta),\ J(\beta),\ \theta,\ \omega\big)$

其中，$M(\beta,\theta;\phi)\in\mathbb{R}^{3n}$ 为任意人体的三维人体模型，且三维人体模型的表面包含n=6890个模型顶点，β为形态参数，θ为姿态参数，φ是从三维人体扫描数据中学习得到的固定参数，$\bar{T}\in\mathbb{R}^{3n}$ 是平均形态、标准姿态（zero pose）下平均人体模型的模型顶点参数（每个顶点使用三维坐标表示，因此包含3n个参数），$B_S$ 是形态独立混合函数，用于根据形态参数调整平均人体模型的形态，$B_P$ 是姿态独立混合函数，用于根据姿态参数调整平均人体模型的姿态，$J(\beta)$ 是用于计算人体关节点位置的函数，$W$ 是一个标准的混合蒙皮函数。
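作为示意，下面用一段极简的numpy草图勾勒上式的组装思路。其中模板顶点与两个混合基均为随意构造的玩具数据，且省略了真实SMPL中的关节旋转与蒙皮权重 $W$，并非SMPL的真实实现：

```python
import numpy as np

# 玩具数据：n 个模板顶点（真实 SMPL 中 n=6890，此处取 4 便于演示）
n = 4
T_bar = np.zeros((n, 3))          # 平均形态、标准姿态下的模型顶点
beta = np.random.randn(10)        # 形态参数（10 维）
theta = np.random.randn(72)       # 姿态参数（24×3 维）

# 假设的线性混合基：B_S、B_P 在此简化为随机线性映射（仅为演示）
S = np.random.randn(n, 3, 10)     # 形态混合基
P = np.random.randn(n, 3, 72)     # 姿态混合基

def B_S(beta):
    return S @ beta               # 形态偏移，形状 (n, 3)

def B_P(theta):
    return P @ theta              # 姿态偏移，形状 (n, 3)

# 简化版“模型”：仅做顶点偏移叠加，省略蒙皮函数 W 与关节回归 J
def M(beta, theta):
    return T_bar + B_S(beta) + B_P(theta)

vertices = M(beta, theta)         # 得到各模型顶点的三维坐标
```

该草图只体现“模板顶点 + 形态偏移 + 姿态偏移”的线性叠加结构，真实实现还需对各关节做旋转并按蒙皮权重混合。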
投影函数：一种用于将三维空间中的坐标点投影到二维空间的函数，本申请实施例中的投影函数用于将三维人体模型的模型顶点投影到二维图像空间。在一种可能的实施方式中，该投影函数采用弱透视投影（weak perspective projection）函数，该投影函数对应的投影参数为 $c=(s,t_x,t_y)$，其中，$s$ 为缩放参数，$(t_x,t_y)$ 为平移参数，相应的，将三维空间中的坐标点 (x,y,z) 投影到二维空间可以表示为：

$proj(x,y,z)=(s\cdot x+t_x,\ s\cdot y+t_y)$
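按上式，弱透视投影可以写成如下的numpy小函数（忽略深度z，仅做缩放加平移，为一种常见写法的示意）：

```python
import numpy as np

def weak_perspective_project(points_3d, s, tx, ty):
    """把三维点 (x, y, z) 投影为二维点 (s*x + tx, s*y + ty)。"""
    points_3d = np.asarray(points_3d, dtype=float)
    xy = points_3d[..., :2]            # 丢弃深度 z
    return s * xy + np.array([tx, ty])

# 用法示例：缩放 2 倍并平移 (0.5, -0.5)
p2d = weak_perspective_project([[1.0, 2.0, 5.0]], s=2.0, tx=0.5, ty=-0.5)
```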
姿态参数预测模型:一种用于预测图片中人体姿态的神经网络模型,该模型的模型输入为图片,模型输出为72维的姿态参数。在一个实施例中,本申请实施例中的姿态参数预测模型还用于根据输入的图片输出投影参数,相应的,该模型的模型输出即为72+3=75维。
形态参数预测模型:一种用于预测图片中人体形态的神经网络模型,该模型的模型输入为图片,模型输出为10维的形态参数。
标注信息:在机器学习领域,用于指示训练样本中关键参数的信息被称为标注信息,该标注信息可以通过人工标注生成。本申请实施例中的标注信息即用于指示人体图像中的关键参数,该标注信息可以包括SMPL参数、二维关节点坐标、三维关节点坐标、二维人体轮廓中的至少一种。
损失函数(loss function):一种用于估量模型的预测值与真实值(ground truth)之间的差异,是一个非负实值函数。其中,模型的损失函数越小,模型的鲁棒性越好。本申请实施例中的损失函数即用于估量姿态参数预测模型与形态参数预测模型输出的预测参数与预先标注信息之间的差异。
采用SMPL模型进行三维人体重建时，影响重建三维人体的参数包括形态参数和姿态参数，因此，基于单张图片进行三维人体重建的关键点在于对形态参数以及姿态参数的准确预测。相关技术中，进行形态参数和姿态参数预测前，首先需要通过人体信息提取模型提取图片中人体的人体信息，然后将提取到的一系列人体信息输入参数预测模型，最终得到参数预测模型输出的SMPL参数。
采用上述方式进行三维人体重建时,由于SMPL参数的准确性与提取到的人体信息的准确性密切相关,因此需要通过人体信息提取模型提取多维度的人体信息,比如二维关节点、三维关节点、二维人体分割图、三维体素等等,相应的,需要构建复杂度较高的人体信息提取模型。同时,由于输入参数预测模型的参数量较大(即人体信息的信息量较大),因此构建的参数预测模型的复杂度也较高。此外,模型训练过程中,首先需要单独训练人体信息提取模型和参数预测模型,然后对训练得到的模型进行联合训练,进一步提高了模型训练的复杂度,增加了模型训练耗时。
为了避免上述问题，本申请实施例提供的SMPL参数预测模型的训练方法中，分别设计用于预测姿态参数和形态参数的两个神经网络模型（以图片为模型输入），避免了单独训练用于提取人体信息的人体信息提取模型；同时，从人体形态和人体姿态角度设计相应的损失函数，并基于设计的损失函数以及标注信息对两个神经网络模型进行训练，提高神经网络模型的预测准确率，进而提高重建三维人体在人体姿态以及人体形态上的准确性。下面采用示意性的实施例进行说明。
请参考图1,其示出了本申请一个实施例提供的SMPL参数预测模型的训练方法的方法流程图。本实施例以该训练方法应用于服务器为例进行说明,该方法可以包括以下几个步骤:
步骤101,获取样本图片,样本图片中包含人体图像。
在一种可能的实施方式中,服务器基于若干个样本图片集进行模型训练,因此,训练过程中,服务器从样本图片集中获取样本图片,该样本图片集中包含若干预先经过标注的样本图片。
在一个实施例中,每张样本图片对应至少一种类型的标注信息,且不同样本图片集中样本图片对应的标注信息的类型不同。
比如，样本图片集A中样本图片的标注信息包括二维关节点坐标和二维人体轮廓；样本图片集B中样本图片的标注信息包括二维关节点坐标和三维关节点坐标；样本图片集C中样本图片的标注信息包括二维关节点坐标和SMPL参数。
由于预测姿态参数和形态参数时需要利用到图片中不同的信息,因此,获取到样本图片后,服务器分别将样本图片输入姿态参数预测模型和形态参数预测模型中。
在一个实施例中,将样本图片输入姿态/形态参数预测模型前,服务器需要对样本图片进行预处理,使得输入姿态/形态参数预测模型的样本图片符合模型输入要求。其中,预处理方式包括裁剪和尺寸缩放。比如,预处理后样本图片的尺寸为224×224。
步骤102,将样本图片输入姿态参数预测模型,得到姿态预测参数,姿态预测参数是SMPL预测参数中用于指示人体姿态的参数。
在一个实施例中,姿态参数预测模型输出的姿态预测参数为72维参数,用于指示人体24个关节点的旋转向量。
在一种可能的实施方式中，姿态参数预测模型的主干网络结构为残差神经网络（Residual Neural Network，ResNet），比如ResNet50，本申请实施例并不对姿态参数预测模型的具体结构进行限定。示意性的，姿态参数预测模型中各网络层的参数设置如表一所示。

表一：姿态参数预测模型各网络层的参数设置（原文以图片形式给出）
步骤103,将样本图片输入形态参数预测模型,得到形态预测参数,形态预测参数是SMPL预测参数中用于指示人体形态的参数。
在一个实施例中,形态参数预测模型输出的形态预测参数为10维参数,用于指示人体高矮胖瘦、头身比例等10个参数。
在一种可能的实施方式中，形态参数预测模型基于简化的视觉几何组（Visual Geometry Group，VGG）网络构建，本申请实施例并不对形态参数预测模型的具体结构进行限定。示意性的，形态参数预测模型中各网络层的参数设置如表二所示。
表二

| 网络层 | 输出尺寸 | 网络层参数 |
| --- | --- | --- |
| 卷积层1（conv1） | 112×112 | 32个3×3卷积核，步长=2 |
| 卷积层2（conv2） | 56×56 | 64个3×3卷积核，步长=2 |
| 卷积层3（conv3） | 28×28 | 128个3×3卷积核，步长=2 |
| 卷积层4（conv4） | 14×14 | 256个3×3卷积核，步长=2 |
| 卷积层5（conv5） | 7×7 | 236个3×3卷积核，步长=2 |
| 全连接层1（fc1） | 1×1 | 512 |
| 全连接层2（fc2） | 1×1 | 1024 |
| 全连接层3-输出（fc3-output） | 1×1 | 10 |
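表二中各卷积层的输出尺寸可以用标准的卷积输出尺寸公式验证。下面的小段代码按“3×3卷积、步长2、padding 1”推算各层输出边长（padding取值为本示例的假设，原文未给出）：

```python
# 标准卷积输出尺寸公式：out = (in + 2*padding - kernel) // stride + 1
def conv_out(size, kernel=3, stride=2, padding=1):
    return (size + 2 * padding - kernel) // stride + 1

# 从 224×224 的输入出发，依次经过 conv1 ~ conv5
sizes = [224]
for _ in range(5):
    sizes.append(conv_out(sizes[-1]))
# sizes 依次为 224, 112, 56, 28, 14, 7，与表二的输出尺寸列一致
```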
步骤104,根据姿态预测参数和形态预测参数,通过SMPL模型构建人体三维模型。
进一步的,服务器将姿态预测参数和形态预测参数带入SMPL模型中,构建得到人体三维模型,以便后续基于该人体三维模型评估模型的参数预测效果。其中,该人体三维模型包含6890个模型顶点的顶点坐标。
步骤105,根据SMPL预测参数和/或人体三维模型,并结合样本图片的标注信息,计算模型预测损失。
为了衡量预测结果与真实值之间的差异,在一种可能的实施方式中,服务器根据预测结果以及样本图片的标注信息,通过预先构建的损失函数计算模型预测损失。在一个实施例中,该损失函数包括至少一个子损失函数,且不同子损失函数用于根据不同类型的标注信息计算模型预测损失。
在一个实施例中,由于不同样本图片集中样本图片包含的标注信息不同,因此服务器根据样本图片的标注信息,确定采用相应的子损失函数计算模型预测损失。
在一个实施例中,服务器根据标注信息和SMPL预测参数计算模型预测损失,和/或,服务器根据标注信息和人体三维模型计算模型预测损失。
步骤106,根据模型预测损失反向训练姿态参数预测模型和形态参数预测模型。
在一种可能的实施方式中,根据计算得到的模型预测损失,服务器采用梯度下降(Gradient Descent)算法,反向训练姿态参数预测模型和形态参数预测模型(对模型中的参数进行优化),并在梯度小于阈值时停止反向训练。本申请实施例并不对反向训练模型的具体方式进行限定。
在一个实施例中,模型训练过程中采用的学习率为1e-4,且批尺寸(batch_size)为96。
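梯度下降的“按梯度反方向更新参数”这一步可以用一个极简的最小二乘玩具例子示意（与正文的学习率1e-4、批尺寸96无关，此处的学习率与数据均为演示用假设）：

```python
import numpy as np

def gradient_step(w, x, y, lr=0.1):
    """对均方误差损失做一次梯度下降更新。"""
    pred = x @ w
    grad = 2 * x.T @ (pred - y) / len(y)   # MSE 对 w 的梯度
    return w - lr * grad                    # 沿负梯度方向更新

# 玩具数据：真实关系为 y = 2x，期望 w 收敛到 2
x = np.array([[1.0], [2.0]])
y = np.array([2.0, 4.0])
w = np.array([0.0])
for _ in range(200):
    w = gradient_step(w, x, y)
```

训练时即以此方式反复迭代，并在梯度小于阈值时停止。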
综上所述,本申请实施例中,通过将包含人体图像的样本图片分别输入姿态参数预测模型和形态参数预测模型,得到SMPL预测参数中的姿态预测参数和形态预测参数,并基于姿态预测参数和形态预测参数构建人体三维模型,从而基于样本图片的标注信息,根据SMPL预测参数和人体三维模型中的至少一种,计算模型预测损失,进而根据模型预测损失对姿态参数预测模型和形态参数预测模型进行反向训练;采用本申请实施例提供的方法训练模型时,直接将样本图片作为模型输入进行模型训练,无需单独训练提取图片中人体信息的模型,从而降低了模型训练的复杂度,提高了模型训练的效率;同时,根据标注信息、预测参数以及基于预测参数构建的三维人体模型计算模型预测损失,有助于提高模型的训练质量,进而提高预测得到的参数的准确性。
在一种可能的实施方式中，服务器预先定义的损失函数中包含四个子损失函数，分别为SMPL参数损失函数、关节点位置损失函数、人体轮廓损失函数以及正则损失函数。其中，SMPL参数损失函数用于衡量姿态预测参数以及形态预测参数与标注的SMPL参数之间的差异；关节点位置损失函数用于衡量预测出的关节点位置与标注的关节点位置之间的差异；人体轮廓损失函数用于衡量重建三维人体模型的人体轮廓与样本图片中人体轮廓之间的差异；正则损失函数用于约束形态预测参数，避免人体形态过度变形。
相应的,服务器训练模型的过程如图2所示。服务器将样本图片21输入姿态参数预测模型22后,得到姿态参数预测模型22输出的投影参数221以及姿态预测参数222,将样本图片21输入形态参数预测模型23后,得到形态参数预测模型23输出的形态预测参数231。进一步的,基于预测得到姿态预测参数222和形态预测参数231,服务器通过SMPL模型构建人体三维模型24。针对根据预测参数、人体三维模型以及标注信息计算模型预测损失的过程,下面通过示意性的实施例进行说明。
请参考图3,其示出了本申请另一个实施例提供的SMPL参数预测模型的训练方法的方法流程图。本实施例以该训练方法应用于服务器为例进行说明,该方法可以包括以下几个步骤:
步骤301,获取样本图片,样本图片中包含人体图像。
本步骤的实施方式可以参考上述步骤101,本实施例在此不再赘述。
步骤302,将样本图片输入姿态参数预测模型,得到姿态预测参数和投影参数。
由于后续计算模型预测损失时需要应用到二维坐标(比如关节点的二维坐标,人体轮廓的二维坐标),而根据姿态预测参数和形态预测参数构建出的人体三维模型中仅包含模型顶点的三维坐标,因此,通过模型预测姿态参数和形态参数的同时,还需要对样本图片的投影参数进行预测,以便后续利用投影参数将人体三维模型上的点投影到二维图像空间。其中,该投影参数与样本图片的拍摄角度相关。
在实施过程中发现，改变投影参数中的缩放参数 $s$，或者改变形态参数β，均会对人体形态产生影响，导致投影参数和形态参数的预测存在歧义性。为了避免投影参数和形态参数预测的歧义性，在一种可能的实施方式中，服务器通过姿态参数预测模型实现姿态参数以及投影参数预测，此时姿态参数预测模型输出的参数为75维度，其中包含72维度的姿态预测参数θ以及3维度的投影参数 $c=(s,t_x,t_y)$。
步骤303,将样本图片输入形态参数预测模型,得到形态预测参数。
步骤304，根据姿态预测参数和形态预测参数，通过SMPL模型构建人体三维模型。
上述步骤303和304的实施方式可以参考步骤103和104,本实施例在此不再赘述。
步骤305,根据SMPL预测参数,以及标注信息中的SMPL标注参数,计算第一模型预测损失。
在一个实施例中,当样本图片的标注信息中包含SMPL标注参数(包括姿态标注参数和形态标注参数)时,服务器即根据SMPL预测参数(包括姿态预测参数和形态预测参数)和SMPL标注参数,通过SMPL参数损失函数计算第一模型预测损失。在一种可能的实施方式中,如图4所示,本步骤包括如下步骤。
步骤305A,计算姿态标注参数与姿态预测参数之间的第一欧式距离。
本实施例中,服务器通过计算姿态标注参数与姿态预测参数之间的第一欧式距离(72维向量之间的欧式距离),进而根据第一欧式距离评估姿态参数预测的准确性。其中,第一欧式距离越小,表明姿态参数预测的准确性越高。
示意性的，第一欧式距离 $d_1=\|\theta-\hat{\theta}\|_2$，其中，θ为姿态预测参数，$\hat{\theta}$ 为姿态标注参数。
步骤305B,计算形态标注参数与形态预测参数之间的第二欧式距离。
与计算第一欧式距离相似的,本实施例中,服务器通过计算形态标注参数与形态预测参数之间的第二欧式距离(10维向量之间的欧式距离),进而根据第二欧式距离评估形态参数预测的准确性。其中,第二欧式距离越小,表明形态参数预测的准确性越高。
示意性的，第二欧式距离 $d_2=\|\beta-\hat{\beta}\|_2$，其中，β为形态预测参数，$\hat{\beta}$ 为形态标注参数。
步骤305C,根据第一欧式距离和第二欧式距离确定第一模型预测损失。
进一步的，根据计算到的第一欧式距离和第二欧式距离，服务器计算第一模型预测损失（即SMPL参数预测损失）。在一个实施例中，第一模型预测损失 $L_{smpl}$ 表示为：

$L_{smpl}=\lambda_p\big(d_1^2+d_2^2\big)=\lambda_p\big(\|\theta-\hat{\theta}\|_2^2+\|\beta-\hat{\beta}\|_2^2\big)$

其中，$\lambda_p$ 为参数损失权重。比如，$\lambda_p$ 为60。

示意性的，如图2所示，服务器根据姿态预测参数222以及形态预测参数231计算得到第一模型预测损失 $L_{smpl}$。
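SMPL参数损失的计算可以按一种常见写法（姿态、形态参数欧式距离平方的加权和，具体幂次与加权形式为本示例的假设）用numpy示意如下：

```python
import numpy as np

def smpl_param_loss(theta_pred, theta_gt, beta_pred, beta_gt, lam_p=60.0):
    """第一模型预测损失的一种示意写法：lam_p * (||θ-θ̂||² + ||β-β̂||²)。"""
    d1 = np.linalg.norm(theta_pred - theta_gt)   # 姿态参数的欧式距离
    d2 = np.linalg.norm(beta_pred - beta_gt)     # 形态参数的欧式距离
    return lam_p * (d1 ** 2 + d2 ** 2)

# 用法示例：72 维姿态参数与 10 维形态参数
loss = smpl_param_loss(np.ones(72), np.zeros(72), np.ones(10), np.zeros(10))
```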
步骤306,根据人体三维模型中关节点的关节点预测坐标,以及标注信息中关节点的关节点标注坐标,计算第二模型预测损失。
在一个实施例中,当样本图片的标注信息中包含关节点标注坐标(包括二维关节点标注坐标和/或三维关节点标注坐标)时,服务器首先确定人体三维模型中关节点的关节点预测坐标,从而根据关节点预测坐标和关节点标注坐标,通过关节点位置损失函数计算第二模型预测损失。
在一种可能的实施方式中,如图5所示,本步骤包括如下步骤。
步骤306A,计算人体三维模型中关节点的三维关节点预测坐标和三维关节点标注坐标之间的第三欧式距离。
在一种可能的实施方式中,服务器选取24个关节点中的14个关节点作为目标关节点,并计算14个目标关节点的三维关节点预测坐标和三维关节点标注坐标之间的第三欧式距离。
关于计算人体三维模型中关节点的三维关节点预测坐标的方式,在一个实施例中,服务器根据人体三维模型中关节点周侧模型顶点的顶点坐标,确定人体三维模型中关节点的三维关节点预测坐标。在一种可能的实现方式中,关节点的三维关节点预测坐标为关节点周侧模型顶点的顶点坐标的平均值。
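“取关节点周侧模型顶点坐标的平均值”这一步可以用numpy示意如下（顶点到关节点的划分为随意构造的玩具数据）：

```python
import numpy as np

# 玩具数据：4 个模型顶点及其所属关节点（真实模型中顶点数为 6890）
vertices = np.array([[0., 0., 0.], [2., 0., 0.], [0., 4., 0.], [0., 0., 6.]])
joint_of_vertex = np.array([0, 0, 1, 1])       # 每个顶点所属的关节点（假设的划分）

def joint_coords(vertices, joint_of_vertex, num_joints):
    """每个关节点的三维预测坐标 = 其周侧顶点坐标的平均值。"""
    return np.stack([vertices[joint_of_vertex == j].mean(axis=0)
                     for j in range(num_joints)])

joints_3d = joint_coords(vertices, joint_of_vertex, 2)
```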
示意性的,如图2所示,服务器根据人体三维模型24生成三维关节点图25,该三维关节点图25中包含各个关节点的三维关节点预测坐标。
计算得到各个关节点的三维关节点预测坐标后，服务器计算三维关节点预测坐标和三维关节点标注坐标（对应同一关节点）之间的第三欧式距离。示意性的，第三欧式距离 $d_3=\sum_{j\in\mathcal{J}}\|\hat{j}_{3D}-j_{3D}\|_2$，其中，$\mathcal{J}$ 是关节点的集合，$\hat{j}_{3D}$ 为三维人体模型中关节点的三维关节点预测坐标，$j_{3D}$ 为关节点的三维关节点标注坐标。
步骤306B,计算人体三维模型中关节点的二维关节点预测坐标和二维关节点标注坐标之间的第四欧式距离。
除了衡量三维关节点坐标的准确度之外,服务器还可以进一步衡量二维关节点坐标的准确度。在一种可能的实施方式中,本步骤包括如下步骤:
一、根据人体三维模型中关节点周侧模型顶点的顶点坐标,确定人体三维模型中关节点的三维关节点预测坐标。
由于二维关节点可以由三维关节点经过投影变换得到,因此,在计算二维关节点预测坐标前,服务器首先确定关节点的三维关节点预测坐标。其中,确定三维关节点预测坐标的过程可以参考上述步骤306A,本步骤在此不再赘述。
二、根据投影参数,对三维关节点预测坐标进行投影处理,得到二维关节点预测坐标。
由于姿态参数预测模型在输出姿态预测参数的同时,还输出了投影参数,因此,服务器可以根据投影参数对三维关节点预测坐标进行投影处理,即将三维关节点投影到二维图像空间,从而得到二维关节点的二维关节点预测坐标。
比如，对于三维关节点预测坐标 $\hat{j}_{3D}$，对其进行投影处理得到的二维关节点预测坐标即为 $proj(\hat{j}_{3D})$。
示意性的,如图2所示,服务器根据三维关节点图25和投影参数221生成二维关节点图26,该二维关节点图26中包含各个关节点的二维关节点预测坐标。
三、计算二维关节点预测坐标和二维关节点标注坐标之间的第四欧式距离。
计算得到各个关节点的二维关节点预测坐标后，服务器计算二维关节点预测坐标和二维关节点标注坐标（对应同一关节点）之间的第四欧式距离。示意性的，第四欧式距离 $d_4=\sum_{j\in\mathcal{J}}\|proj(\hat{j}_{3D})-j_{2D}\|_2$，其中，$\mathcal{J}$ 是关节点的集合，$\hat{j}_{3D}$ 为三维人体模型中关节点的三维关节点预测坐标，proj为投影处理，$j_{2D}$ 为关节点的二维关节点标注坐标。
步骤306C,根据第三欧式距离和第四欧式距离计算第二模型预测损失。
进一步的，根据计算到的第三欧式距离和第四欧式距离，服务器计算第二模型预测损失（即关节点位置预测损失）。在一个实施例中，第二模型预测损失 $L_{joint}$ 表示为：

$L_{joint}=\lambda_{3D}\,d_3+\lambda_{2D}\,d_4$

其中，$\lambda_{3D}$ 为三维关节点位置损失权重，$\lambda_{2D}$ 为二维关节点位置损失权重。比如，$\lambda_{3D}$ 和 $\lambda_{2D}$ 均为60.0。

示意性的，如图2所示，服务器根据三维关节点图25和二维关节点图26计算得到第二模型预测损失 $L_{joint}$。
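三维与二维关节点距离加权求和这一步可以用numpy示意如下（权重取值沿用正文的60.0，距离的求和方式为本示例的假设）：

```python
import numpy as np

def joint_loss(j3d_pred, j3d_gt, j2d_pred, j2d_gt, lam3d=60.0, lam2d=60.0):
    """第二模型预测损失的示意实现：三维、二维关节点欧式距离之和的加权和。"""
    d3 = np.linalg.norm(j3d_pred - j3d_gt, axis=-1).sum()   # 逐关节点三维距离求和
    d4 = np.linalg.norm(j2d_pred - j2d_gt, axis=-1).sum()   # 逐关节点二维距离求和
    return lam3d * d3 + lam2d * d4

# 用法示例：14 个目标关节点，三维坐标完全正确、二维坐标各偏移 (1, 1)
loss = joint_loss(np.zeros((14, 3)), np.zeros((14, 3)),
                  np.ones((14, 2)), np.zeros((14, 2)))
```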
如图6所示,在一个完整的第二模型预测损失计算过程中,服务器首先根据人体三维模型61确定关节点的三维关节点预测坐标62,从而根据三维关节点预测坐标62和标注信息中的三维关节点标注坐标63计算第三欧式距离64;同时,服务器根据投影参数65对三维关节点预测坐标62进行投影处理,得到关节点对应的二维关节点预测坐标66,从而根据二维关节点预测坐标66和标注信息中的二维关节点标注坐标67计算第四欧式距离68。最终,服务器根据第三欧式距离64和第四欧式距离68确定出第二模型预测损失69。
需要说明的是,当样本图片对应的标注信息中仅包含三维关节点标注坐标或二维关节点标注坐标中的一项时,服务器可以仅根据第三欧式距离或第四欧式距离确定第二模型预测损失,本实施例对此不做限定。
步骤307,根据人体三维模型的预测二维人体轮廓,以及标注信息中的标注二维人体轮廓,计算第三模型预测损失。
当标注信息中包含标注二维人体轮廓时,为了提高人体形态预测的准确性,服务器可以根据构建的人体三维模型进一步生成预测二维人体轮廓,从而通过计算人体轮廓之间的损失,确定人体形态预测的准确性。
在一个实施例中,人体轮廓用于指示图片中的人体区域,可以采用黑白图像进行表示,其中黑白图像中的白色区域即为人体区域。
示意性的，如图2所示，服务器根据人体三维模型24生成预测二维人体轮廓27，并根据标注二维人体轮廓28计算得到第三模型预测损失 $L_{sil}$。
在一种可能的实施方式中,如图7所示,本步骤可以包括如下步骤:
步骤307A,根据投影参数,将人体三维模型中的模型顶点投影到二维空间,并生成预测二维人体轮廓。
在人体姿态和人体形态预测准确的情况下,将人体三维模型投影到二维图像空间后,得到的二维人体轮廓应该与样本图片中的二维人体轮廓重合,因此,服务器可以基于二维人体轮廓的差异性衡量人体姿态和人体形态的预测准确性。
在一个实施例中,对于人体三维模型中的各个模型顶点,服务器根据投影参数,通过投影函数将各个模型顶点投影到二维空间,从而生成包含预测二维人体轮廓的二维图像。
步骤307B,根据预测二维人体轮廓和标注二维人体轮廓,计算第一轮廓损失和第二轮廓损失。
其中，第一轮廓损失又称为正向轮廓损失，用于指示预测二维人体轮廓到标注二维人体轮廓的损失；第二轮廓损失又称为反向轮廓损失，用于指示标注二维人体轮廓到预测二维人体轮廓的损失。
在一种可能的实施方式中，服务器计算轮廓损失可以包括如下步骤。
一、计算预测二维人体轮廓中轮廓点到标注二维人体轮廓的第一最短距离;根据预测二维人体轮廓中各个轮廓点对应的第一最短距离,计算第一轮廓损失。
在一种可能的实施方式中,对于预测二维人体轮廓中的各个轮廓点,终端计算该轮廓点到标注二维人体轮廓的第一最短距离,并将该各个轮廓点对应的第一最短距离进行累加,从而得到第一轮廓损失。
然而，在实施过程中发现，对于人体三维模型的遮挡区域（被其他物体所遮挡），由于遮挡区域不可视，若不考虑可视性而直接根据第一最短距离计算第一轮廓损失，将造成第一轮廓损失偏大。因此，为了提高第一轮廓损失的准确性，在一种可能的实施方式中，根据预测二维人体轮廓中各个轮廓点对应的第一最短距离，计算第一轮廓损失时包括如下步骤：
1、根据人体三维模型中各个模型顶点所属关节点的可见性,确定各个模型顶点对应轮廓点的第一权重。
在一个实施例中，对于人体三维模型中的各个模型顶点υ，服务器自动将模型顶点υ划分到与其距离最近的关节点 $j_\upsilon$。其中，服务器将模型顶点划分至14个关节点。对于预测二维人体轮廓中轮廓点，计算得到该轮廓点对应的第一最短距离后，服务器检测该轮廓点对应模型顶点所属关节点的可见性，若模型顶点所属关节点可见，服务器确定该轮廓点的第一权重为1；若模型顶点所属关节点不可见时，服务器确定该轮廓点的第一权重为0。其中，确定模型顶点对应轮廓点的第一权重可以采用如下公式：

$\omega_\upsilon=\begin{cases}1, & j_\upsilon\ \text{可见}\\ 0, & j_\upsilon\ \text{不可见}\end{cases}$

其中，$\omega_\upsilon$ 为模型顶点υ对应轮廓点的第一权重，$j_\upsilon$ 为模型顶点υ所属的关节点。
当然,服务器也可以先检测轮廓点对应模型顶点所属关节点的可见性,并在关节点的不可见时,停止计算该模型顶点对应轮廓点到标注二维人体轮廓的最短距离,从而减少计算量。
2、根据预测二维人体轮廓中各个轮廓点对应的第一最短距离以及第一权重,计算第一轮廓损失。
相应的,服务器根据第一权重对各个轮廓点对应的第一最短距离进行修正,从而对修正后的第一最短距离进行累加,得到第一轮廓损失。
此外,在实施过程中还发现,关节点预测不准确同样会影响到投影生成的预测二维人体轮廓。因此,为了降低关节点预测不准确对第一轮廓损失造成的影响,在一种可能的实施方式中,根据预测二维人体轮廓中各个轮廓点对应的第一最短距离,计算第一轮廓损失时还可以包括如下步骤:
1、确定人体三维模型中各个模型顶点所属关节点的关节点预测坐标。
在一种可能的实施方式中，对于人体三维模型中的各个模型顶点υ，服务器自动将模型顶点υ划分到与其距离最近的关节点 $j_\upsilon$，并通过投影参数将关节点 $j_\upsilon$ 投影到二维空间，得到关节点 $j_\upsilon$ 的（二维）关节点预测坐标。其中，模型顶点υ所属关节点 $j_\upsilon$ 的关节点预测坐标记为 $\hat{j}_{2D}$。
2、根据关节点预测坐标与关节点标注坐标之间的第五欧式距离，确定各个模型顶点对应轮廓点的第二权重，第二权重与第五欧式距离之间呈负相关关系。

确定出关节点的关节点预测坐标后，服务器通过计算关节点预测坐标与关节点标注坐标（同一关节点）之间的第五欧式距离 $d_5=\|\hat{j}_{2D}-j_{2D}\|_2$，确定关节点预测的准确性。其中，$j_{2D}$ 为关节点 $j_\upsilon$ 的（二维）关节点标注坐标。
进一步的,服务器根据第五欧式距离,确定属于该关节点的模型顶点对应轮廓点的第二权重,其中,第二权重为正值,且第二权重与第五欧式距离之间呈负相关关系。
在一种可能的实施方式中，确定模型顶点对应轮廓点的第二权重可以采用如下公式：

$\omega_p=e^{-\|\hat{j}_{2D}-j_{2D}\|_2}$

其中，当关节点坐标预测越准确，$-\|\hat{j}_{2D}-j_{2D}\|_2$ 趋向于0，相应的，第二权重 $\omega_p$ 趋向为1；当关节点坐标预测越不准确，$-\|\hat{j}_{2D}-j_{2D}\|_2$ 趋向于 $-\infty$，相应的，第二权重 $\omega_p$ 趋向为0。
3、根据预测二维人体轮廓中各个轮廓点对应的第一最短距离以及第二权重,计算第一轮廓损失。
相应的,服务器根据第二权重对各个轮廓点对应的第一最短距离进行修正,从而对修正后的第一最短距离进行累加,得到第一轮廓损失。
在一种可能的实施方式中，服务器同时计算第一权重和第二权重，从而根据预测二维人体轮廓中各个轮廓点对应的第一最短距离、第一权重以及第二权重，计算第一轮廓损失，相应的，第一轮廓损失可以表示为：

$L_{fw}=\sum_{p\in\hat{S}}\omega_\upsilon\,\omega_p\,\|p-q_p\|_2$

其中，$\hat{S}$ 为预测二维人体轮廓，$q_p$ 为标注二维人体轮廓中与预测二维人体轮廓中轮廓点p距离最近的点。
二、计算标注二维人体轮廓中轮廓点到预测二维人体轮廓的第二最短距离；根据标注二维人体轮廓中各个轮廓点对应的第二最短距离，计算第二轮廓损失。

在一种可能的实施方式中，对于标注二维人体轮廓中的各个轮廓点，终端计算该轮廓点到预测二维人体轮廓的第二最短距离，并将该各个轮廓点对应的第二最短距离进行累加，从而得到第二轮廓损失，相应的，第二轮廓损失可以表示为：

$L_{bw}=\sum_{q\in S}\|q-p_q\|_2$

其中，$S$ 为标注二维人体轮廓，$p_q$ 为预测二维人体轮廓中与标注二维人体轮廓中轮廓点q距离最近的点。
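上述正向、反向轮廓损失的“点到对方轮廓最短距离求和”结构，可以用一段numpy草图示意（轮廓以离散点集表示，可选的 weights 参数对应正文的可见性/准确性权重，其余细节为演示用假设）：

```python
import numpy as np

def contour_losses(pred_pts, gt_pts, weights=None):
    """正向/反向轮廓损失的示意实现：各点到对方轮廓的最短距离之和。"""
    # 两组轮廓点之间的两两欧式距离矩阵，形状 (len(pred), len(gt))
    d = np.linalg.norm(pred_pts[:, None, :] - gt_pts[None, :, :], axis=-1)
    fwd = d.min(axis=1)                # 预测轮廓点 → 标注轮廓的最短距离
    if weights is not None:
        fwd = fwd * weights            # 可选：按可见性/准确性权重修正
    l_fwd = fwd.sum()
    l_bwd = d.min(axis=0).sum()        # 标注轮廓点 → 预测轮廓的最短距离
    return l_fwd, l_bwd

# 用法示例：两个由两点构成的玩具“轮廓”
pred = np.array([[0.0, 0.0], [1.0, 0.0]])
gt = np.array([[0.0, 1.0], [1.0, 0.0]])
l_fwd, l_bwd = contour_losses(pred, gt)
```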
步骤307C,根据第一轮廓损失和第二轮廓损失确定第三模型预测损失。
在一种可能的实施方式中，服务器根据第一轮廓损失及其对应的权重，以及第二轮廓损失及其对应的权重，计算得到第三模型预测损失。其中，第三模型预测损失可以表示为：

$L_{sil}=\lambda_{fw}L_{fw}+\lambda_{bw}L_{bw}$

其中，$\lambda_{fw}$ 为第一轮廓损失对应的权重，$\lambda_{bw}$ 为第二轮廓损失对应的权重。比如，$\lambda_{fw}$ 和 $\lambda_{bw}$ 均为3.0。
如图8所示，在一个完整的第三模型预测损失计算过程中，服务器首先根据人体三维模型81和投影参数82，生成预测二维人体轮廓83，然后根据预测二维人体轮廓83和标注二维人体轮廓84计算得到第一轮廓损失85和第二轮廓损失86，最终根据第一轮廓损失85和第二轮廓损失86确定第三模型预测损失87。其中，计算第一轮廓损失85过程中，服务器计算预测二维人体轮廓83上的轮廓点到标注二维人体轮廓84的第一最短距离851的同时，根据轮廓点所属关节点的可见性确定第一权重852，并根据轮廓点所属关节点的预测准确性确定第二权重853，从而根据第一最短距离851、第一权重852和第二权重853计算第一轮廓损失85；计算第二轮廓损失86过程中，服务器计算标注二维人体轮廓84上的轮廓点到预测二维人体轮廓83的第二最短距离861，从而根据第二最短距离861确定第二轮廓损失86。
步骤308,对形态预测参数进行正则处理,得到第四模型预测损失。
为了避免人体形态过度变形，在一种可能的实施方式中，服务器对形态预测参数进行L2正则处理，从而得到第四模型预测损失 $L_{reg}=\lambda_{reg}\|\beta\|_2^2$。其中，$\lambda_{reg}$ 为正则损失权重。比如，$\lambda_{reg}$ 为1.0。

示意性的，如图2所示，服务器根据形态预测参数231得到第四模型预测损失 $L_{reg}$。
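形态参数的L2正则项可以用numpy示意如下（此处取平方范数的写法，为一种常见的示意形式）：

```python
import numpy as np

def shape_reg_loss(beta, lam_reg=1.0):
    """第四模型预测损失：形态预测参数的 L2 正则项 lam_reg * ||β||²。"""
    return lam_reg * np.sum(beta ** 2)

# 用法示例：3 维玩具形态参数（真实形态参数为 10 维）
loss = shape_reg_loss(np.array([1.0, -2.0, 2.0]))
```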
步骤309,根据模型预测损失反向训练姿态参数预测模型和形态参数预测模型。
在一种可能的实施方式中,服务器根据上述步骤中计算得到的第一、第二、第三以及第四模型预测损失,对姿态参数预测模型和形态参数预测模型进行反向训练。其中,反向训练的过程可以参考上述步骤106,本实施例在此不再赘述。
需要说明的是,由于不同样本图片集包含的标注信息的类型不同,因此,服务器可以根据上述模型预测损失中的一部分进行反向训练,本实施例对此不做限定。
本实施例中,服务器通过引入人体轮廓项约束,根据投影参数将重建的人体三维模型的模型顶点投射到二维空间,得到预测二维人体轮廓,并利用预测二维人体轮廓与标注二维人体轮廓之间的轮廓损失对模型进行反向训练,有利于提高形态参数预测的准确性,进而提高重建的三维人体模型在人体形态上的准确性。
并且，在计算预测二维人体轮廓与标注二维人体轮廓之间的轮廓损失时，充分考虑关节点的可见性以及关节点坐标预测准确性对轮廓损失的影响，进一步提高了计算得到的轮廓损失的准确性。
通过上述实施例提供的训练方法完成模型训练后,即可利用训练得到的模型对单张图像中人体进行三维重建。在一种可能的应用场景下,如图9所示,终端920将包含人体图像的图片上传至服务器940。服务器940接收到图片后,通过姿态参数预测模型预测图片中人体的姿态参数,并通过形态参数预测模型预测图片中人体的形态参数,从而将包含姿态参数和形态参数的SMPL参数发送给终端920。终端920接收到SMPL参数后,即通过SMPL模型重建三维人体模型,并进行显示。当然,若终端920中存储有姿态参数预测模型和形态参数预测模型,终端920也可以在本地完成SMPL参数预测,而无需借助服务器940。
在其他可能的应用场景下,VR设备通过摄像头采集到包含玩家人体的图像后,通过内置的姿态参数预测模型预测玩家的姿态参数,并通过形态参数预测模型预测玩家的形态参数,从而根据姿态参数和形态参数重建玩家三维人体模型,并将玩家三维人体模型实时显示在VR画面中,从而增加玩家使用VR设备时的沉浸感。
当然,上述训练得到的模型还可以用于其它基于单张图片(包含人体)或视频(视频帧中包含连续的人体)重建三维人体模型的应用场景,本申请实施例对此不做限定。
请参考图10,其示出了本申请一个实施例提供的三维人体重建方法的方法流程图。本实施例以该方法应用于服务器为例进行说明,该方法可以包括以下几个步骤:
步骤1001,获取目标图片,目标图片中包含人体图像。
在一个实施例中,该目标图片是终端上传的单张图片,或者,目标图片是从终端上传的视频中截取的视频帧。
在一个实施例中,将目标图片输入姿态/形态参数预测模型前,服务器需要对目标图片进行预处理,使得输入姿态/形态参数预测模型的目标图片符合模型输入要求。其中,预处理方式包括裁剪和尺寸缩放。比如,预处理后目标图片的尺寸为224×224。
步骤1002,将目标图片输入姿态参数预测模型,得到姿态预测参数,姿态预测参数是SMPL预测参数中用于指示人体姿态的参数。
在一个实施例中，服务器将目标图片输入姿态参数预测模型后，姿态参数预测模型输出72维的姿态预测参数。需要说明的是，当姿态参数预测模型输出75维的参数时，服务器将其中的72维参数确定为姿态预测参数，并将剩余的3维参数确定为投影参数。
步骤1003,将目标图片输入形态参数预测模型,得到形态预测参数,形态预测参数是SMPL预测参数中用于指示人体形态的参数。
在一个实施例中,服务器将目标图片输入形态参数预测模型后,形态参数预测模型输出10维的形态预测参数。
步骤1004，根据姿态预测参数和形态预测参数，通过SMPL模型构建目标人体三维模型。
服务器将模型输出的形态预测参数和姿态预测参数输入SMPL模型,从而构建得到包含6890个模型顶点的目标人体三维模型。在一个实施例中,服务器将目标人体三维模型的模型数据发送给终端,供终端进行渲染显示。
在一个实施例中,当终端具有三维人体模型重建功能时,服务器将模型输出的形态预测参数和姿态预测参数发送给终端,由终端进行人体三维模型重建并显示。
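上述推断流程可以抽象成如下的Python草图。两个参数预测模型与SMPL模型在此均用随意构造的占位函数代替（真实系统中为上文训练得到的神经网络与SMPL模型）：

```python
import numpy as np

# 占位的姿态参数预测模型：返回 72 维姿态参数与 3 维投影参数（演示用假设）
def pose_model(image):
    return np.zeros(72), (1.0, 0.0, 0.0)

# 占位的形态参数预测模型：返回 10 维形态参数（演示用假设）
def shape_model(image):
    return np.zeros(10)

def reconstruct(image, smpl):
    """目标图片 → 两个预测模型 → SMPL 模型 → 目标人体三维模型顶点。"""
    theta, proj = pose_model(image)
    beta = shape_model(image)
    return smpl(beta, theta)

# 用法示例：SMPL 模型此处用返回 6890×3 零顶点的占位函数代替
vertices = reconstruct(np.zeros((224, 224, 3)),
                       smpl=lambda beta, theta: np.zeros((6890, 3)))
```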
综上所述,本申请实施例中,通过将包含人体图像的样本图片分别输入姿态参数预测模型和形态参数预测模型,得到姿态预测参数和形态预测参数,并基于姿态预测参数和形态预测参数构建人体三维模型,从而根据样本图片的标注信息,以及姿态预测参数、形态预测参数和人体三维模型中的至少一种,计算模型预测损失,进而根据模型预测损失对姿态参数预测模型和形态参数预测模型进行反向训练;采用本申请实施例提供的方法训练模型时,直接将样本图片作为模型输入进行模型训练,无需单独训练提取图片中人体信息的模型,从而降低了模型训练的复杂度,提高了模型训练的效率;同时,根据标注信息、预测参数以及基于预测参数构建的三维人体模型计算模型预测损失,有助于提高模型的训练质量,进而提高预测得到的参数的准确性。
利用公开数据集Human3.6M,对本申请提供方案以及世界领先的人体网格恢复(Human Mesh Recovery,HMR)方案进行测试,得到的人体三维重建结果如图11所示;利用公开数据集UP(包括UP-3D和UP-S1h),对本申请提供方案以及世界领先的人体网格恢复(Human Mesh Recovery,HMR)方案进行测试,得到的人体三维重建结果如图12所示。
同时,将图11和图12所示的人体三维重建结果与原图中的人体图像进行对比分析,得到的分析结果如表三所示。
表三（原文以图片形式给出）
其中，准确率用于衡量重建人体轮廓与原图人体轮廓的契合度，F1得分用于指示结果的准确率和召回率，普氏分析平均关节位置误差（Procrustes Analysis - Mean Per Joint Position Error，PA-MPJPE）用于指示关节点位置的预测误差。
从图11、图12以及表三中的分析数据可以看出,相较于HMR方案,采用本申请提供的方案进行人体三维重建时,重建结果的准确率以及召回率均有所提高,且重建结果与原图中人体图像的契合度更高,关节点位置误差更小,重建效果达到了世界领先水平。
下述为本申请装置实施例,可以用于执行本申请方法实施例。对于本申请装置实施例中未披露的细节,请参照本申请方法实施例。
请参考图13,其示出了本申请一个实施例提供的SMPL参数预测模型的训练装置的框图。该装置具有执行上述方法示例的功能,功能可以由硬件实现,也可以由硬件执行相应的软件实现。该装置可以包括:
第一获取模块1310,用于获取样本图片,所述样本图片中包含人体图像;
第一预测模块1320,用于将所述样本图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
第二预测模块1330,用于将所述样本图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;
第一构建模块1340,用于根据所述姿态预测参数和所述形态预测参数,通过SMPL模型构建人体三维模型;
损失计算模块1350,用于根据所述SMPL预测参数和/或所述人体三维模型,并结合所述样本图片的标注信息,计算模型预测损失;
训练模块1360,用于根据所述模型预测损失反向训练所述姿态参数预测模型和所述形态参数预测模型。
在一个实施例中,所述损失计算模块1350,包括:
第一计算单元，用于根据所述SMPL预测参数，以及所述标注信息中的SMPL标注参数，计算第一模型预测损失，所述SMPL标注参数中包括姿态标注参数和形态标注参数；
和/或,
第二计算单元,用于根据所述人体三维模型中关节点的关节点预测坐标,以及所述标注信息中关节点的关节点标注坐标,计算第二模型预测损失;
和/或,
第三计算单元,用于根据所述人体三维模型的预测二维人体轮廓,以及所述标注信息中的标注二维人体轮廓,计算第三模型预测损失。
在一个实施例中,所述第一计算单元,用于:
计算所述姿态标注参数与所述姿态预测参数之间的第一欧式距离;
计算所述形态标注参数与所述形态预测参数之间的第二欧式距离;
根据所述第一欧式距离和所述第二欧式距离确定所述第一模型预测损失。
在一个实施例中,所述关节点标注坐标中包含三维关节点标注坐标和/或二维关节点标注坐标;
所述第二计算单元,用于:
计算所述人体三维模型中关节点的三维关节点预测坐标和所述三维关节点标注坐标之间的第三欧式距离;
计算所述人体三维模型中关节点的二维关节点预测坐标和所述二维关节点标注坐标之间的第四欧式距离;
根据所述第三欧式距离和/或所述第四欧式距离计算所述第二模型预测损失。
在一个实施例中,所述第二计算单元,还用于:
根据所述人体三维模型中关节点周侧模型顶点的顶点坐标,确定所述人体三维模型中关节点的所述三维关节点预测坐标;
计算所述三维关节点预测坐标和所述三维关节点标注坐标之间的所述第三欧式距离。
在一个实施例中,所述姿态参数预测模型还用于根据输入的所述样本图片输出投影参数,所述投影参数用于将三维空间的点投影到二维空间;
所述第二计算单元,还用于:
根据所述人体三维模型中关节点周侧模型顶点的顶点坐标，确定所述人体三维模型中关节点的所述三维关节点预测坐标；
根据所述投影参数,对所述三维关节点预测坐标进行投影处理,得到所述二维关节点预测坐标;
计算所述二维关节点预测坐标和所述二维关节点标注坐标之间的所述第四欧式距离。
在一个实施例中,所述姿态参数预测模型还用于根据输入的所述样本图片输出投影参数,所述投影参数用于将三维空间的点投影到二维空间;
所述第三计算单元,用于:
根据所述投影参数,将所述人体三维模型中的模型顶点投影到二维空间,并生成所述预测二维人体轮廓;
根据所述预测二维人体轮廓和所述标注二维人体轮廓,计算第一轮廓损失和第二轮廓损失;
根据所述第一轮廓损失和所述第二轮廓损失确定所述第三模型预测损失。
在一个实施例中,所述第三计算单元,用于:
计算所述预测二维人体轮廓中轮廓点到所述标注二维人体轮廓的第一最短距离;根据所述预测二维人体轮廓中各个轮廓点对应的所述第一最短距离,计算所述第一轮廓损失;
计算所述标注二维人体轮廓中轮廓点到所述预测二维人体轮廓的第二最短距离;根据所述标注二维人体轮廓中各个轮廓点对应的所述第二最短距离,计算所述第二轮廓损失。
在一个实施例中,所述第三计算单元,用于:
根据所述人体三维模型中各个模型顶点所属关节点的可见性,确定各个模型顶点对应轮廓点的第一权重;
根据所述预测二维人体轮廓中各个轮廓点对应的所述第一最短距离以及所述第一权重,计算所述第一轮廓损失;
其中,当模型顶点所属关节点可见时,所述模型顶点对应轮廓点的所述第一权重为1,当模型顶点所属关节点不可见时,所述模型顶点对应轮廓点的所述第一权重为0。
在一个实施例中，所述第三计算单元，用于：
确定所述人体三维模型中各个模型顶点所属关节点的所述关节点预测坐标;
根据所述关节点预测坐标与所述关节点标注坐标之间的第五欧式距离,确定各个模型顶点对应轮廓点的第二权重,所述第二权重与所述第五欧式距离之间呈负相关关系;
根据所述预测二维人体轮廓中各个轮廓点对应的所述第一最短距离以及所述第二权重,计算所述第一轮廓损失。
在一个实施例中,所述装置还包括:
正则损失模块,用于对所述形态预测参数进行正则处理,得到第四模型预测损失;
所述训练模块1360,还用于:
根据所述第四模型预测损失反向训练所述姿态参数预测模型和所述形态参数预测模型。
请参考图14,其示出了本申请一个实施例提供的三维人体重建装置的框图。该装置具有执行上述方法示例的功能,功能可以由硬件实现,也可以由硬件执行相应的软件实现。该装置可以包括:
第二获取模块1410,用于获取目标图片,所述目标图片中包含人体图像;
第三预测模块1420,用于将所述目标图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
第四预测模块1430,用于将所述目标图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;
第二构建模块1440,用于根据所述姿态预测参数和所述形态预测参数,通过SMPL模型构建目标人体三维模型。
请参考图15,其示出了本申请一个实施例提供的计算机设备的结构示意图。该计算机设备用于实施上述实施例提供的SMPL参数预测模型的训练方法,或,三维人体重建方法。具体来讲:
所述计算机设备1500包括中央处理单元（CPU）1501、包括随机存取存储器（RAM）1502和只读存储器（ROM）1503的系统存储器1504，以及连接系统存储器1504和中央处理单元1501的系统总线1505。所述计算机设备1500还包括帮助计算机内的各个器件之间传输信息的基本输入/输出系统（I/O系统）1506，和用于存储操作系统1513、应用程序1514和其他程序模块1515的大容量存储设备1507。
所述基本输入/输出系统1506包括有用于显示信息的显示器1508和用于用户输入信息的诸如鼠标、键盘之类的输入设备1509。其中所述显示器1508和输入设备1509都通过连接到系统总线1505的输入输出控制器1510连接到中央处理单元1501。所述基本输入/输出系统1506还可以包括输入输出控制器1510以用于接收和处理来自键盘、鼠标、或电子触控笔等多个其他设备的输入。类似地,输入输出控制器1510还提供输出到显示屏、打印机或其他类型的输出设备。
所述大容量存储设备1507通过连接到系统总线1505的大容量存储控制器(未示出)连接到中央处理单元1501。所述大容量存储设备1507及其相关联的计算机可读介质为计算机设备1500提供非易失性存储。也就是说,所述大容量存储设备1507可以包括诸如硬盘或者CD-ROM驱动器之类的计算机可读介质(未示出)。
不失一般性，所述计算机可读介质可以包括计算机存储介质和通信介质。计算机存储介质包括以用于存储诸如计算机可读指令、数据结构、程序模块或其他数据等信息的任何方法或技术实现的易失性和非易失性、可移动和不可移动介质。计算机存储介质包括RAM、ROM、EPROM、EEPROM、闪存或其他固态存储技术，CD-ROM、DVD或其他光学存储、磁带盒、磁带、磁盘存储或其他磁性存储设备。当然，本领域技术人员可知所述计算机存储介质不局限于上述几种。上述的系统存储器1504和大容量存储设备1507可以统称为存储器。
根据本申请的各种实施例，所述计算机设备1500还可以通过诸如因特网等网络连接到网络上的远程计算机运行。也即计算机设备1500可以通过连接在所述系统总线1505上的网络接口单元1511连接到网络1512，或者说，也可以使用网络接口单元1511来连接到其他类型的网络或远程计算机系统。
所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、至少一段程序、代码集或指令集经配置以由一个或者一个以上处理器执行,以实现上述SMPL参数预测模型的训练方法中各个步骤的功能,或者,实现上述三维人体重建方法中各个步骤的功能。
本申请实施例还提供一种计算机可读存储介质,该存储介质中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或指令集由所述处理器加载并执行以实现如上述各个实施例提供的SMPL参数预测模型的训练方法,或,实现如上述各个实施例提供的三维人体重建方法。
可选地,该计算机可读存储介质可以包括:只读存储器(ROM,Read Only Memory)、随机存取记忆体(RAM,Random Access Memory)、固态硬盘(SSD,Solid State Drives)或光盘等。其中,随机存取记忆体可以包括电阻式随机存取记忆体(ReRAM,Resistance Random Access Memory)和动态随机存取存储器(DRAM,Dynamic Random Access Memory)。上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。

Claims (20)

  1. 一种SMPL参数预测模型的训练方法,由计算机设备执行,其特征在于,所述方法包括:
    获取样本图片,所述样本图片中包含人体图像;
    将所述样本图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
    将所述样本图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;
    根据所述SMPL预测参数,结合所述样本图片的标注信息,计算模型预测损失;及
    根据所述模型预测损失反向训练所述姿态参数预测模型和所述形态参数预测模型。
  2. 根据权利要求1所述的方法,其特征在于,所述模型预测损失包括第一模型预测损失;所述根据所述SMPL预测参数,结合所述样本图片的标注信息,计算模型预测损失,包括:
    根据所述SMPL预测参数,以及所述标注信息中的SMPL标注参数,计算第一模型预测损失,所述SMPL标注参数中包括姿态标注参数和形态标注参数。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述SMPL预测参数,以及所述标注信息中的SMPL标注参数,计算第一模型预测损失,包括:
    计算所述姿态标注参数与所述姿态预测参数之间的第一欧式距离;
    计算所述形态标注参数与所述形态预测参数之间的第二欧式距离;及
    根据所述第一欧式距离和所述第二欧式距离确定所述第一模型预测损失。
  4. 根据权利要求1所述的方法,其特征在于,所述根据所述SMPL预测参数,结合所述样本图片的标注信息,计算模型预测损失包括:
    根据所述姿态预测参数和所述形态预测参数，通过SMPL模型构建人体三维模型；及
    根据所述人体三维模型,结合所述样本图片的标注信息,计算模型预测损失。
  5. 根据权利要求4所述的方法,其特征在于,所述模型预测损失包括第二模型预测损失;所述根据所述人体三维模型,结合所述样本图片的标注信息,计算第二模型预测损失,包括:
    根据所述人体三维模型中关节点的关节点预测坐标,以及所述标注信息中关节点的关节点标注坐标,计算第二模型预测损失。
  6. 根据权利要求5所述的方法,其特征在于,所述关节点标注坐标中包含三维关节点标注坐标和/或二维关节点标注坐标;
    所述根据所述人体三维模型中关节点的关节点预测坐标,以及所述标注信息中关节点的关节点标注坐标,计算第二模型预测损失,包括:
    计算所述人体三维模型中关节点的三维关节点预测坐标和所述三维关节点标注坐标之间的第三欧式距离;及
    根据所述第三欧式距离计算所述第二模型预测损失。
  7. 根据权利要求6所述的方法,其特征在于,所述计算所述人体三维模型中关节点的三维关节点预测坐标和所述三维关节点标注坐标之间的第三欧式距离,包括:
    根据所述人体三维模型中关节点周侧模型顶点的顶点坐标,确定所述人体三维模型中关节点的所述三维关节点预测坐标;及
    计算所述三维关节点预测坐标和所述三维关节点标注坐标之间的所述第三欧式距离。
  8. 根据权利要求6所述的方法,其特征在于,所述根据所述第三欧式距离计算所述第二模型预测损失包括:
    计算所述人体三维模型中关节点的二维关节点预测坐标和所述二维关节点标注坐标之间的第四欧式距离;及
    根据所述第四欧式距离计算所述第二模型预测损失。
  9. 根据权利要求8所述的方法,其特征在于,所述姿态参数预测模型还用于根据输入的所述样本图片输出投影参数,所述投影参数用于将三维空间的点投影到二维空间;
    所述计算所述人体三维模型中关节点的二维关节点预测坐标和所述二维关节点标注坐标之间的第四欧式距离,包括:
    根据所述人体三维模型中关节点周侧模型顶点的顶点坐标,确定所述人体三维模型中关节点的所述三维关节点预测坐标;
    根据所述投影参数,对所述三维关节点预测坐标进行投影处理,得到所述二维关节点预测坐标;及
    计算所述二维关节点预测坐标和所述二维关节点标注坐标之间的所述第四欧式距离。
  10. 根据权利要求4所述的方法,其特征在于,所述模型预测损失包括第三模型预测损失;所述根据所述人体三维模型,结合所述样本图片的标注信息,计算模型预测损失,包括:
    根据所述人体三维模型的预测二维人体轮廓,以及所述标注信息中的标注二维人体轮廓,计算第三模型预测损失。
  11. 根据权利要求10所述的方法,其特征在于,所述姿态参数预测模型还用于根据输入的所述样本图片输出投影参数,所述投影参数用于将三维空间的点投影到二维空间;
    所述根据所述人体三维模型的预测二维人体轮廓,以及所述标注信息中的标注二维人体轮廓,计算第三模型预测损失,包括:
    根据所述投影参数,将所述人体三维模型中的模型顶点投影到二维空间,并生成所述预测二维人体轮廓;
    根据所述预测二维人体轮廓和所述标注二维人体轮廓,计算第一轮廓损失和第二轮廓损失;及
    根据所述第一轮廓损失和所述第二轮廓损失确定所述第三模型预测损失。
  12. 根据权利要求11所述的方法,其特征在于,所述根据所述预测二维人体轮廓和所述标注二维人体轮廓,计算第一轮廓损失和第二轮廓损失,包括:
    计算所述预测二维人体轮廓中轮廓点到所述标注二维人体轮廓的第一最短距离;根据所述预测二维人体轮廓中各个轮廓点对应的所述第一最短距离,计算所述第一轮廓损失;及
    计算所述标注二维人体轮廓中轮廓点到所述预测二维人体轮廓的第二最短距离;根据所述标注二维人体轮廓中各个轮廓点对应的所述第二最短距离,计算所述第二轮廓损失。
  13. 根据权利要求12所述的方法,其特征在于,所述根据所述预测二维人体轮廓中各个轮廓点对应的所述第一最短距离,计算所述第一轮廓损失,包括:
    根据所述人体三维模型中各个模型顶点所属关节点的可见性,确定各个模型顶点对应轮廓点的第一权重;
    根据所述预测二维人体轮廓中各个轮廓点对应的所述第一最短距离以及所述第一权重,计算所述第一轮廓损失;及
    其中,当模型顶点所属关节点可见时,所述模型顶点对应轮廓点的所述第一权重为1,当所述模型顶点所属关节点不可见时,所述模型顶点对应轮廓点的所述第一权重为0。
  14. 根据权利要求12所述的方法,其特征在于,所述根据所述预测二维人体轮廓中各个轮廓点对应的所述第一最短距离,计算所述第一轮廓损失,包括:
    确定所述人体三维模型中各个模型顶点所属关节点的所述关节点预测坐标;
    根据所述关节点预测坐标与所述关节点标注坐标之间的第五欧式距离,确定各个模型顶点对应轮廓点的第二权重,所述第二权重与所述第五欧式距离之间呈负相关关系;及
    根据所述预测二维人体轮廓中各个轮廓点对应的所述第一最短距离以及所述第二权重,计算所述第一轮廓损失。
  15. 根据权利要求1至14任一所述的方法,其特征在于,所述方法还包括:
    对所述形态预测参数进行正则处理,得到第四模型预测损失;
    根据所述第四模型预测损失反向训练所述姿态参数预测模型和所述形态参数预测模型。
  16. 一种三维人体重建方法,其特征在于,所述方法包括:
    获取目标图片,所述目标图片中包含人体图像;
    将所述目标图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
    将所述目标图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;及
    根据所述姿态预测参数和所述形态预测参数,通过SMPL模型构建目标人体三维模型。
  17. 一种SMPL参数预测模型的训练装置,其特征在于,所述装置包括:
    第一获取模块,用于获取样本图片,所述样本图片中包含人体图像;
    第一预测模块,用于将所述样本图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
    第二预测模块,用于将所述样本图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;
    损失计算模块,用于根据所述SMPL预测参数,并结合所述样本图片的标注信息,计算模型预测损失;及
    训练模块,用于根据所述模型预测损失反向训练所述姿态参数预测模型和所述形态参数预测模型。
  18. 一种三维人体重建装置，其特征在于，所述装置包括：
    第二获取模块,用于获取目标图片,所述目标图片中包含人体图像;
    第三预测模块,用于将所述目标图片输入姿态参数预测模型,得到姿态预测参数,所述姿态预测参数是SMPL预测参数中用于指示人体姿态的参数;
    第四预测模块,用于将所述目标图片输入形态参数预测模型,得到形态预测参数,所述形态预测参数是所述SMPL预测参数中用于指示人体形态的参数;及
    第二构建模块,用于根据所述姿态预测参数和所述形态预测参数,通过SMPL模型构建目标人体三维模型。
  19. 一种计算机设备,其特征在于,所述计算机设备包括处理器和存储器,所述存储器中存储有至少一条指令、至少一段程序、代码集或指令集,所述至少一条指令、所述至少一段程序、所述代码集或指令集由所述处理器执行以实现如权利要求1至15任一所述的SMPL参数预测模型的训练方法,或者,实现如权利要求16所述的三维人体重建方法。
  20. 一种计算机可读存储介质，其特征在于，所述存储介质中存储有至少一条指令、至少一段程序、代码集或指令集，所述至少一条指令、所述至少一段程序、所述代码集或指令集由处理器加载并执行以实现如权利要求1至15任一所述的SMPL参数预测模型的训练方法，或者，实现如权利要求16所述的三维人体重建方法。
PCT/CN2020/072023 2019-02-01 2020-01-14 Smpl参数预测模型的训练方法、计算机设备及存储介质 WO2020156148A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20748016.1A EP3920146A4 (en) 2019-02-01 2020-01-14 METHOD OF TRAINING AN SMPL PARAMETER FORECAST MODEL, COMPUTER DEVICE, AND STORAGE MEDIA
US17/231,952 US20210232924A1 (en) 2019-02-01 2021-04-15 Method for training smpl parameter prediction model, computer device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910103414.X 2019-02-01
CN201910103414.XA CN109859296B (zh) 2019-02-01 2019-02-01 Smpl参数预测模型的训练方法、服务器及存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/231,952 Continuation US20210232924A1 (en) 2019-02-01 2021-04-15 Method for training smpl parameter prediction model, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2020156148A1 true WO2020156148A1 (zh) 2020-08-06

Family

ID=66897461

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/072023 WO2020156148A1 (zh) 2019-02-01 2020-01-14 Smpl参数预测模型的训练方法、计算机设备及存储介质

Country Status (4)

Country Link
US (1) US20210232924A1 (zh)
EP (1) EP3920146A4 (zh)
CN (1) CN109859296B (zh)
WO (1) WO2020156148A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487575A (zh) * 2021-07-13 2021-10-08 中国信息通信研究院 用于训练医学影像检测模型的方法及装置、设备、可读存储介质
CN114373033A (zh) * 2022-01-10 2022-04-19 腾讯科技(深圳)有限公司 图像处理方法、装置、设备、存储介质及计算机程序

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109859296B (zh) * 2019-02-01 2022-11-29 腾讯科技(深圳)有限公司 Smpl参数预测模型的训练方法、服务器及存储介质
US11423630B1 (en) 2019-06-27 2022-08-23 Amazon Technologies, Inc. Three-dimensional body composition from two-dimensional images
CN110415336B (zh) * 2019-07-12 2021-12-14 清华大学 高精度人体体态重建方法及系统
CN110428493B (zh) * 2019-07-12 2021-11-02 清华大学 基于网格形变的单图像人体三维重建方法及系统
CN110599540B (zh) * 2019-08-05 2022-06-17 清华大学 多视点相机下的实时三维人体体型与姿态重建方法及装置
US11903730B1 (en) 2019-09-25 2024-02-20 Amazon Technologies, Inc. Body fat measurements from a two-dimensional image
CN110838179B (zh) * 2019-09-27 2024-01-19 深圳市三维人工智能科技有限公司 基于体测数据的人体建模方法、装置及电子设备
CN112419419A (zh) * 2019-11-27 2021-02-26 上海联影智能医疗科技有限公司 用于人体姿势和形状估计的系统和方法
CN110930436B (zh) * 2019-11-27 2023-04-14 深圳市捷顺科技实业股份有限公司 一种目标跟踪方法及设备
CN111105489A (zh) * 2019-12-23 2020-05-05 北京奇艺世纪科技有限公司 数据合成方法和装置、存储介质和电子装置
US11526697B1 (en) * 2020-03-10 2022-12-13 Amazon Technologies, Inc. Three-dimensional pose estimation
CN111047548B (zh) * 2020-03-12 2020-07-03 腾讯科技(深圳)有限公司 姿态变换数据处理方法、装置、计算机设备和存储介质
CN111401234B (zh) * 2020-03-13 2022-06-14 深圳普罗米修斯视觉技术有限公司 三维人物模型构建方法、装置及存储介质
CN113449570A (zh) * 2020-03-27 2021-09-28 虹软科技股份有限公司 图像处理方法和装置
CN111582036B (zh) * 2020-04-09 2023-03-07 天津大学 可穿戴设备下基于形状和姿态的跨视角人物识别方法
CN113689578B (zh) * 2020-05-15 2024-01-02 杭州海康威视数字技术股份有限公司 一种人体数据集生成方法及装置
CN111968217B (zh) * 2020-05-18 2021-08-20 北京邮电大学 基于图片的smpl参数预测以及人体模型生成方法
GB202009515D0 (en) * 2020-06-22 2020-08-05 Ariel Ai Ltd 3D object model reconstruction from 2D images
CN111783609A (zh) * 2020-06-28 2020-10-16 北京百度网讯科技有限公司 行人再识别的方法、装置、设备和计算机可读存储介质
CN112116984B (zh) * 2020-09-16 2023-08-29 无锡职业技术学院 面对肥胖大学生群体的肥胖分析干预方法
CN112307940A (zh) * 2020-10-28 2021-02-02 有半岛(北京)信息科技有限公司 模型训练方法、人体姿态检测方法、装置、设备及介质
CN112270711B (zh) * 2020-11-17 2023-08-04 北京百度网讯科技有限公司 模型训练以及姿态预测方法、装置、设备以及存储介质
CN112714263B (zh) * 2020-12-28 2023-06-20 北京字节跳动网络技术有限公司 视频生成方法、装置、设备及存储介质
CN112652057B (zh) * 2020-12-30 2024-05-07 北京百度网讯科技有限公司 生成人体三维模型的方法、装置、设备以及存储介质
CN112819944B (zh) * 2021-01-21 2022-09-27 魔珐(上海)信息科技有限公司 三维人体模型重建方法、装置、电子设备及存储介质
CN112802161B (zh) * 2021-01-27 2022-11-15 青岛联合创智科技有限公司 一种三维虚拟角色智能蒙皮方法
CN112991515B (zh) * 2021-02-26 2022-08-19 山东英信计算机技术有限公司 一种三维重建方法、装置及相关设备
CN113079136B (zh) * 2021-03-22 2022-11-15 广州虎牙科技有限公司 动作捕捉方法、装置、电子设备和计算机可读存储介质
CN113096249B (zh) * 2021-03-30 2023-02-17 Oppo广东移动通信有限公司 训练顶点重建模型的方法、图像重建方法及电子设备
CN113569627A (zh) * 2021-06-11 2021-10-29 北京旷视科技有限公司 人体姿态预测模型训练方法、人体姿态预测方法及装置
US11854146B1 (en) * 2021-06-25 2023-12-26 Amazon Technologies, Inc. Three-dimensional body composition from two-dimensional images of a portion of a body
CN113610889B (zh) * 2021-06-30 2024-01-16 奥比中光科技集团股份有限公司 一种人体三维模型获取方法、装置、智能终端及存储介质
CN113628322B (zh) * 2021-07-26 2023-12-05 阿里巴巴(中国)有限公司 图像处理、ar显示与直播方法、设备及存储介质
US11887252B1 (en) 2021-08-25 2024-01-30 Amazon Technologies, Inc. Body model composition update from two-dimensional face images
US11861860B2 (en) 2021-09-29 2024-01-02 Amazon Technologies, Inc. Body dimensions from two-dimensional body images
US11941738B2 (en) * 2021-10-28 2024-03-26 Shanghai United Imaging Intelligence Co., Ltd. Systems and methods for personalized patient body modeling
WO2023214093A1 (en) * 2022-05-06 2023-11-09 MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. Accurate 3d body shape regression using metric and/or semantic attributes
CN114998514A (zh) * 2022-05-16 2022-09-02 聚好看科技股份有限公司 一种虚拟角色的生成方法及设备
CN115049764B (zh) * 2022-06-24 2024-01-16 苏州浪潮智能科技有限公司 Smpl参数预测模型的训练方法、装置、设备及介质
CN115496864B (zh) * 2022-11-18 2023-04-07 苏州浪潮智能科技有限公司 模型构建方法、重建方法、装置、电子设备及存储介质
CN117115363B (zh) * 2023-10-24 2024-03-26 清华大学 人体胸部平面估计方法和装置

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108594997A (zh) * 2018-04-16 2018-09-28 腾讯科技(深圳)有限公司 手势骨架构建方法、装置、设备及存储介质
CN109285215A (zh) * 2018-08-28 2019-01-29 腾讯科技(深圳)有限公司 一种人体三维模型重建方法、装置和存储介质
CN109859296A (zh) * 2019-02-01 2019-06-07 腾讯科技(深圳)有限公司 Smpl参数预测模型的训练方法、服务器及存储介质

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10395411B2 (en) * 2015-06-24 2019-08-27 MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. Skinned multi-person linear model
CN108053437B (zh) * 2017-11-29 2021-08-03 Orbbec Technology Group Co., Ltd. Posture-based three-dimensional model acquisition method and apparatus
CN108960036B (zh) * 2018-04-27 2021-11-09 Beijing SenseTime Technology Development Co., Ltd. Three-dimensional human body pose prediction method, apparatus, medium and device
CN108629801B (zh) * 2018-05-14 2020-11-24 South China University of Technology Method for reconstructing the pose and shape of a three-dimensional human body model from a video sequence
CN108898087B (zh) * 2018-06-22 2020-10-16 Tencent Technology (Shenzhen) Co., Ltd. Training method, apparatus, device and storage medium for facial keypoint localization model
CN109191554B (zh) * 2018-09-04 2021-01-01 Tsinghua-Berkeley Shenzhen Institute Preparation Office Super-resolution image reconstruction method, apparatus, terminal and storage medium

Non-Patent Citations (2)

Title
PENGFEI DOU, SHISHIR K. SHAH, AND IOANNIS A. KAKADIARIS: "End-to-end 3D Face Reconstruction with Deep Neural Networks", 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, 9 November 2017 (2017-11-09), pages 1503 - 1512, XP055724139, DOI: 10.1109/CVPR.2017.164 *
See also references of EP3920146A4

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN113487575A (zh) * 2021-07-13 2021-10-08 China Academy of Information and Communications Technology Method, apparatus, device and readable storage medium for training medical image detection model
CN113487575B (zh) * 2021-07-13 2024-01-16 China Academy of Information and Communications Technology Method, apparatus, device and readable storage medium for training medical image detection model
CN114373033A (zh) * 2022-01-10 2022-04-19 Tencent Technology (Shenzhen) Co., Ltd. Image processing method, apparatus, device, storage medium and computer program

Also Published As

Publication number Publication date
CN109859296B (zh) 2022-11-29
CN109859296A (zh) 2019-06-07
US20210232924A1 (en) 2021-07-29
EP3920146A4 (en) 2022-10-19
EP3920146A1 (en) 2021-12-08

Similar Documents

Publication Publication Date Title
WO2020156148A1 (zh) Training method for SMPL parameter prediction model, computer device and storage medium
CN110599528B (zh) Neural network-based unsupervised three-dimensional medical image registration method and system
US10679046B1 (en) Machine learning systems and methods of estimating body shape from images
CN108509848B (zh) Real-time detection method and system for three-dimensional objects
CN111598998B (zh) Three-dimensional virtual model reconstruction method, apparatus, computer device and storage medium
WO2020042720A1 (zh) Human body three-dimensional model reconstruction method, apparatus and storage medium
WO2021093453A1 (zh) Three-dimensional expression base generation method, voice interaction method, apparatus and medium
Feixas et al. A unified information-theoretic framework for viewpoint selection and mesh saliency
CN109377520A (zh) Cardiac image registration system and method based on semi-supervised cycle GAN
Wu et al. Handmap: Robust hand pose estimation via intermediate dense guidance map supervision
WO2021253788A1 (zh) Human body three-dimensional model construction method and apparatus
JP2023545200A (ja) Parameter estimation model training method, parameter estimation model training apparatus, device and storage medium
WO2021063271A1 (zh) Human body model reconstruction method, reconstruction system and storage medium
CN111768375A (zh) CWAM-based asymmetric GM multimodal fusion saliency detection method and system
CN116097316A (zh) Object recognition neural network for amodal center prediction
CN116188695A (zh) Construction method of three-dimensional hand pose model and three-dimensional hand pose estimation method
CN114972634A (zh) Multi-view three-dimensional deformable face reconstruction method based on feature voxel fusion
CN111709269B (zh) Human hand segmentation method and apparatus based on two-dimensional joint information in depth images
Kazmi et al. Efficient sketch‐based creation of detailed character models through data‐driven mesh deformations
CN113439909A (zh) Three-dimensional size measurement method for an object and mobile terminal
CN111275610A (zh) Face aging image processing method and system
CN112381825B (zh) Method for extracting geometric features of lesion region images and related products
Alsmirat et al. Building an image set for modeling image re-targeting using deep learning
Bouafif et al. Monocular 3D head reconstruction via prediction and integration of normal vector field
CN117726822B (zh) Three-dimensional medical image classification and segmentation system and method based on dual-branch feature fusion

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20748016

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020748016

Country of ref document: EP

Effective date: 20210901