WO2022156533A1 - Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium - Google Patents

Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium

Info

Publication number
WO2022156533A1
Authority
WO
WIPO (PCT)
Prior art keywords
human body
parameters
camera
model
dimensional
Prior art date
Application number
PCT/CN2022/070258
Other languages
English (en)
French (fr)
Inventor
张建杰
柴金祥
李妙鹏
熊兴堂
蒋利国
王从艺
Original Assignee
魔珐(上海)信息科技有限公司
上海墨舞科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 魔珐(上海)信息科技有限公司 and 上海墨舞科技有限公司
Publication of WO2022156533A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/40 Analysis of texture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30204 Marker

Definitions

  • The present application relates to the technical field of computer vision, and in particular to a method, apparatus, electronic device, and storage medium for reconstructing a three-dimensional human body model.
  • Human pose capture plays an important role in many fields, such as movies, games, criminal investigation, and video surveillance.
  • Relatively accurate human body shape and posture can be captured by some methods, but these are costly and complicated.
  • For example, capturing reflective marker points attached to the human body can yield accurate human body shape and posture.
  • However, this method requires a number of expensive cameras and a dedicated venue, which is costly and causes serious discomfort to users, so its development and adoption are hindered.
  • Single-camera human body image acquisition has the characteristics of low cost, easy installation, and user-friendliness, but single-camera human body images are generally two-dimensional information with only one viewing angle, and it is difficult to provide three-dimensional information. Therefore, in order to obtain a vivid and realistic human pose, it is necessary to reconstruct a three-dimensional human body model from a single-camera human image.
  • However, current 3D human body models reconstructed from single-camera human images often have low accuracy and struggle to capture realistic human poses.
  • the purpose of the embodiments of the present application is to provide a three-dimensional human body model reconstruction method, device, electronic device and storage medium, which can quickly and accurately reconstruct a three-dimensional human body model.
  • the three-dimensional human body model reconstruction method, device, electronic device, and storage medium provided by the embodiments of the present application are implemented as follows:
  • a three-dimensional human body model reconstruction method includes:
  • the three-dimensional human body model is determined by using the body parameters and the action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information.
  • the target human body information includes at least one of the following information: two-dimensional human body joint points, bone orientation, foreground and background segmentation results, and texture mapping information.
  • the acquiring a single-camera human body image includes:
  • Human body detection is performed on the single-camera image, and a single-camera human body image is intercepted from the single-camera image.
  • the human body information prediction model is set to be obtained by training in the following manner:
  • the model parameters are iteratively adjusted until the difference meets a preset requirement.
  • the single-camera human body sample image is set to be acquired in the following manner:
  • According to the human body information, separate human body images from the multiple single-camera images respectively, and use the multiple human body images as single-camera human body sample images for training the human body information prediction model.
  • determining a three-dimensional human body model using body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information including:
  • The alternately fixing body parameters or action posture parameters, adjusting action posture parameters or body parameters, and generating a three-dimensional human body model includes:
  • The action posture parameters and/or the body parameters are iteratively adjusted until at least one of the difference or the number of iterations satisfies a preset requirement.
  • The determining a three-dimensional human body model by using body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information, includes:
  • Based on N single-camera human body images, alternately fix the body parameters or action posture parameters, adjust the action posture parameters or body parameters, and jointly optimize to generate N three-dimensional human body models with the same body parameters, so that the human body information determined by each of the N three-dimensional human body models matches the corresponding target human body information.
  • The alternately fixing body parameters or action posture parameters, adjusting action posture parameters or body parameters, and generating N three-dimensional human body models with the same body parameters by joint optimization includes:
  • The three-dimensional human body model includes a three-dimensional model composed of a preset number of interconnected polygonal meshes, and the positions of the mesh vertices of the polygonal meshes are determined by the body parameters and the action posture parameters.
  • the body parameter includes at least one of a height parameter, a bone length parameter, and a fat or thin parameter.
  • a three-dimensional human body model reconstruction device comprising:
  • an information prediction module configured to input the single-camera human body image into a human body information prediction model, and output the target human body information in the single-camera human body image through the human body information prediction model;
  • the model determination module is used for determining a three-dimensional human body model by using the body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information.
  • An electronic device includes a processor and a memory for storing instructions executable by the processor, and the processor implements the three-dimensional human body model reconstruction method when the processor executes the instructions.
  • a non-transitory computer-readable storage medium when the instructions in the storage medium are executed by a processor, enable the processor to execute the three-dimensional human body model reconstruction method.
  • The three-dimensional human body model reconstruction method can reconstruct a three-dimensional human body model from a single-camera human body image, taking advantage of the low cost, easy installation, and user-friendliness of single-camera image acquisition. Because single-camera human body images are easy to acquire, this not only reduces the construction cost of the human body information prediction model but also makes reconstruction of the three-dimensional human body model faster and simpler. In the reconstruction process, using at least one kind of human body information output by the human body information prediction model can effectively improve the accuracy and robustness of model reconstruction.
  • The three-dimensional human body model is reconstructed from body parameters and action posture parameters, which provides an accurate and reliable technical solution for fields such as single-camera virtual live broadcast, single-camera intelligent interaction, human body recognition, criminal investigation monitoring, and movies and games.
  • FIG. 1 is a schematic flowchart of a method for reconstructing a three-dimensional human body model according to an exemplary embodiment.
  • FIG. 2 is a schematic flowchart of a method for reconstructing a three-dimensional human body model according to an exemplary embodiment.
  • FIG. 3 is a schematic flowchart of a method for reconstructing a three-dimensional human body model according to an exemplary embodiment.
  • FIG. 4 is a schematic flowchart of a method for reconstructing a three-dimensional human body model according to an exemplary embodiment.
  • FIG. 5 is a schematic flowchart of a method for reconstructing a three-dimensional human body model according to an exemplary embodiment.
  • FIG. 6 is a schematic flowchart of a method for reconstructing a three-dimensional human body model according to an exemplary embodiment.
  • FIG. 7 is a block diagram of an apparatus for reconstructing a three-dimensional human body model according to an exemplary embodiment.
  • FIG. 8 is a block diagram of an apparatus for reconstructing a three-dimensional human body model according to an exemplary embodiment.
  • Existing RGB single-camera capture technology still suffers from problems such as poor accuracy, the need for human intervention (for example, a full-body scan of the performer or a known initial state of the performer), inability to run in real time, and lag.
  • The present invention addresses these pain points and proposes a single-camera human motion capture scheme that is highly accurate, fully automatic, robust, and able to run in real time.
  • This solution places no requirements on the initial state and can run in real time, for example at 30 frames per second, 20 frames per second, or about 25 frames per second.
  • FIG. 1 is a schematic flowchart of an embodiment of a three-dimensional human body model reconstruction method provided by the present application.
  • Although the present application provides method operation steps as shown in the following embodiments or drawings, the method may include more or fewer operation steps based on routine or non-creative effort. For steps that have no logically necessary causal relationship, the execution order is not limited to the order provided by the embodiments of the present application.
  • The method may be executed sequentially or in parallel (e.g., in a parallel-processor or multi-threaded processing environment) according to the methods shown in the embodiments or the accompanying drawings.
  • Specifically, an embodiment of the three-dimensional human body model reconstruction method provided by the present application is shown in FIG. 1, and the method may include:
  • S101 Acquire a single-camera human body image.
  • S103 Input the single-camera human body image into a human body information prediction model, and output the target human body information in the single-camera human body image through the human body information prediction model.
  • S105 Determine a three-dimensional human body model by using body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information.
  • The single-camera human body image may include a human body image captured by a single camera. The single camera may include a single camera device, such as a single-lens reflex camera or a smart device with a camera function (such as a smartphone, tablet computer, or smart wearable device), and the camera may be an RGB camera, an RGBD camera, or the like.
  • The single-camera human body image may be an image in any format, such as an RGB image or a grayscale image.
  • The images captured by the camera include not only the human body but also background content other than the human body.
  • A single-camera human body image that contains, as far as possible, only the human body can be cropped from the image captured by the single camera.
  • A single-camera image including a human body image may be acquired.
  • Human body detection can be performed on the single-camera image, and a single-camera human body image can be cropped from the single-camera image.
  • A human body in the single-camera image can be detected by using a machine learning-based human body detection algorithm, and the single-camera human body image can be cropped from the single-camera image.
  • the human body detection algorithm may include algorithms such as R-CNN, Fast R-CNN, Faster R-CNN, TCDCN, MTCNN, YOLOV3, SSD, etc., which are not limited here.
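  • As a minimal sketch of the cropping step, assume a detector has already returned a bounding box for the person. The function name `crop_person`, the `(x0, y0, x1, y1)` box format, and the `margin` parameter are illustrative assumptions and are not taken from the application.

```python
def crop_person(image, box, margin=0.1):
    """Crop a detected person from an image given as a list of pixel rows.

    box is (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    margin enlarges the box by a fraction of its size on each side.
    """
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = box
    pad_x = int((x1 - x0) * margin)
    pad_y = int((y1 - y0) * margin)
    # Clamp the padded box to the image bounds.
    x0, y0 = max(0, x0 - pad_x), max(0, y0 - pad_y)
    x1, y1 = min(width, x1 + pad_x), min(height, y1 + pad_y)
    return [row[x0:x1] for row in image[y0:y1]]


# Hypothetical 6x8 image whose pixel value encodes its (row, col) position.
image = [[(r, c) for c in range(8)] for r in range(6)]
crop = crop_person(image, box=(2, 1, 6, 5), margin=0.25)
```

  • A small margin around the detected box is one way to avoid cutting off limbs when the detector's box is tight; whether the application pads the box is not stated.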
  • After the single-camera human body image is acquired, it can be input into a human body information prediction model, and the target human body information in the single-camera human body image is output through the human body information prediction model.
  • the target human body information may include at least one of the following information: two-dimensional joint points of the human body, bone orientation, foreground and background segmentation results, and texture mapping information.
  • the two-dimensional joint points of the human body include key points used to represent movable joints of the human body.
  • the two-dimensional joint points of the human body may include joints in parts such as head, shoulder, neck, and limbs.
  • the Kinect algorithm can extract 25 two-dimensional joint points of the human body.
  • the bone orientations may include two-dimensional bone orientations and three-dimensional bone orientations.
  • The two-dimensional bone orientation may include the two-dimensional connection relationship and the two-dimensional connection direction of the human body joint points on the single-camera image, for example, the connection relationship and connection direction between the top-of-head joint point and the neck joint point.
  • the three-dimensional bone orientation may include the bone connection relationship between the three-dimensional joint points of the human body and the orientation of the bone in three-dimensional space, such as the bone connection from the top of the head to the neck and the three-dimensional orientation of the bone from the top of the head to the neck, and the neck to the neck.
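  • One simple way to represent a bone orientation is the unit vector from a parent joint to a child joint. The sketch below assumes that representation; the helper name `bone_direction` and the joint coordinates are hypothetical, and the application does not state how orientations are encoded.

```python
import math


def bone_direction(parent, child):
    """Return the unit vector pointing from the parent joint to the child joint."""
    delta = [c - p for p, c in zip(parent, child)]
    length = math.sqrt(sum(d * d for d in delta))
    if length == 0.0:
        raise ValueError("joints coincide; bone direction is undefined")
    return [d / length for d in delta]


# Hypothetical 2D pixel coordinates of two joints.
head_top = (100.0, 40.0)
neck = (103.0, 44.0)
direction_2d = bone_direction(head_top, neck)

# The same helper works for hypothetical 3D joint positions.
direction_3d = bone_direction((0.0, 1.7, 0.0), (0.0, 1.5, 0.0))
```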
  • the foreground and background segmentation results may include segmentation results between the human body and its background, and the segmentation results may include expressions such as segmentation curves, segmentation rectangles, segmentation masks, and the like.
  • the texture mapping information refers to the mapping relationship between the human body image pixels and the three-dimensional human body model established by the texture coordinates.
  • The three-dimensional human body model may include a three-dimensional model composed of a preset number of interconnected polygonal meshes, and the positions of the mesh vertices of the polygonal meshes are determined by the body parameters and the action posture parameters.
  • the polygonal meshes may include triangular meshes, pentagonal meshes, hexagonal meshes, and the like. It should be noted that the edges of the polygon meshes are shared with the adjacent polygon meshes.
  • the process of reconstructing the three-dimensional human body model is the process of adjusting the positions of the vertices of the mesh.
  • The mesh vertices may have unique identifiers, for example the (u, v) texture coordinates of texture mapping, so that the texture mapping information may include the mapping relationship from human body image pixels to the unique identifiers of the mesh vertices.
  • For example, the mapping relationship may include that the pixel at coordinate position (29, 76) in the single-camera image corresponds to the mesh vertex with texture coordinate (0.6, 0.3) in the three-dimensional human body model.
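  • The pixel-to-vertex mapping can be pictured as two lookups: pixel to texture coordinate, then texture coordinate to mesh vertex. The sketch below uses plain dictionaries; the vertex indices and the second pixel entry are made-up values for illustration only.

```python
# Hypothetical texture mapping information: image pixel -> texture (u, v) coordinate.
pixel_to_uv = {
    (29, 76): (0.6, 0.3),
    (30, 76): (0.61, 0.3),
}

# Hypothetical mesh lookup: texture (u, v) coordinate -> mesh vertex index.
uv_to_vertex = {
    (0.6, 0.3): 1024,
    (0.61, 0.3): 1025,
}


def vertex_for_pixel(pixel):
    """Return the mesh vertex index a pixel maps to, or None for background pixels."""
    uv = pixel_to_uv.get(pixel)
    return None if uv is None else uv_to_vertex.get(uv)
```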
  • The above kinds of human body information in the single-camera human body image can all be determined by the single human body information prediction model.
  • the human body information prediction model may include a multi-task machine learning model, and the multi-task machine learning model may implement various tasks, such as including a multi-task deep learning network.
  • a variety of human body information is fused into the same model for learning. The correlation between various kinds of information improves the accuracy of the human body information prediction model.
  • S201 Acquire a plurality of single-camera human body sample images, where human body information is marked in the single-camera human body sample images.
  • S203 Build a human body information prediction model, where model parameters are set in the human body information prediction model.
  • S205 Input the single-camera human body sample image into the human body information prediction model to generate a prediction result.
  • the preset requirement may include, for example, that the value of the difference is smaller than a preset threshold.
  • The prediction results may include several kinds of information among two-dimensional human body joint points, three-dimensional bone orientation, foreground and background segmentation results, and texture mapping information, and the difference may include the sum of the differences between each of the multiple prediction results and the corresponding marked information.
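  • A summed multi-task difference could look like the sketch below. The use of mean squared error per task, the optional per-task weights, and the task names are assumptions made for illustration; the application only says that the differences are summed.

```python
def total_difference(predicted, labeled, weights=None):
    """Sum the per-task mean squared differences between prediction and label.

    predicted and labeled map a task name to a flat list of numbers.
    weights optionally maps a task name to its weight (default 1.0).
    """
    weights = weights or {}
    total = 0.0
    for task, target in labeled.items():
        output = predicted[task]
        mse = sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)
        total += weights.get(task, 1.0) * mse
    return total


predicted = {"joints_2d": [10.0, 20.0], "segmentation": [0.0, 1.0, 1.0, 0.0]}
labeled = {"joints_2d": [12.0, 20.0], "segmentation": [0.0, 1.0, 0.0, 0.0]}
loss = total_difference(predicted, labeled)
```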
  • the human body information that can be output by the human body information prediction model is not limited to the above examples, and can also include any other human body information, which is not limited in this application.
  • the machine learning algorithm for training the human body information prediction model may include a Resnet backbone network, a MobileNet backbone network, a VGG backbone network, etc., which are not limited herein.
  • the single-camera human body sample images are marked with human body information such as two-dimensional joint points, bone orientation, foreground and background segmentation results, and texture mapping information. Based on this, in an embodiment of the present application, as shown in FIG. 3 , the single-camera human body sample image can be acquired in the following manner:
  • S301 Utilize multiple cameras to simultaneously acquire multiple single-camera images of the same human body from different angles.
  • S305 Project the three-dimensional human body model into the multiple single-camera images, and acquire human body information in the multiple single-camera images respectively.
  • S307 According to the human body information, separate human body images from the multiple single-camera images respectively, and use the multiple human body images as single-camera human body sample images used for training the human body information prediction model.
  • multiple cameras can be used to photograph the same human body from multiple angles at the same time, so that multiple single-camera images of the human body can be acquired.
  • 5 images are obtained by shooting with 5 cameras, so that 5 single-camera images can be obtained at one time.
  • a three-dimensional human body model of the human body can be reconstructed by using the multiple single-camera images, and body parameters and action posture parameters can be determined through the three-dimensional human body model.
  • The three-dimensional human body model may be projected back into the multiple single-camera images, so as to obtain, in each single-camera image, at least one kind of human body information among the two-dimensional joint points of the human body, bone orientation, foreground and background segmentation results, and texture mapping information.
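  • Projecting model points into an image can be sketched with a pinhole camera model. This is an assumption: the application does not name a camera model, and the focal length, principal point, and joint positions below are hypothetical. The points are assumed to be already expressed in the coordinate frame of the camera being projected into.

```python
def project_points(points_3d, focal, center):
    """Project 3D points in camera coordinates to 2D pixels with a pinhole model."""
    fx, fy = focal
    cx, cy = center
    pixels = []
    for x, y, z in points_3d:
        if z <= 0:
            raise ValueError("point is behind the camera")
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


# Hypothetical 3D joint positions (metres) and camera intrinsics (pixels).
joints_3d = [(0.0, 0.0, 2.0), (0.5, -0.25, 2.0)]
joints_2d = project_points(joints_3d, focal=(1000.0, 1000.0), center=(320.0, 240.0))
```

  • With several cameras, each camera would apply its own rigid transform before this step so that the same model yields consistent labels in every view.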
  • the 3D human body model used in multi-camera reconstruction and the 3D human body model used in subsequent single-camera reconstruction need to have the same topology, that is, the same vertex connection relationship.
  • Since the human body information of the single-camera image is available, the single-camera image can be segmented according to the human body information, and the segmented human body image is used as a single-camera human body sample image for training the human body information prediction model.
  • Specifically, by projecting the three-dimensional human body model into the single-camera image, the human body information of the single-camera image can be obtained, and this human body information can include the foreground and background segmentation result. The human body image can therefore be segmented from the single-camera image according to the foreground and background segmentation result to obtain a single-camera human body sample image.
  • For example, a bounding-box (BoundingBox) algorithm can be used for image separation.
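  • A bounding box can be derived directly from a foreground mask, as in the sketch below. The helper name `mask_bounding_box` and the toy mask are hypothetical; the application does not describe how its bounding box is computed.

```python
def mask_bounding_box(mask):
    """Return (x0, y0, x1, y1) of the foreground pixels in a binary mask, or None.

    mask is a list of rows; nonzero entries are foreground. x1 and y1 are exclusive.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


# Hypothetical foreground mask obtained by projecting the 3D model into the image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
box = mask_bounding_box(mask)
x0, y0, x1, y1 = box
person = [row[x0:x1] for row in mask[y0:y1]]
```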
  • the method of generating the single-camera human body sample image in this embodiment can greatly save the cost of manual labeling, and can acquire a large amount of sample data in less time, thereby reducing the cost of acquiring training samples.
  • A three-dimensional human body model determined by body parameters and action posture parameters can be reconstructed, so that the human body information determined from the three-dimensional human body model matches the target human body information.
  • the body parameters may be used to represent the body characteristics of the human body, which may include information such as height, bone length, fat and thinness, and the like.
  • the three-dimensional human body model can be uniquely determined through the body parameters and the action posture parameters. Based on this, in the reconstruction of the three-dimensional human body model in the present application, the body information determined by the generated three-dimensional human body model can be matched with the target human body information by continuously adjusting the body parameter variables and the action posture parameter variables.
  • an analysis-by-synthesis algorithm can be used to adjust parameters to determine the three-dimensional human body model.
  • body parameters or action posture parameters can be fixed alternately, and action posture parameters or body parameters can be adjusted to generate a three-dimensional human body model, so that the human body information determined by the three-dimensional human body model matches the target human body information.
  • An alternate optimization strategy of "fixing body parameters to optimize action posture parameters" and "fixing action posture parameters to optimize body parameters" can be used. Compared with optimizing body parameters and action posture parameters at the same time, this alternate optimization strategy can make the three-dimensional human body model converge quickly and improve optimization efficiency.
  • the alternately fixing body parameters or action posture parameters, adjusting the action posture parameters or body parameters, and generating a three-dimensional human body model may include:
  • S401 Alternately fix body parameters or action posture parameters, adjust action posture parameters or body parameters, and generate a predicted three-dimensional human body model;
  • S403 Project the predicted three-dimensional human body model into the single-camera human body image to obtain predicted human body information;
  • S405 Based on the difference between the predicted human body information and the target human body information, iteratively adjust the action posture parameter or the body parameter until at least one of the difference or the number of iterations meets a preset requirement.
  • an initial three-dimensional human body model can be provided, and the initial three-dimensional human body model is the three-dimensional human body model when parameters have not yet been optimized.
  • The initial three-dimensional human body model can be generated based on default body parameters and default action posture parameters.
  • The default parameters may be determined from the average values of the body parameters and action posture parameters stored in a preset database, or from the body parameters and action posture parameters reconstructed from the previous frame of the single-camera human body image, which is not limited in this application.
  • the present application does not limit whether to "fix body parameters to optimize action posture parameters" or to "fix action posture parameters to optimize body parameters" first.
  • the body parameters can be fixed to optimize the action pose parameters.
  • the target human body information of the single-camera human body image 1 can be determined, such as 18 two-dimensional joint points, 17 bone directions, foreground and background segmentation results, and texture coordinates.
  • the initial three-dimensional human body model can be projected into the single-camera human body image 1 to obtain predicted human body information, such as 18 predicted two-dimensional joint points, 17 predicted bone orientations, predicted foreground and background segmentation results, and predicted texture coordinates, And determine the difference between the predicted human body information and the target human body information.
  • The action posture parameters may be adjusted based on the difference.
  • The action posture parameters are then fixed to optimize the body parameters; the adjustment method is the same as when the body parameters are fixed to optimize the action posture parameters and is not repeated here.
  • The action posture parameters and the body parameters are adjusted alternately and iteratively until at least one of the difference between the predicted human body information and the target human body information or the number of iterations satisfies a preset requirement.
  • The iterative adjustment method may include a gradient-based optimization algorithm, a particle swarm optimization algorithm, and the like, which is not limited in this application.
  • The preset requirement corresponding to the difference may include that the value of the difference is less than or equal to a preset threshold, and the preset threshold may be set to a value such as 0 or 0.01.
  • The preset requirement corresponding to the number of iterations may include that the number of iterations is less than a preset number of times, which may be set to, for example, 5 or 7. If the set of parameters determined when at least one of the difference or the number of iterations meets the preset requirement is (body parameter 1, action posture parameter 1), the first predicted three-dimensional human body model can be determined by (body parameter 1, action posture parameter 1).
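  • The alternating strategy can be illustrated on a toy problem. In the sketch below, the "body parameter" is a single bone length and the "action posture parameter" is a single joint angle, and the "projection" is the 2D end point of that bone. This toy model, the gradient-descent update, the step size, and the stopping rule (difference at or below a threshold, or an iteration budget reached) are all assumptions for illustration, not the application's actual model or optimizer.

```python
import math


def predict_joint(length, angle):
    """Toy 'model + projection': end point of one bone rooted at the origin."""
    return (length * math.cos(angle), length * math.sin(angle))


def difference(length, angle, target):
    """Squared distance between the predicted and the target 2D joint."""
    x, y = predict_joint(length, angle)
    return (x - target[0]) ** 2 + (y - target[1]) ** 2


def fit_alternating(target, length=1.0, angle=0.0, step=0.1,
                    threshold=1e-10, max_iterations=500):
    """Alternately fix one parameter and take a gradient step on the other."""
    iterations = 0
    while iterations < max_iterations:
        # Fix the body parameter (length), adjust the pose parameter (angle).
        x, y = predict_joint(length, angle)
        grad_angle = 2 * (x - target[0]) * (-y) + 2 * (y - target[1]) * x
        angle -= step * grad_angle
        # Fix the pose parameter (angle), adjust the body parameter (length).
        x, y = predict_joint(length, angle)
        grad_length = (2 * (x - target[0]) * math.cos(angle)
                       + 2 * (y - target[1]) * math.sin(angle))
        length -= step * grad_length
        iterations += 1
        if difference(length, angle, target) <= threshold:
            break
    return length, angle, iterations


# Hypothetical target joint generated from known parameters.
target = predict_joint(1.5, 0.7)
length, angle, iterations = fit_alternating(target)
```

  • Each half-step only touches one group of parameters, which keeps every update a small, well-conditioned problem; that is the intuition behind the fast convergence the text describes.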
  • The reconstructed 3D human body model has many possible solutions, so there may be a certain degree of ambiguity.
  • As a result, the reconstructed 3D human body model may not represent a natural and realistic human state.
  • The prior probability distribution result and the prior probability target value of at least one of the body parameters and the action posture parameters can also be obtained, and the prior probability distribution result is compared with the prior probability target value to keep the prior probability distribution result from exceeding a reasonable range.
  • The prior probability target value can be determined from a large amount of real captured human body data, so the ambiguity of the reconstructed three-dimensional human body model can be effectively reduced.
  • The difference between the prior probability distribution result of the action posture parameters and/or the body parameters and the prior probability target value can also be used as a condition for the convergence of the three-dimensional human body model, which can effectively reduce the ambiguity of the reconstructed 3D human body model.
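  • One common way to use a prior as a convergence condition is to add a penalty for implausible parameter values to the data difference. The sketch below assumes a quadratic (Gaussian-style) penalty around a mean estimated from real data; the form of the prior, the parameter name `knee_flexion`, and the numbers are hypothetical, since the application does not specify them.

```python
def prior_penalty(value, mean, std):
    """Squared, standardized deviation of a parameter from its prior mean."""
    return ((value - mean) / std) ** 2


def regularized_difference(data_difference, params, priors, weight=1.0):
    """Add a prior penalty for each parameter to the data difference.

    params maps a parameter name to its value; priors maps the same name to
    a (mean, std) pair estimated from real captured data.
    """
    penalty = sum(prior_penalty(params[name], mean, std)
                  for name, (mean, std) in priors.items())
    return data_difference + weight * penalty


# Hypothetical prior: knee flexion angle in radians.
priors = {"knee_flexion": (0.5, 0.5)}
plausible = regularized_difference(0.25, {"knee_flexion": 0.5}, priors)
implausible = regularized_difference(0.25, {"knee_flexion": -1.5}, priors)
```

  • Two poses that fit the image equally well then receive different scores, and the optimizer is steered toward the one that resembles real captured motion.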
  • multiple single-camera human images of the user can be jointly optimized at the same time to improve the reconstruction efficiency.
  • multiple single-camera human body images of the same user may be acquired, such as 20 frames of human body images with different poses captured in real time.
  • the alternate optimization strategy of "fixed body parameters to optimize action posture parameters" and "fixed action posture parameters to optimize body parameters” can also be used.
  • Assume that the number of single-camera human body images participating in the joint optimization is N, and that the N single-camera human body images belong to the same human body.
  • Based on the N single-camera human body images, body parameters or action posture parameters are alternately fixed, action posture parameters or body parameters are adjusted, and N three-dimensional human body models with the same body parameters are generated by joint optimization, so that the human body information determined by each of the N three-dimensional human body models matches the corresponding target human body information.
  • As shown in FIG. 6, a specific embodiment includes:
  • S601 Alternately fix the body parameters or the action posture parameters, adjust the action posture parameters or the body parameters, and generate N predicted three-dimensional human body models, wherein, when the action posture parameters are fixed and the body parameters are adjusted, the body parameters of the N predicted three-dimensional human body models are jointly optimized;
  • S603 Project the N predicted three-dimensional human body models into the corresponding single-camera human body images respectively, to obtain predicted human body information;
  • S605 Based on the difference between the predicted human body information and the target human body information, iteratively adjust the action posture parameter or the body parameter until at least one of the difference or the number of iterations meets a preset requirement.
  • the body parameters of the same human body are the same, the body parameters of the N predicted three-dimensional body models can be jointly optimized when the body parameters are adjusted by the fixed action posture parameters.
  • the technical solution of the above embodiment not only has the advantages of fast convergence speed and high reconstruction efficiency of alternate optimization, but also uses the same characteristics of the same human body identity parameters to jointly optimize multiple single-camera human body images, so that through one optimization, reconstruction can be performed. Obtain multiple 3D human models, greatly improving reconstruction efficiency.
  • the jointly optimized N predicted human body models are used and fixed.
  • the body parameters of the three-dimensional human model are adjusted, and the action posture parameters are adjusted until at least one of the differences or the number of iterations meets the preset requirements.
  • the body parameters can be fixed to optimize the action and posture parameters.
  • the target human body information of the N single-camera human body images is determined respectively by using the implementations of S101 and S103.
  • N initial three-dimensional human body models may be acquired, and the manner of acquiring the initial three-dimensional human body models may refer to the foregoing embodiment, which is not limited herein.
  • N pieces of first predicted human body information can be obtained, and the N differences between the N pieces of first predicted human body information and the corresponding target human body information can be determined respectively.
  • N first predicted three-dimensional human body models can be determined.
  • the action posture parameters can be fixed, and the body parameters can be jointly optimized.
  • the N first predicted three-dimensional human body models may be projected into the corresponding single-camera human body images to obtain N pieces of second predicted human body information, and the N differences between the N pieces of second predicted human body information and the corresponding target human body information (indexed 1, 2, ..., N) are determined respectively.
  • the N parameter sets share the same body parameter: (body parameter X, action posture parameter 1), (body parameter X, action posture parameter 2), ..., (body parameter X, action posture parameter N).
  • the action posture parameters and the body parameters are adjusted alternately and iteratively, until at least one of the difference between the predicted human body information and the target human body information or the number of iterations satisfies a preset requirement.
  • the body parameter X obtained in the above joint optimization can be reused when optimizing single-camera images obtained subsequently, so that only the action posture parameters need to be optimized, which simplifies the optimization process and improves the reconstruction efficiency of the human body model.
  • the iterative adjustment method may include a gradient descent optimization algorithm (Gradient-based Optimization), a particle swarm optimization algorithm (Particle Swarm Optimization), etc., which is not limited in this application.
  • the preset requirement corresponding to the difference may include that the value of the difference is less than or equal to a preset threshold, and the preset threshold may be set to a value such as 0 or 0.01.
  • the preset requirement corresponding to the number of iterations may include that the number of iterations is less than the preset number of times, and the preset number of times may be set to, for example, 5 times, 7 times, and the like.
  • the prior probability may also be used to constrain the body parameter and/or the action posture parameter in a scenario where N single-camera human body images are jointly optimized, so that the reconstructed three-dimensional human body model is more realistic.
  • a vivid and realistic human image is obtained according to the three-dimensional human body model.
  • a three-dimensional human body model of a background actor can be rendered into an animated character to generate a vivid scene of live broadcast of the animated character.
  • the three-dimensional human body model of the player can be rendered into the game character to generate a vivid game scene.
  • the method can also be applied to animation production and film production, which is not limited in this application.
  • the three-dimensional human body model reconstruction method provided in this application can be used in offline mode and real-time mode.
  • the offline mode reconstructs the three-dimensional human body model from an offline video. It does not need to output the three-dimensional human body model immediately and can be used in post-production of animation, film, and television.
  • the real-time mode can run in areas that require real-time interaction with users, such as interactive games and live broadcasts. With GPU acceleration, it can run in real time (that is, the three-dimensional human body model is output immediately after a picture is obtained, with a delay that is hard for the user to perceive).
  • the three-dimensional human model reconstruction method can have offline mode and real-time mode, so that it can be more widely used.
  • the three-dimensional human body model reconstruction method can reconstruct a three-dimensional human body model based on a single-camera human body image, exploiting the low cost, easy installation, and user-friendliness of single-camera image acquisition. On this basis, the ease of acquiring single-camera human body images not only reduces the construction cost of the human body information prediction model, but also makes reconstructing three-dimensional human body models faster and simpler. In the reconstruction process, relying on at least one kind of human body information output by the human body information prediction model can effectively improve the accuracy and robustness of model reconstruction.
  • the three-dimensional human body model is obtained through the reconstruction of body parameters and action posture parameters, which provides accurate and reliable technical solutions for the technical fields of single-camera virtual live broadcast, single-camera intelligent interaction technology, human body recognition, criminal investigation monitoring, and movie games.
  • the present application also provides an electronic device, including a processor and a memory for storing instructions executable by the processor, wherein the processor implements the three-dimensional human body model reconstruction method when executing the instructions.
  • the apparatus 800 may include:
  • an acquisition module 801 configured to acquire a single-camera human body image
  • an information prediction module 803 configured to input the single-camera human body image into a human body information prediction model, and output the target human body information in the single-camera human body image through the human body information prediction model;
  • the model determination module 805 is configured to determine a three-dimensional human body model by using the body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information.
  • the target human body information includes at least one of the following information: two-dimensional human body joint points, bone orientation, foreground and background segmentation results, and texture mapping information.
  • the acquisition module includes:
  • Image acquisition sub-module used to acquire single-camera images including human body images
  • the human body detection sub-module is used for performing human body detection on the single-camera image, and intercepting the single-camera human body image from the single-camera image.
  • the human body information prediction model is obtained through training by the following sub-modules:
  • a sample acquisition sub-module for acquiring a plurality of single-camera human body sample images, wherein human body information is marked in the single-camera human body sample images
  • a model building submodule used for building a human body information prediction model, wherein model parameters are set in the human body information prediction model
  • a prediction result generation sub-module for inputting the single-camera human body sample image into the human body information prediction model to generate a prediction result
  • the iterative adjustment sub-module is configured to iteratively adjust the model parameters based on the difference between the prediction result and the human body information, until the difference meets a preset requirement.
  • the single-camera human body sample images are acquired by the following sub-modules:
  • the image acquisition sub-module is used to acquire multiple single-camera images of the same human body simultaneously from different angles by using multiple cameras;
  • a model reconstruction sub-module for reconstructing the three-dimensional human body model of the human body by using the plurality of single-camera images
  • a human body information acquisition sub-module configured to project the three-dimensional human body model of the human body into the multiple single-camera images respectively, and obtain the human body information in the multiple single-camera images respectively;
  • an image segmentation sub-module configured to separate a human body image from each of the multiple single-camera images according to the human body information, and use the multiple human body images as the single-camera human body sample images used for training the human body information prediction model.
  • the model determination module includes:
  • the alternate adjustment sub-module is used to alternately fix body parameters or action posture parameters, adjust action posture parameters or body parameters, and generate a three-dimensional human body model, so that the human body information determined by the three-dimensional human body model matches the target human body information.
  • the alternate adjustment sub-module includes:
  • the prediction model generation unit is used to alternately fix the body parameters or the action posture parameters, adjust the action posture parameters or the body parameters, and generate a predicted three-dimensional human body model;
  • a human body information obtaining unit configured to project the predicted three-dimensional human body model into the single-camera human body image to obtain predicted human body information
  • an iterative adjustment unit configured to iteratively adjust the action posture parameters or the body parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference or the number of iterations satisfies a preset requirement.
  • the iterative adjustment unit includes:
  • a priori result obtaining subunit configured to obtain the prior probability distribution result and prior probability target value of the action posture parameter and/or the body parameter
  • an iterative adjustment subunit configured to iteratively adjust the action posture parameters and/or the body parameters based on the difference between the predicted human body information and the target human body information, and on the difference between the prior probability distribution result of the action posture parameters and/or the body parameters and the prior probability target value, until at least one of the difference or the number of iterations satisfies a preset requirement.
  • when the number N of the single-camera human body images is greater than or equal to 2, and the N single-camera human body images belong to the same human body, the model determination module includes:
  • the multi-model determination submodule is used to alternately fix the body parameters or the action posture parameters based on the N single-camera human body images, adjust the action posture parameters or the body parameters, and jointly optimize to generate N three-dimensional human body models with the same body parameters, so that the human body information determined by each of the N three-dimensional human body models matches the corresponding target human body information.
  • the multi-model determination submodule includes:
  • the prediction model generation unit is used to alternately fix the body parameters or the action posture parameters, adjust the action posture parameters or the body parameters, and generate N predicted three-dimensional human body models, wherein, in the case of fixing the action posture parameters to adjust the body parameters, jointly optimize the N predicted body parameters of the 3D human body model;
  • a predicted human body information obtaining unit configured to respectively project the N predicted three-dimensional human body models into the corresponding single-camera human body images to obtain predicted human body information
  • an iterative adjustment unit configured to iteratively adjust the action posture parameters or the body parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference or the number of iterations satisfies a preset requirement.
  • the multi-model determination submodule further includes:
  • the optimization and adjustment unit is used to use and fix the jointly optimized body parameters of the N predicted three-dimensional human body models when reconstructing three-dimensional human body models for subsequent single-camera human body images, and to adjust the action posture parameters until at least one of the difference or the number of iterations satisfies a preset requirement.
  • the three-dimensional human body model includes a three-dimensional model composed of a preset number of multiple polygonal meshes connected to each other, and the positions of the mesh vertices of the polygonal meshes are determined by the body parameters and the action posture parameters.
  • the body parameters include at least one of a height parameter, a bone length parameter, and a build (fat or thin) parameter.
  • Another aspect of the present application further provides a computer-readable storage medium, on which computer instructions are stored, and when the instructions are executed, implement the steps of the method in any of the foregoing embodiments.
  • the computer-readable storage medium may include a physical device for storing information, usually after digitizing the information and then storing it in a medium using electrical, magnetic or optical means.
  • the computer-readable storage medium described in this embodiment may include: devices that use electrical energy to store information, such as various memories like RAM and ROM; devices that use magnetic energy to store information, such as hard disks, floppy disks, magnetic tapes, magnetic core memory, magnetic bubble memory, and U disks (USB flash drives); and devices that use optical means to store information, such as CDs or DVDs.
  • there are other readable storage media such as quantum memory, graphene memory, and so on.
  • a Programmable Logic Device (such as a Field Programmable Gate Array (FPGA)) is an integrated circuit whose logic function is determined by user programming of the device.
  • hardware description languages (HDLs) include ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), HDCal, JHDL, Lava, Lola, MyHDL, PALASM, RHDL, VHDL (Very-High-Speed Integrated Circuit Hardware Description Language), and Verilog.
  • the controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, or the form of logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, or embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory.
  • in addition to implementing the controller purely as computer-readable program code, the method steps can be logically programmed so that the controller realizes the same function in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the means it includes for realizing various functions can be regarded as structures within the hardware component. The means for realizing various functions can even be regarded as both software modules implementing a method and structures within a hardware component.
  • a typical implementation device is a computer.
  • the computer can be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
  • embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-persistent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM).
  • Computer-readable media include both permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
  • computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.
  • the embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • the application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computer storage media including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

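The present application relates to a three-dimensional human body model reconstruction method and apparatus, an electronic device, and a storage medium. The method includes: acquiring a single-camera human body image; inputting the single-camera human body image into a human body information prediction model, and outputting, via the human body information prediction model, target human body information in the single-camera human body image; and determining a three-dimensional human body model by using body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information. With the three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium provided by the embodiments of the present application, a three-dimensional human body model can be reconstructed quickly and accurately.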

Description

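Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium
This application claims priority to Chinese Patent Application No. 202110083740.6, entitled "Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium" and filed with the China Patent Office on January 21, 2021, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the technical field of computer vision, and in particular to a three-dimensional human body model reconstruction method and apparatus, an electronic device, and a storage medium.
Background
Human pose capture plays an important role in many fields, typically film, games, criminal investigation, video surveillance, and so on. In the related art, some costly and complicated approaches can indeed capture fairly accurate human body shape and pose; for example, attaching reflective capture markers to the human body makes it possible to obtain accurate body shape and pose. However, this approach requires multiple expensive cameras and a dedicated venue, is costly, and causes considerable discomfort to the user, which hinders its development and adoption.
Single-camera human body image capture is low-cost, easy to install, and user-friendly, but a single-camera human body image generally carries only two-dimensional information from a single viewpoint and can hardly provide three-dimensional information. Therefore, to obtain a vivid and realistic human pose, a three-dimensional human body model needs to be reconstructed from the single-camera human body image. However, three-dimensional human body models currently reconstructed from single-camera human body images tend to have low accuracy, making it difficult to capture a realistic human pose.
Therefore, the related art urgently needs a more accurate way of reconstructing a three-dimensional human body model from single-camera human body images.
Summary
The purpose of the embodiments of the present application is to provide a three-dimensional human body model reconstruction method and apparatus, an electronic device, and a storage medium that can reconstruct a three-dimensional human body model quickly and accurately.
The three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium provided by the embodiments of the present application are implemented as follows: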
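A three-dimensional human body model reconstruction method, the method including:
acquiring a single-camera human body image;
inputting the single-camera human body image into a human body information prediction model, and outputting, via the human body information prediction model, target human body information in the single-camera human body image;
determining a three-dimensional human body model by using body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information.
Optionally, in an embodiment of the present application, the target human body information includes at least one of the following: two-dimensional human body joint points, bone orientations, a foreground-background segmentation result, and texture mapping information.
Optionally, in an embodiment of the present application, acquiring the single-camera human body image includes:
acquiring a single-camera image that contains a human body image;
performing human body detection on the single-camera image, and cropping the single-camera human body image from the single-camera image.
Optionally, in an embodiment of the present application, the human body information prediction model is configured to be obtained by training in the following manner:
acquiring a plurality of single-camera human body sample images, the single-camera human body sample images being annotated with human body information;
constructing a human body information prediction model in which model parameters are set;
inputting the single-camera human body sample images into the human body information prediction model to generate prediction results;
iteratively adjusting the model parameters based on the difference between the prediction results and the human body information, until the difference meets a preset requirement.
Optionally, in an embodiment of the present application, the single-camera human body sample images are configured to be acquired in the following manner:
capturing a plurality of single-camera images of the same human body simultaneously from different angles by using multiple cameras;
reconstructing a three-dimensional human body model of the human body from the plurality of single-camera images;
projecting the three-dimensional human body model of the human body into each of the plurality of single-camera images, and obtaining the human body information in each of the plurality of single-camera images;
segmenting a human body image from each of the plurality of single-camera images according to the human body information, and using the resulting human body images as the single-camera human body sample images for training the human body information prediction model.
Optionally, in an embodiment of the present application, determining the three-dimensional human body model by using body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information, includes:
alternately fixing the body parameters or the action posture parameters and adjusting the action posture parameters or the body parameters to generate a three-dimensional human body model, so that the human body information determined by the three-dimensional human body model matches the target human body information.
Optionally, in an embodiment of the present application, alternately fixing the body parameters or the action posture parameters and adjusting the action posture parameters or the body parameters to generate a three-dimensional human body model includes:
alternately fixing the body parameters or the action posture parameters and adjusting the action posture parameters or the body parameters to generate a predicted three-dimensional human body model;
projecting the predicted three-dimensional human body model into the single-camera human body image to obtain predicted human body information;
iteratively adjusting the action posture parameters or the body parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference or the number of iterations meets a preset requirement.
Optionally, in an embodiment of the present application, iteratively adjusting the action posture parameters or the body parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference or the number of iterations meets a preset requirement, includes:
obtaining a prior probability distribution result and a prior probability target value of the action posture parameters and/or the body parameters;
iteratively adjusting the action posture parameters and/or the body parameters based on the difference between the predicted human body information and the target human body information and on the difference between the prior probability distribution result of the action posture parameters and/or the body parameters and the prior probability target value, until at least one of the difference or the number of iterations meets a preset requirement.
Optionally, in an embodiment of the present application, in a case where the number N of single-camera human body images is greater than or equal to 2 and the N single-camera human body images belong to the same human body, determining the three-dimensional human body model by using body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information, includes:
based on the N single-camera human body images, alternately fixing the body parameters or the action posture parameters, adjusting the action posture parameters or the body parameters, and jointly optimizing to generate N three-dimensional human body models having the same body parameters, so that the human body information determined by each of the N three-dimensional human body models matches the corresponding target human body information.
Optionally, in an embodiment of the present application, alternately fixing the body parameters or the action posture parameters, adjusting the action posture parameters or the body parameters, and jointly optimizing to generate N three-dimensional human body models having the same body parameters includes:
alternately fixing the body parameters or the action posture parameters and adjusting the action posture parameters or the body parameters to generate N predicted three-dimensional human body models, wherein, when the action posture parameters are fixed and the body parameters are adjusted, the body parameters of the N predicted three-dimensional human body models are jointly optimized;
projecting the N predicted three-dimensional human body models into the corresponding single-camera human body images respectively to obtain predicted human body information;
iteratively adjusting the action posture parameters or the body parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference or the number of iterations meets a preset requirement.
Optionally, in an embodiment of the present application, after the body parameters of the N predicted three-dimensional human body models are jointly optimized, when three-dimensional human body models are reconstructed for subsequent single-camera human body images, the jointly optimized body parameters of the N predicted three-dimensional human body models are used and fixed, and the action posture parameters are adjusted until at least one of the difference or the number of iterations meets a preset requirement.
Optionally, in an embodiment of the present application, the three-dimensional human body model includes a three-dimensional model composed of a preset number of interconnected polygonal meshes, and the positions of the mesh vertices of the polygonal meshes are determined by the body parameters and the action posture parameters.
Optionally, in an embodiment of the present application, the body parameters include at least one of a height parameter, a bone length parameter, and a build (fat or thin) parameter.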
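A three-dimensional human body model reconstruction apparatus, including:
an acquisition module, configured to acquire a single-camera human body image;
an information prediction module, configured to input the single-camera human body image into a human body information prediction model, and output, via the human body information prediction model, target human body information in the single-camera human body image;
a model determination module, configured to determine a three-dimensional human body model by using body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information.
An electronic device, including a processor and a memory for storing instructions executable by the processor, where the processor implements the three-dimensional human body model reconstruction method described above when executing the instructions.
A non-transitory computer-readable storage medium, where instructions in the storage medium, when executed by a processor, enable the processor to perform the three-dimensional human body model reconstruction method described above.
The three-dimensional human body model reconstruction method provided by the present application can reconstruct a three-dimensional human body model based on a single-camera human body image, exploiting the low cost, easy installation, and user-friendliness of single-camera image capture. On this basis, the ease of capturing single-camera human body images not only reduces the cost of building the human body information prediction model, but also makes reconstructing a three-dimensional human body model faster and simpler. During reconstruction, relying on at least one kind of human body information output by the human body information prediction model effectively improves the accuracy and robustness of model reconstruction. Reconstructing the three-dimensional human body model through body parameters and action posture parameters provides an accurate and reliable technical solution for fields such as single-camera virtual live streaming, single-camera intelligent interaction, human body recognition, criminal investigation and surveillance, and film and games.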
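Brief Description of the Drawings
The accompanying drawings here are incorporated into and constitute a part of this specification, illustrate embodiments consistent with the present application, and together with the specification serve to explain the principles of the present application.
Figure 1 is a schematic flowchart of a three-dimensional human body model reconstruction method according to an exemplary embodiment.
Figure 2 is a schematic flowchart of a three-dimensional human body model reconstruction method according to an exemplary embodiment.
Figure 3 is a schematic flowchart of a three-dimensional human body model reconstruction method according to an exemplary embodiment.
Figure 4 is a schematic flowchart of a three-dimensional human body model reconstruction method according to an exemplary embodiment.
Figure 5 is a schematic flowchart of a three-dimensional human body model reconstruction method according to an exemplary embodiment.
Figure 6 is a schematic flowchart of a three-dimensional human body model reconstruction method according to an exemplary embodiment.
Figure 7 is a block diagram of a three-dimensional human body model reconstruction apparatus according to an exemplary embodiment.
Figure 8 is a block diagram of a three-dimensional human body model reconstruction apparatus according to an exemplary embodiment.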
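Detailed Description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
Existing RGB single-camera capture techniques still suffer from problems such as poor accuracy, the need for manual intervention (for example, a full-body scan of the performer in advance, or a manually supplied initial state of the performer), and an inability to run in real time, which introduces lag. The present invention focuses on these pain points and proposes a single-camera human motion capture solution that is highly accurate, fully automatic, robust, and runs in real time.
The solution places no requirement on the initial state and can run in real time, for example at about 30 frames per second, or at about 20 or 25 frames per second.
The three-dimensional human body model reconstruction method of the present application is described in detail below with reference to the accompanying drawings. Figure 1 is a schematic flowchart of an embodiment of the three-dimensional human body model reconstruction method provided by the present application. Although the present application provides the method operation steps shown in the following embodiments or drawings, the method may include more or fewer operation steps on the basis of routine or non-inventive effort. For steps that have no logically necessary causal relationship, the execution order is not limited to the order provided in the embodiments of the present application. When the method is executed in an actual three-dimensional human body model reconstruction process or by an apparatus, it may be executed sequentially or in parallel according to the method shown in the embodiments or drawings (for example, in a parallel-processor or multi-threaded environment).
Specifically, an embodiment of the three-dimensional human body model reconstruction method provided by the present application is shown in Figure 1, and the method may include:
S101: Acquire a single-camera human body image.
S103: Input the single-camera human body image into a human body information prediction model, and output, via the human body information prediction model, the target human body information in the single-camera human body image.
S105: Determine a three-dimensional human body model by using body parameters and action posture parameters, so that the human body information determined by the three-dimensional human body model matches the target human body information.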
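In the embodiments of the present application, the single-camera human body image may include a human body image captured by a single camera. The single camera may include a single image capture device, for example a single-lens reflex camera or a smart device with a camera function (such as a smartphone, a tablet computer, or a smart wearable device), and the camera may be an RGB camera, an RGBD camera, and so on. In the embodiments of the present application, the single-camera human body image may be an image in any format, such as an RGB image or a grayscale image. In a real application environment, an image captured by the camera contains not only the human body but also the background surrounding it. On this basis, a single-camera human body image that contains, as far as possible, only the human body can be cropped from the image captured by the single camera. Specifically, in one embodiment, a single-camera image that contains a human body image may first be acquired. Human body detection may then be performed on the single-camera image, and the single-camera human body image may be cropped from it. In one example, a machine-learning-based human body detection algorithm may be used to detect the human body in the single-camera image and crop the single-camera human body image from it. The human body detection algorithm may include R-CNN, Fast R-CNN, Faster R-CNN, TCDCN, MTCNN, YOLOv3, SSD, and other algorithms, which is not limited here.
In the embodiments of the present application, after the single-camera human body image is acquired, it may be input into the human body information prediction model, which outputs the target human body information in the single-camera human body image. The target human body information may include at least one of the following: two-dimensional human body joint points, bone orientations, a foreground-background segmentation result, and texture mapping information. The two-dimensional human body joint points are key points that characterize the movable joints of the human body; in some examples, they may include joints of the head, shoulders, neck, limbs, and other parts. For example, the Kinect algorithm can extract 25 two-dimensional human body joint points; other algorithms extract 17 or 18 key points, and the positions of the two-dimensional human body joint points can also be user-defined. The present application does not limit the number of two-dimensional human body feature points. The bone orientations may include two-dimensional bone orientations and three-dimensional bone orientations. A two-dimensional bone orientation may include the two-dimensional connection relationship and two-dimensional connection direction of human body joint points in the single-camera image, for example the connection relationship and connection direction between the top-of-head joint point and the neck joint point. A three-dimensional bone orientation may include the bone connection relationship between three-dimensional human body joint points and the orientation of that bone in three-dimensional space, for example the bone connecting the top of the head to the neck and its three-dimensional orientation, the bones connecting the neck to the two shoulders and their three-dimensional orientations, and the bones connecting the shoulder to the elbow to the hand and their three-dimensional orientations. The foreground-background segmentation result may include the segmentation between the human body and its background, and may be expressed as a segmentation curve, a segmentation rectangle, a segmentation mask, and so on. The texture mapping information refers to the mapping, established through texture coordinates, from pixels of the human body image to the three-dimensional human body model. In an embodiment of the present application, the three-dimensional human body model may include a three-dimensional model composed of a preset number of interconnected polygonal meshes, where the positions of the mesh vertices are determined by the body parameters and the action posture parameters. The polygonal meshes may include triangular meshes, pentagonal meshes, hexagonal meshes, and so on. Note that each edge of a polygonal mesh is shared with its adjacent polygonal mesh. Because the number of polygonal meshes is fixed, the number of mesh vertices of the three-dimensional human body model is also fixed. Because the body parameters and the action posture parameters are unknown in the initial stage, the mesh vertices of the three-dimensional human body model are at default positions. In the subsequent embodiments, reconstructing the three-dimensional human body model is the process of adjusting the positions of the mesh vertices. In some examples, each mesh vertex may have a unique identifier, which may for example be the (u, v) coordinates of the texture map, so that the texture mapping information may include a mapping from pixels of the human body image to the unique identifiers of the mesh vertices. In one example, the mapping may specify that the pixel at coordinates (29, 76) in the single-camera image corresponds to the mesh vertex with texture coordinates (0.6, 0.3) in the three-dimensional human body model.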
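As a rough illustration of how two-dimensional bone orientations relate to two-dimensional joint points, the following Python sketch derives a unit direction vector for each bone from the image coordinates of its two end joints. The joint names, coordinates, and skeleton layout are hypothetical and are not taken from the application, which does not specify how orientations are encoded.

```python
import math

# Hypothetical 2D joints in image pixels (names and values are illustrative only).
JOINTS_2D = {
    "head": (100.0, 40.0),
    "neck": (100.0, 80.0),
    "left_shoulder": (70.0, 90.0),
    "right_shoulder": (130.0, 90.0),
}

# Hypothetical skeleton: each bone is a (parent, child) pair of joint names.
BONES = [("head", "neck"), ("neck", "left_shoulder"), ("neck", "right_shoulder")]


def bone_directions(joints, bones):
    """Return a unit 2D direction vector for every (parent, child) bone."""
    directions = {}
    for parent, child in bones:
        dx = joints[child][0] - joints[parent][0]
        dy = joints[child][1] - joints[parent][1]
        length = math.hypot(dx, dy)
        if length == 0.0:
            raise ValueError(f"zero-length bone {parent}->{child}")
        directions[(parent, child)] = (dx / length, dy / length)
    return directions


DIRECTIONS = bone_directions(JOINTS_2D, BONES)
```

If the skeleton is a tree, a set of 18 joints yields 17 bones, which is consistent with the 18 joint points and 17 bone orientations used in the worked example later in this description.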
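In the embodiments of the present application, this single human body information prediction model can determine the multiple kinds of human body information described above in the single-camera human body image. In one embodiment, the human body information prediction model may include a multi-task machine learning model that performs multiple tasks, for example a multi-task deep learning network; the multi-task deep learning network of the embodiments of the present application can perform four prediction tasks. Because human body information such as two-dimensional human body joint points, bone orientations, foreground-background segmentation results, and texture mapping information is correlated, learning multiple kinds of human body information in the same model makes it possible to exploit the correlations among them to improve the accuracy of the human body information prediction model.
In an embodiment of training the human body information prediction model, as shown in Figure 2, the following steps may be included:
S201: Acquire a plurality of single-camera human body sample images, the single-camera human body sample images being annotated with human body information.
S203: Construct a human body information prediction model in which model parameters are set.
S205: Input the single-camera human body sample images into the human body information prediction model to generate prediction results.
S207: Iteratively adjust the model parameters based on the difference between the prediction results and the human body information, until the difference meets a preset requirement.
In the embodiments of the present application, the preset requirement may include, for example, that the value of the difference is less than a preset threshold. Because this is multi-task learning, the prediction result may include several of the two-dimensional human body joint points, three-dimensional bone orientations, foreground-background segmentation result, and texture mapping information, and the difference may be the sum of the differences between each prediction result and its corresponding annotated information. Of course, the human body information that the human body information prediction model can output is not limited to the above examples and may include any other human body information, which is not limited here. Note that the machine learning algorithm used to train the human body information prediction model may include a ResNet backbone network, a MobileNet backbone network, a VGG backbone network, and so on, which is not limited here.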
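The summed difference used as the training objective can be sketched as follows. This is a minimal Python illustration, assuming each task's output is flattened to a list of numbers and compared with mean squared error. The per-task error measure, the task names, and the optional weights are assumptions made for the example, since the application only states that the per-task differences are summed.

```python
def mean_squared_error(predicted, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(predicted) != len(target):
        raise ValueError("length mismatch")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def multitask_difference(predictions, annotations, weights=None):
    """Sum the per-task differences; `weights` is an optional, assumed extension."""
    weights = weights or {}
    total = 0.0
    for task, target in annotations.items():
        task_error = mean_squared_error(predictions[task], target)
        total += weights.get(task, 1.0) * task_error
    return total


# Hypothetical flattened outputs for two of the four tasks.
predictions = {"joints_2d": [1.0, 2.0], "segmentation": [0.0, 1.0]}
annotations = {"joints_2d": [1.0, 4.0], "segmentation": [0.0, 1.0]}
total_difference = multitask_difference(predictions, annotations)
```

In practice a segmentation task would more likely use a classification loss than a squared error; the single error measure here only keeps the sketch short.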
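In practice, training an accurate model usually requires a large amount of sample data, and annotating that data costs considerable time and labor. This is especially true for a human body information prediction model based on multi-task learning, which may require annotating multiple kinds of human body information on each single-camera human body sample image, such as two-dimensional human body joint points, bone orientations, foreground-background segmentation results, and texture mapping information. On this basis, in an embodiment of the present application, as shown in Figure 3, the single-camera human body sample images may be acquired in the following manner:
S301: Capture a plurality of single-camera images of the same human body simultaneously from different angles by using multiple cameras.
S303: Reconstruct a three-dimensional human body model of the human body from the plurality of single-camera images.
S305: Project the three-dimensional human body model into the plurality of single-camera images, and obtain the human body information in each of the plurality of single-camera images.
S307: Segment a human body image from each of the plurality of single-camera images according to the human body information, and use the resulting human body images as the single-camera human body sample images for training the human body information prediction model.
In the embodiments of the present application, multiple cameras may photograph the same human body simultaneously from multiple angles, yielding multiple single-camera images of that human body. For example, 5 cameras capture 5 images, so 5 single-camera images are obtained at once. The multiple single-camera images may then be used to reconstruct a three-dimensional human body model of the human body, from which the body parameters and the action posture parameters can be determined. Finally, the three-dimensional human body model may be projected back into each of the single-camera images to obtain, for each image, human body information that includes at least one of two-dimensional human body joint points, bone orientations, a foreground-background segmentation result, and texture mapping information. The three-dimensional human body model used here for multi-camera reconstruction and the one used later for single-camera reconstruction need to have the same topology, that is, the same vertex connectivity.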
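The projection step used to generate annotations can be illustrated with a simple pinhole camera model. The Python sketch below is a heavy simplification: cameras are described only by a position, a focal length, and a principal point, they all look along the +z axis without rotation, and the dictionary keys are hypothetical. A real multi-camera rig would also need calibrated rotations and lens distortion.

```python
def project_point(point, camera):
    """Project a 3D point into a camera that looks along +z (pinhole model)."""
    x = point[0] - camera["position"][0]
    y = point[1] - camera["position"][1]
    z = point[2] - camera["position"][2]
    if z <= 0.0:
        raise ValueError("point is behind the camera")
    u = camera["focal"] * x / z + camera["center"][0]
    v = camera["focal"] * y / z + camera["center"][1]
    return (u, v)


def label_views(joints_3d, cameras):
    """Project every 3D joint into every camera to get per-view 2D labels."""
    return [
        {name: project_point(p, cam) for name, p in joints_3d.items()}
        for cam in cameras
    ]


# Two hypothetical cameras observing one reconstructed 3D joint.
cameras = [
    {"position": (0.0, 0.0, -2.0), "focal": 100.0, "center": (50.0, 50.0)},
    {"position": (1.0, 0.0, -2.0), "focal": 100.0, "center": (50.0, 50.0)},
]
views = label_views({"neck": (0.5, 0.0, 0.0)}, cameras)
```

Each entry of `views` holds the two-dimensional joint labels for one camera, so a single three-dimensional reconstruction yields annotations for every view at once.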
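In a practical scenario, the single-camera image contains human body information; the single-camera image can be segmented according to that information, and the resulting human body image used as a single-camera human body sample image for training the human body information prediction model. In the above embodiment, projecting the three-dimensional human body model into the single-camera image yields the human body information of that image, which may include the foreground-background segmentation result. The human body image can therefore be separated from the single-camera image according to the foreground-background segmentation result to obtain the single-camera human body sample image. In one example, a BoundingBox algorithm may be used to separate the image.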
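One plausible reading of separating the human body image with a bounding box is to take the tightest rectangle around the foreground pixels of the segmentation mask and crop the image to it. The Python sketch below shows this under the assumption that the image and the mask are equally sized nested lists; the function names are hypothetical.

```python
def mask_bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, of the nonzero mask pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop(image, box):
    """Cut the inclusive rectangle `box` out of a row-major nested-list image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# A 4x4 toy image whose pixel values encode their own position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
human_crop = crop(image, mask_bounding_box(mask))
```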
本实施例中生成所述单相机人体样本图像的方式可以大量节省人工标注成本,并且可以用较少的时间获取到大量的样本的数据,降低获取训练样本的成本。
本申请实施例中,在利用所述人体信息预测模型输出所述目标人体信息之后,可以重建得到由身材参数和动作姿态参数所确定的三维人体模型,使得从所述三维人体模型确定的人体信息与所述目标人体信息相匹配。其中,所述身材参数可以用于表征人体的身材特点,其中可以包括身高、骨骼长度、胖瘦等信息。通过所述身材参数和所述动作姿态参数可以唯一地确定三维人体模型。基于此,在本申请重建三维人体模型中,可以通过不断调整身材参数变量和动作姿态参数变量,使得生成的三维人体模型所确定的人体信息与所述目标人体信息相匹配。
在本申请的一个实施例中,可以利用analysis-by-synthesis(合成-分析)算法调整参数确定所述三维人体模型。在本申请实施例中,可以交替固定身材参数或者动作姿态参数,调整动作姿态参数或者身材参数,生成三维人体模型,使得所述三维人体模型确定的人体信息与所述目标人体信息相匹配。本实施例中,可以采用“固定身材参数优化动作姿态参数”和“固定动作姿态参数优化身材参数”这种交替优化的策略,相对于同时优化身材参数和动作姿态参数的策略,这种交替优化的方式能够使得所述三维人体模型快速收敛,提升优化效率。
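Specifically, in one embodiment of the present application, as shown in FIG. 4, alternately fixing the body shape parameters or the pose parameters and adjusting the other set to generate the 3D human body model may include:
S401: Alternately fix the body shape parameters or the pose parameters, adjust the pose parameters or the body shape parameters respectively, and generate a predicted 3D human body model;
S403: Project the predicted 3D human body model into the single-camera human body image to obtain predicted human body information;
S405: Iteratively adjust the pose parameters or the body shape parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference and the number of iterations meets a preset requirement.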
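The above embodiment provides one specific implementation of the alternating optimization. At the start of optimization, an initial 3D human body model may be provided, that is, the 3D human body model before any parameter has been optimized, which may be generated from default body shape parameters and default pose parameters. The default parameters may be the averages of the body shape parameters and pose parameters stored in a preset database, or they may be the body shape parameters and pose parameters reconstructed from the previous single-camera human body image frame; the present application does not limit this. Likewise, the present application does not limit whether "fix the body shape parameters and optimize the pose parameters" or "fix the pose parameters and optimize the body shape parameters" is performed first.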
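In one specific example, the body shape parameters are fixed first and the pose parameters are optimized. Specifically, the target human body information of single-camera human body image 1 is determined, for example 18 2D joint points, 17 bone directions, the foreground-background segmentation result, and texture coordinates. The initial 3D human body model is then projected into single-camera human body image 1 to obtain predicted human body information, for example 18 predicted 2D joint points, 17 predicted bone directions, a predicted foreground-background segmentation result, and predicted texture coordinates, and the difference between the predicted and target human body information is determined. The pose parameters are then adjusted based on that difference. After that, the pose parameters are fixed and the body shape parameters are optimized in the same way, which is not repeated here. The pose parameters and body shape parameters are adjusted in alternating iterations until at least one of the difference between the predicted and target human body information and the number of iterations meets the preset requirement.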
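It should be noted that the iterative adjustment may use gradient descent (gradient-based optimization), particle swarm optimization, or other methods; the present application does not limit this. The preset requirement for the difference may include the value of the difference being less than or equal to a preset threshold, which may be set to a value such as 0 or 0.01. The preset requirement for the number of iterations may include the number of iterations reaching a preset count, which may be set to, for example, 5 or 7. If the set of parameters in effect when at least one of the difference and the number of iterations meets the preset requirement is (body shape parameter 1, pose parameter 1), then (body shape parameter 1, pose parameter 1) determines the first predicted 3D human body model.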
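The alternating strategy and its two stopping conditions can be sketched on a deliberately tiny problem. Here the "body model" is a single bone whose length plays the role of the body shape parameter and whose angle plays the role of the pose parameter, and the "projection" is the 2D position of the bone's end joint. This toy model, the numerical gradient, and all names and default values are assumptions for illustration, not the application's implementation.

```python
import math
from typing import Callable, Tuple


def project(shape: float, pose: float) -> Tuple[float, float]:
    """Predicted 2D joint of a one-bone model (length = shape, angle = pose)."""
    return (shape * math.cos(pose), shape * math.sin(pose))


def difference(shape: float, pose: float, target: Tuple[float, float]) -> float:
    """Squared distance between the predicted and the target 2D joint."""
    px, py = project(shape, pose)
    return (px - target[0]) ** 2 + (py - target[1]) ** 2


def descend(f: Callable[[float], float], x: float,
            lr: float = 0.1, steps: int = 50, eps: float = 1e-6) -> float:
    """Plain gradient descent on one variable using a central-difference gradient."""
    for _ in range(steps):
        grad = (f(x + eps) - f(x - eps)) / (2 * eps)
        x -= lr * grad
    return x


def alternate_fit(target: Tuple[float, float], shape: float = 1.0, pose: float = 0.0,
                  threshold: float = 1e-8, max_rounds: int = 7):
    """Alternate pose and shape updates until the difference or the round count stops us."""
    rounds = 0
    while rounds < max_rounds:
        # Fix the shape parameter, optimize the pose parameter.
        pose = descend(lambda p: difference(shape, p, target), pose)
        # Fix the pose parameter, optimize the shape parameter.
        shape = descend(lambda s: difference(s, pose, target), shape)
        rounds += 1
        if difference(shape, pose, target) <= threshold:
            break
    return shape, pose, rounds
```

Calling `alternate_fit(project(1.2, 0.7))` starts from the default parameters and should recover a bone length near 1.2 and an angle near 0.7 within the allowed rounds.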
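In practice, many different 3D human body models can be reconstructed from the same input, so the reconstruction is ambiguous to some degree. For example, the reconstructed 3D human body model may not be in a natural, realistic human state. For this reason, in the embodiments of the present application, a prior probability distribution result and a prior probability target value may also be obtained for at least one of the body shape parameters and the pose parameters; comparing the two keeps the prior probability distribution result within a reasonable range. The prior probability target value may be determined from a large amount of actually captured human body data, which effectively reduces the ambiguity of the reconstructed 3D human body model.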
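Specifically, in one embodiment of the present application, as shown in FIG. 5, iteratively adjusting the pose parameters or the body shape parameters based on the difference between the predicted and target human body information, until at least one of the difference and the number of iterations meets the preset requirement, may include:
S501: Obtain the prior probability distribution result and the prior probability target value of the pose parameters and/or the body shape parameters;
S503: Iteratively adjust the pose parameters and/or the body shape parameters based on both the difference between the predicted human body information and the target human body information and the difference between the prior probability distribution result and the prior probability target value of the pose parameters and/or the body shape parameters, until at least one of the differences and the number of iterations meets the preset requirement.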
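In the embodiments of the present application, the difference between the prior probability distribution result and the prior probability target value of the pose parameters and/or the body shape parameters may also serve as a convergence condition for the 3D human body model, which effectively reduces the ambiguity of the reconstructed model.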
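One common way to realize such a prior constraint is to add a penalty term to the objective. The sketch below assumes an independent Gaussian prior per parameter, with a mean and standard deviation estimated from captured data, and a hypothetical weight that balances the two terms; the application does not specify the form of the prior, so all of this is an illustrative assumption.

```python
from typing import Sequence


def prior_difference(params: Sequence[float], prior_mean: Sequence[float],
                     prior_std: Sequence[float]) -> float:
    """Squared, standard-deviation-normalized distance of the parameters from the prior mean."""
    return sum(((p - m) / s) ** 2 for p, m, s in zip(params, prior_mean, prior_std))


def total_difference(data_difference: float, params: Sequence[float],
                     prior_mean: Sequence[float], prior_std: Sequence[float],
                     weight: float = 1.0) -> float:
    """Objective = image-based difference + weighted prior penalty."""
    return data_difference + weight * prior_difference(params, prior_mean, prior_std)
```

With this objective, a parameter set that explains the image well but lies far from plausible human shapes or poses is penalized, which pushes the optimizer toward natural-looking reconstructions.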
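In many real-time 3D human body model reconstruction scenarios, such as live streaming and film shooting, only one person with the same identity, that is, the same human body, is captured over a period of time. The body shape parameters of the same human body are fixed, so when reconstructing 3D human body models from subsequent single-camera human body images, the body shape parameters obtained from single-camera human body image 1 can be reused and only the pose parameters need to be optimized, which simplifies the optimization and makes reconstruction more efficient.
For such scenarios, in which the same human body is captured continuously for a long time, multiple single-camera human body images of the user may be optimized jointly to improve the stability and accuracy of the body shape parameters as well as reconstruction efficiency. In one embodiment, multiple single-camera human body images of the same user are acquired, for example 20 frames captured in real time in different poses. The same alternating strategy of "fix the body shape parameters and optimize the pose parameters" and "fix the pose parameters and optimize the body shape parameters" can be used. In this example, the number of single-camera human body images taking part in the joint optimization is assumed to be N, and the N images belong to the same human body.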
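Further, based on the N single-camera human body images, the body shape parameters and the pose parameters may be fixed in alternation while the other set is adjusted, jointly optimizing N 3D human body models that share the same body shape parameters, such that the human body information determined by each of the N models matches the corresponding target human body information. As shown in FIG. 6, one specific embodiment includes:
S601: Alternately fix the body shape parameters or the pose parameters, adjust the other set, and generate N predicted 3D human body models, where, when the pose parameters are fixed and the body shape parameters are adjusted, the body shape parameters of the N predicted 3D human body models are optimized jointly;
S603: Project each of the N predicted 3D human body models into its corresponding single-camera human body image to obtain predicted human body information;
S605: Iteratively adjust the pose parameters or the body shape parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference and the number of iterations meets the preset requirement.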
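In the embodiments of the present application, because the body shape parameters of one human body are the same across images, the body shape parameters of the N predicted 3D human body models can be optimized jointly when the pose parameters are fixed. The technical solution of this embodiment thus keeps the fast convergence and high reconstruction efficiency of alternating optimization and also exploits the fact that the body shape parameters of one human body are identical to optimize multiple single-camera human body images jointly. A single optimization therefore reconstructs multiple 3D human body models, which greatly improves reconstruction efficiency.
In the embodiments of the present application, after the body shape parameters of the N predicted 3D human body models have been jointly optimized, the jointly optimized body shape parameters are used and held fixed when reconstructing 3D human body models from subsequent single-camera human body images, and only the pose parameters are adjusted, until at least one of the difference and the number of iterations meets the preset requirement.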
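The above embodiment is illustrated below with a specific example. First, the body shape parameters are fixed and the pose parameters are optimized. Specifically, the target human body information of each of the N single-camera human body images is determined using the implementations of S101 and S103. N initial 3D human body models are then obtained; they may be obtained as described in the above embodiments, which is not limited here. The N initial 3D human body models are projected into their corresponding single-camera human body images to obtain N pieces of first predicted human body information, and the N differences between these and the corresponding target human body information are determined. Based on the N differences, the pose parameters of the N models are adjusted individually, giving N parameter sets (body shape parameter 1, pose parameter 1), (body shape parameter 1, pose parameter 2), ..., (body shape parameter 1, pose parameter N), which determine N first predicted 3D human body models. Next, the pose parameters are fixed and the body shape parameters are optimized jointly. Specifically, the N first predicted 3D human body models are projected into their corresponding single-camera human body images to obtain N pieces of second predicted human body information, and the N differences between these and the corresponding target human body information are determined: △1, △2, ..., △N, with sum Σ△ = △1 + △2 + ... + △N. Based on Σ△, the body shape parameters of the N models are adjusted, giving N parameter sets (body shape parameter X, pose parameter 1), (body shape parameter X, pose parameter 2), ..., (body shape parameter X, pose parameter N), which determine N second predicted 3D human body models. The pose parameters and body shape parameters are adjusted in alternating iterations until at least one of the difference between the predicted and target human body information and the number of iterations meets the preset requirement.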
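The joint optimization over N images can be sketched with a toy one-bone model: each frame has its own pose angle, all frames share one bone length as the body shape parameter, and the shape update is driven by the sum of the per-frame differences, corresponding to Σ△. The toy model, the step sizes, and every name below are assumptions for illustration only.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def frame_difference(shape: float, pose: float, target: Point) -> float:
    """Squared distance between the predicted and target 2D joint of one frame."""
    dx = shape * math.cos(pose) - target[0]
    dy = shape * math.sin(pose) - target[1]
    return dx * dx + dy * dy


def descend(f: Callable[[float], float], x: float, lr: float,
            steps: int = 50, eps: float = 1e-6) -> float:
    """Gradient descent on one variable with a central-difference gradient."""
    for _ in range(steps):
        x -= lr * (f(x + eps) - f(x - eps)) / (2 * eps)
    return x


def joint_fit(targets: Sequence[Point], shape: float = 1.0,
              threshold: float = 1e-8, max_rounds: int = 7):
    """Fit one shared shape and N per-frame poses by alternating updates."""
    n = len(targets)
    poses: List[float] = [0.0] * n

    def summed_difference(s: float) -> float:
        # Sum of the N per-frame differences for a candidate shared shape.
        return sum(frame_difference(s, p, t) for p, t in zip(poses, targets))

    for _ in range(max_rounds):
        # Fix the shared shape; adjust each frame's pose from its own difference.
        for i, target in enumerate(targets):
            poses[i] = descend(lambda p: frame_difference(shape, p, target), poses[i], lr=0.1)
        # Fix all poses; adjust the shared shape from the summed difference.
        # The step is scaled by 1/n so that it stays stable as n grows.
        shape = descend(summed_difference, shape, lr=0.1 / n)
        if summed_difference(shape) <= threshold:
            break
    return shape, poses
```

Once the shared shape has been estimated this way, later frames can skip the shape step entirely and run only the pose update with the shape held fixed.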
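Further, once body shape parameter X has been obtained, it can be reused when reconstructing 3D human body models from single-camera images acquired afterwards, so that only the pose parameters are optimized for those images. This simplifies the optimization and makes reconstruction more efficient.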
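It should be noted that the iterative adjustment may use gradient descent (gradient-based optimization), particle swarm optimization, or other methods; the present application does not limit this. The preset requirement for the difference may include the value of the difference being less than or equal to a preset threshold, which may be set to a value such as 0 or 0.01. The preset requirement for the number of iterations may include the number of iterations reaching a preset count, which may be set to, for example, 5 or 7.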
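In the embodiments of the present application, prior probabilities may also be used to constrain the body shape parameters and/or the pose parameters when N single-camera human body images are optimized jointly, making the reconstructed 3D human body models more realistic.
In the embodiments of the present application, once the 3D human body model has been determined, a vivid, lifelike human figure can be obtained from it. For example, in a live streaming scenario, the 3D human body model of a backstage performer can be rendered onto an animated character, so that the animated character appears to stream live. In a game scenario, the 3D human body model of a player can be rendered onto a game character to produce a vivid game scene. The method can of course also be used in animation production, film production, and many other scenarios, which the present application does not limit.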
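The 3D human body model reconstruction method provided by the present application can run in an offline mode and in a real-time mode. In the offline mode, 3D human body models are reconstructed from offline video; the models need not be output immediately, and the results can be used in post-production for animation, film, and television. The real-time mode applies to fields that require real-time interaction with the user, such as interactive games and live streaming; with GPU acceleration, the method can run in real time, meaning that the 3D human body model is output as soon as an image is obtained and the delay in between is hardly perceptible to the user. Supporting both modes allows the method to be applied more widely.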
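The 3D human body model reconstruction method provided by the present application reconstructs a 3D human body model from single-camera human body images, and so benefits from the low cost, easy installation, and user-friendliness of single-camera image capture. Because single-camera human body images are easy to capture, the cost of building the human body information prediction model is reduced and reconstructing the 3D human body model becomes faster and simpler. During reconstruction, using at least one kind of human body information output by the human body information prediction model effectively improves the accuracy and robustness of the reconstruction. Reconstructing the 3D human body model from body shape parameters and pose parameters provides an accurate and reliable technical solution for fields such as single-camera virtual live streaming, single-camera intelligent interaction, human body recognition, criminal investigation and surveillance, and film and games.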
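Corresponding to the above 3D human body model reconstruction method, as shown in FIG. 7, the present application further provides an electronic device including a processor and a memory for storing processor-executable instructions; when the processor executes the instructions, the 3D human body model reconstruction method of any of the above embodiments can be implemented.
In another aspect, the present application further provides a 3D human body model reconstruction apparatus. As shown in FIG. 8, the apparatus 800 may include:
an acquisition module 801, configured to acquire a single-camera human body image;
an information prediction module 803, configured to input the single-camera human body image into a human body information prediction model, which outputs the target human body information in the single-camera human body image;
a model determination module 805, configured to determine a 3D human body model using body shape parameters and pose parameters, such that the human body information determined by the 3D human body model matches the target human body information.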
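The division into three modules can be sketched as a small pipeline object. The class name, the field names, and the use of plain callables for each module are assumptions for illustration; the application only specifies what each module is configured to do.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass
class ReconstructionApparatus:
    """Chains the three modules of apparatus 800 (hypothetical structure)."""
    acquire: Callable[[], Any]                         # acquisition module 801
    predict: Callable[[Any], Dict[str, Any]]           # information prediction module 803
    fit: Callable[[Dict[str, Any]], Tuple[Any, Any]]   # model determination module 805

    def run(self) -> Tuple[Any, Any]:
        image = self.acquire()              # single-camera human body image
        target_info = self.predict(image)   # target human body information
        return self.fit(target_info)        # (body shape parameters, pose parameters)
```

Keeping the modules as interchangeable callables makes it straightforward to swap the prediction model or the optimizer without touching the rest of the pipeline.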
Modules corresponding to the other claims may likewise be included, as follows.
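Optionally, in one embodiment of the present application, the target human body information includes at least one of the following: 2D human joint points, bone directions, a foreground-background segmentation result, and texture mapping information.
Optionally, in one embodiment of the present application, the acquisition module includes:
an image acquisition submodule, configured to acquire a single-camera image containing a human body image;
a human body detection submodule, configured to perform human body detection on the single-camera image and crop the single-camera human body image from it.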
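Optionally, in one embodiment of the present application, the human body information prediction model is trained using the following submodules:
a sample acquisition submodule, configured to acquire multiple single-camera human body sample images, each annotated with human body information;
a model construction submodule, configured to construct a human body information prediction model in which model parameters are set;
a prediction result generation submodule, configured to input the single-camera human body sample images into the human body information prediction model to generate prediction results;
an iterative adjustment submodule, configured to iteratively adjust the model parameters based on the difference between the prediction results and the human body information, until the difference meets a preset requirement.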
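Optionally, in one embodiment of the present application, the single-camera human body sample images are acquired using the following modules:
an image acquisition submodule, configured to use multiple cameras to capture, simultaneously and from different angles, multiple single-camera images of the same human body;
a model reconstruction submodule, configured to reconstruct a 3D human body model of the human body from the multiple single-camera images;
a human body information acquisition submodule, configured to project the 3D human body model of the human body into each of the multiple single-camera images to obtain the human body information in each of them;
an image segmentation submodule, configured to segment, according to the human body information, a human body image out of each of the multiple single-camera images, and to use the resulting human body images as the single-camera human body sample images for training the human body information prediction model.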
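Optionally, in one embodiment of the present application, the model determination module includes:
an alternating adjustment submodule, configured to alternately fix the body shape parameters or the pose parameters and adjust the other set to generate a 3D human body model, such that the human body information determined by the 3D human body model matches the target human body information.
Optionally, in one embodiment of the present application, the alternating adjustment submodule includes:
a predicted model generation unit, configured to alternately fix the body shape parameters or the pose parameters, adjust the other set, and generate a predicted 3D human body model;
a human body information acquisition unit, configured to project the predicted 3D human body model into the single-camera human body image to obtain predicted human body information;
an iterative adjustment unit, configured to iteratively adjust the pose parameters or the body shape parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference and the number of iterations meets a preset requirement.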
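Optionally, in one embodiment of the present application, the iterative adjustment unit includes:
a prior result acquisition subunit, configured to obtain the prior probability distribution result and the prior probability target value of the pose parameters and/or the body shape parameters;
an iterative adjustment subunit, configured to iteratively adjust the pose parameters and/or the body shape parameters based on both the difference between the predicted human body information and the target human body information and the difference between the prior probability distribution result and the prior probability target value of the pose parameters and/or the body shape parameters, until at least one of the differences and the number of iterations meets the preset requirement.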
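Optionally, in one embodiment of the present application, where the number N of single-camera human body images is greater than or equal to 2 and the N single-camera human body images belong to the same human body, the model determination module includes:
a multi-model determination submodule, configured to, based on the N single-camera human body images, alternately fix the body shape parameters or the pose parameters and adjust the other set, jointly optimizing N 3D human body models that share the same body shape parameters, such that the human body information determined by each of the N 3D human body models matches the corresponding target human body information.
Optionally, in one embodiment of the present application, the multi-model determination submodule includes:
a predicted model generation unit, configured to alternately fix the body shape parameters or the pose parameters, adjust the other set, and generate N predicted 3D human body models, where, when the pose parameters are fixed and the body shape parameters are adjusted, the body shape parameters of the N predicted 3D human body models are optimized jointly;
a predicted human body information acquisition unit, configured to project each of the N predicted 3D human body models into its corresponding single-camera human body image to obtain predicted human body information;
an iterative adjustment unit, configured to iteratively adjust the pose parameters or the body shape parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference and the number of iterations meets the preset requirement.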
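Optionally, in one embodiment of the present application, the multi-model determination submodule further includes:
an optimization adjustment unit, configured to, when reconstructing 3D human bodies from subsequent single-camera human body images, use and hold fixed the jointly optimized body shape parameters of the N predicted 3D human body models and adjust the pose parameters, until at least one of the difference and the number of iterations meets the preset requirement.
Optionally, in one embodiment of the present application, the 3D human body model includes a 3D model formed by a preset number of interconnected polygon meshes, where the positions of the mesh vertices of the polygon meshes are determined by the body shape parameters and the pose parameters.
Optionally, in one embodiment of the present application, the body shape parameters include at least one of a height parameter, a bone length parameter, and a parameter describing how heavy or slim the body is.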
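In another aspect, the present application further provides a computer-readable storage medium storing computer instructions that, when executed, implement the steps of the method of any of the above embodiments.
The computer-readable storage medium may include a physical device for storing information, typically by digitizing the information and then storing it in a medium that uses electrical, magnetic, or optical means. The computer-readable storage medium of this embodiment may include: devices that store information electrically, such as various memories, for example RAM, ROM, and USB flash drives; devices that store information magnetically, such as hard disks, floppy disks, magnetic tapes, magnetic core memory, and bubble memory; and devices that store information optically, such as CDs and DVDs. There are of course other kinds of readable storage media, such as quantum memory and graphene memory.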
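In the 1990s, an improvement to a technology could be clearly classified as either a hardware improvement (for example, an improvement to circuit structures such as diodes, transistors, and switches) or a software improvement (an improvement to a method flow). As technology has developed, however, many of today's improvements to method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented with hardware modules. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of making integrated circuit chips by hand, this programming is now mostly done with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled must be written in a specific programming language called a hardware description language (HDL). There is not one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art will also appreciate that a hardware circuit implementing a logical method flow can readily be obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.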
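A controller may be implemented in any suitable manner. For example, a controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by that (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, besides implementing a controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the means included in it for implementing various functions can be regarded as structures within the hardware component. The means for implementing various functions can even be regarded as both software modules implementing the method and structures within the hardware component.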
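The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementing device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or any combination of these devices.
For convenience of description, the above apparatus is described in terms of units divided by function. When implementing the present application, the functions of the units may of course be implemented in one or more pieces of software and/or hardware.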
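Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.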
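The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.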
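In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.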
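It should also be noted that the terms "comprise", "include", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.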
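The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described progressively; for parts that are the same or similar across embodiments, reference may be made from one to another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding parts of the method embodiments.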
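The above are merely embodiments of the present application and are not intended to limit it. Those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of its claims.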

Claims (16)

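  1. A three-dimensional human body model reconstruction method, characterized in that the method comprises:
    acquiring a single-camera human body image;
    inputting the single-camera human body image into a human body information prediction model, and outputting, via the human body information prediction model, target human body information in the single-camera human body image;
    determining a three-dimensional human body model using body shape parameters and pose parameters, such that human body information determined by the three-dimensional human body model matches the target human body information.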
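  2. The method according to claim 1, characterized in that the target human body information comprises at least one of the following: 2D human joint points, bone directions, a foreground-background segmentation result, and texture mapping information.
  3. The method according to claim 1, characterized in that acquiring the single-camera human body image comprises:
    acquiring a single-camera image containing a human body image;
    performing human body detection on the single-camera image, and cropping the single-camera human body image from the single-camera image.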
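  4. The method according to claim 1, characterized in that the human body information prediction model is trained as follows:
    acquiring multiple single-camera human body sample images, the single-camera human body sample images being annotated with human body information;
    constructing a human body information prediction model in which model parameters are set;
    inputting the single-camera human body sample images into the human body information prediction model to generate prediction results;
    iteratively adjusting the model parameters based on the difference between the prediction results and the human body information, until the difference meets a preset requirement.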
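  5. The method according to claim 4, characterized in that the single-camera human body sample images are acquired as follows:
    using multiple cameras to capture, simultaneously and from different angles, multiple single-camera images of the same human body;
    reconstructing a three-dimensional human body model of the human body from the multiple single-camera images;
    projecting the three-dimensional human body model of the human body into each of the multiple single-camera images to obtain the human body information in each of the multiple single-camera images;
    segmenting, according to the human body information, a human body image out of each of the multiple single-camera images, and using the resulting human body images as the single-camera human body sample images for training the human body information prediction model.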
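  6. The method according to claim 1, characterized in that determining the three-dimensional human body model using body shape parameters and pose parameters, such that the human body information determined by the three-dimensional human body model matches the target human body information, comprises:
    alternately fixing the body shape parameters or the pose parameters and adjusting the pose parameters or the body shape parameters to generate a three-dimensional human body model, such that the human body information determined by the three-dimensional human body model matches the target human body information.
  7. The method according to claim 6, characterized in that alternately fixing the body shape parameters or the pose parameters and adjusting the pose parameters or the body shape parameters to generate the three-dimensional human body model comprises:
    alternately fixing the body shape parameters or the pose parameters, adjusting the pose parameters or the body shape parameters, and generating a predicted three-dimensional human body model;
    projecting the predicted three-dimensional human body model into the single-camera human body image to obtain predicted human body information;
    iteratively adjusting the pose parameters or the body shape parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference and the number of iterations meets a preset requirement.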
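  8. The method according to claim 7, characterized in that iteratively adjusting the pose parameters or the body shape parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference and the number of iterations meets the preset requirement, comprises:
    obtaining a prior probability distribution result and a prior probability target value of the pose parameters and/or the body shape parameters;
    iteratively adjusting the pose parameters and/or the body shape parameters based on both the difference between the predicted human body information and the target human body information and the difference between the prior probability distribution result and the prior probability target value of the pose parameters and/or the body shape parameters, until at least one of the differences and the number of iterations meets the preset requirement.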
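  9. The method according to claim 1, characterized in that, where the number N of single-camera human body images is greater than or equal to 2 and the N single-camera human body images belong to the same human body, determining the three-dimensional human body model using body shape parameters and pose parameters, such that the human body information determined by the three-dimensional human body model matches the target human body information, comprises:
    based on the N single-camera human body images, alternately fixing the body shape parameters or the pose parameters and adjusting the pose parameters or the body shape parameters, jointly optimizing N three-dimensional human body models having the same body shape parameters, such that the human body information determined by each of the N three-dimensional human body models matches the corresponding target human body information.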
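  10. The method according to claim 9, characterized in that alternately fixing the body shape parameters or the pose parameters and adjusting the pose parameters or the body shape parameters, jointly optimizing N three-dimensional human body models having the same body shape parameters, comprises:
    alternately fixing the body shape parameters or the pose parameters, adjusting the pose parameters or the body shape parameters, and generating N predicted three-dimensional human body models, wherein, when the pose parameters are fixed and the body shape parameters are adjusted, the body shape parameters of the N predicted three-dimensional human body models are optimized jointly;
    projecting each of the N predicted three-dimensional human body models into its corresponding single-camera human body image to obtain predicted human body information;
    iteratively adjusting the pose parameters or the body shape parameters based on the difference between the predicted human body information and the target human body information, until at least one of the difference and the number of iterations meets the preset requirement.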
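  11. The method according to claim 10, characterized in that, after the body shape parameters of the N predicted three-dimensional human body models are jointly optimized, the method further comprises:
    when reconstructing three-dimensional human bodies from subsequent single-camera human body images, using and holding fixed the jointly optimized body shape parameters of the N predicted three-dimensional human body models, and adjusting the pose parameters, until at least one of the difference and the number of iterations meets the preset requirement.
  12. The method according to claim 1, characterized in that the three-dimensional human body model comprises a three-dimensional model formed by a preset number of interconnected polygon meshes, the positions of the mesh vertices of the polygon meshes being determined by the body shape parameters and the pose parameters.
  13. The method according to claim 1, characterized in that the body shape parameters comprise at least one of a height parameter, a bone length parameter, and a parameter describing how heavy or slim the body is.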
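  14. A three-dimensional human body model reconstruction apparatus, characterized by comprising:
    an acquisition module, configured to acquire a single-camera human body image;
    an information prediction module, configured to input the single-camera human body image into a human body information prediction model, and to output, via the human body information prediction model, target human body information in the single-camera human body image;
    a model determination module, configured to determine a three-dimensional human body model using body shape parameters and pose parameters, such that human body information determined by the three-dimensional human body model matches the target human body information.
  15. An electronic device, characterized by comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements the three-dimensional human body model reconstruction method according to any one of claims 1-12.
  16. A non-transitory computer-readable storage medium, characterized in that, when instructions in the storage medium are executed by a processor, the processor is enabled to perform the three-dimensional human body model reconstruction method according to any one of claims 1-12.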
PCT/CN2022/070258 2021-01-21 2022-01-05 Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium WO2022156533A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110083740.6 2021-01-21
CN202110083740.6A CN112819944B (zh) 2021-01-21 Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022156533A1 (zh) 2022-07-28

Family

ID=75858618

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/070258 WO2022156533A1 (zh) 2021-01-21 2022-01-05 Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN112819944B (zh)
WO (1) WO2022156533A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
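CN116437063A (zh) * 2023-06-15 2023-07-14 广州科伊斯数字技术有限公司 Three-dimensional image display system and method
CN117077723A (zh) * 2023-08-15 2023-11-17 支付宝(杭州)信息技术有限公司 Digital human motion production method and apparatus
CN117830564A (zh) * 2024-03-05 2024-04-05 之江实验室 Pose-distribution-guided three-dimensional virtual human model reconstruction method
CN117876610A (zh) * 2024-03-12 2024-04-12 之江实验室 Model training method and apparatus for a three-dimensional construction model, and storage medium
CN117893697A (zh) * 2024-03-15 2024-04-16 之江实验室 Three-dimensional human body video reconstruction method and apparatus, storage medium, and electronic device
CN117978937A (zh) * 2024-03-28 2024-05-03 之江实验室 Video generation method and apparatus, storage medium, and electronic device
CN118015161A (zh) * 2024-04-08 2024-05-10 之江实验室 Rehabilitation video generation method and apparatus
WO2024103890A1 (zh) * 2022-11-18 2024-05-23 苏州元脑智能科技有限公司 Model construction method, reconstruction method, apparatus, electronic device, and non-volatile readable storage medium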

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
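CN112884881B (zh) * 2021-01-21 2022-09-27 魔珐(上海)信息科技有限公司 Three-dimensional face model reconstruction method and apparatus, electronic device, and storage medium
CN112819944B (zh) * 2021-01-21 2022-09-27 魔珐(上海)信息科技有限公司 Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium
CN113298858A (zh) * 2021-05-21 2021-08-24 广州虎牙科技有限公司 Motion generation method and apparatus for a virtual avatar, terminal, and storage medium
CN115457104B (zh) * 2022-10-28 2023-01-24 北京百度网讯科技有限公司 Method and apparatus for determining human body information, and electronic device
CN115953548A (zh) * 2022-12-15 2023-04-11 阿里巴巴(中国)有限公司 Method and apparatus for creating a human body shape model, and electronic device
CN117893696B (zh) * 2024-03-15 2024-05-28 之江实验室 Three-dimensional human body data generation method and apparatus, storage medium, and electronic device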

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180053309A1 (en) * 2016-08-22 2018-02-22 Seiko Epson Corporation Spatial Alignment of M-Tracer and 3-D Human Model For Golf Swing Analysis Using Skeleton
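CN108629801A (zh) * 2018-05-14 2018-10-09 华南理工大学 Method for reconstructing the pose and shape of a three-dimensional human body model from a video sequence
CN110298916A (zh) * 2019-06-21 2019-10-01 湖南大学 Three-dimensional human body reconstruction method based on synthetic depth data
CN110415336A (zh) * 2019-07-12 2019-11-05 清华大学 High-precision human body posture reconstruction method and system
CN112819944A (zh) * 2021-01-21 2021-05-18 魔珐(上海)信息科技有限公司 Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium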

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
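CN109285215B (zh) * 2018-08-28 2021-01-08 腾讯科技(深圳)有限公司 Human body three-dimensional model reconstruction method and apparatus, and storage medium
CN109712234B (zh) * 2018-12-29 2023-04-07 北京卡路里信息技术有限公司 Method, apparatus, and device for generating a three-dimensional human body model, and storage medium
CN109859296B (zh) * 2019-02-01 2022-11-29 腾讯科技(深圳)有限公司 Training method for an SMPL parameter prediction model, server, and storage medium
CN110335343B (zh) * 2019-06-13 2021-04-06 清华大学 Method and apparatus for three-dimensional human body reconstruction based on a single-view RGBD image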
US10813715B1 (en) * 2019-10-16 2020-10-27 Nettelo Incorporated Single image mobile device human body scanning and 3D model creation and analysis
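CN111127631B (zh) * 2019-12-17 2023-07-28 深圳先进技术研究院 Single-image-based three-dimensional shape and texture reconstruction method and system, and storage medium
CN111784818B (zh) * 2020-06-01 2024-04-16 北京沃东天骏信息技术有限公司 Method and apparatus for generating a three-dimensional human body model, and computer-readable storage medium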

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Computer vision - ECCV 2016 : 14th European conference, Amsterdam, The Netherlands, October 11-14, 2016 : proceedings", vol. 9909 Chap.34, 16 September 2016, SPRINGER , Berlin, Heidelberg , ISBN: 978-3-319-46453-4, article BOGO FEDERICA; KANAZAWA ANGJOO; LASSNER CHRISTOPH; GEHLER PETER; ROMERO JAVIER; BLACK MICHAEL J.: "Keep It SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image", pages: 561 - 578, XP047355099, DOI: 10.1007/978-3-319-46454-1_34 *
ZENG ZHICHAO: "Video-based 3D Human Model Pose and Shape Reconstruction", MASTER THESIS, TIANJIN POLYTECHNIC UNIVERSITY, CN, 15 January 2019 (2019-01-15), CN , XP055953537, ISSN: 1674-0246 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
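WO2024103890A1 (zh) * 2022-11-18 2024-05-23 苏州元脑智能科技有限公司 Model construction method, reconstruction method, apparatus, electronic device, and non-volatile readable storage medium
CN116437063A (zh) * 2023-06-15 2023-07-14 广州科伊斯数字技术有限公司 Three-dimensional image display system and method
CN117077723A (zh) * 2023-08-15 2023-11-17 支付宝(杭州)信息技术有限公司 Digital human motion production method and apparatus
CN117830564A (zh) * 2024-03-05 2024-04-05 之江实验室 Pose-distribution-guided three-dimensional virtual human model reconstruction method
CN117830564B (zh) * 2024-03-05 2024-06-11 之江实验室 Pose-distribution-guided three-dimensional virtual human model reconstruction method
CN117876610A (zh) * 2024-03-12 2024-04-12 之江实验室 Model training method and apparatus for a three-dimensional construction model, and storage medium
CN117876610B (zh) * 2024-03-12 2024-05-24 之江实验室 Model training method and apparatus for a three-dimensional construction model, and storage medium
CN117893697A (zh) * 2024-03-15 2024-04-16 之江实验室 Three-dimensional human body video reconstruction method and apparatus, storage medium, and electronic device
CN117893697B (zh) * 2024-03-15 2024-05-31 之江实验室 Three-dimensional human body video reconstruction method and apparatus, storage medium, and electronic device
CN117978937A (zh) * 2024-03-28 2024-05-03 之江实验室 Video generation method and apparatus, storage medium, and electronic device
CN118015161A (zh) * 2024-04-08 2024-05-10 之江实验室 Rehabilitation video generation method and apparatus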

Also Published As

Publication number Publication date
CN112819944A (zh) 2021-05-18
CN112819944B (zh) 2022-09-27

Similar Documents

Publication Publication Date Title
WO2022156533A1 (zh) Three-dimensional human body model reconstruction method and apparatus, electronic device, and storage medium
WO2022156532A1 (zh) Three-dimensional face model reconstruction method and apparatus, electronic device, and storage medium
Kartynnik et al. Real-time facial surface geometry from monocular video on mobile GPUs
Baruch et al. Arkitscenes: A diverse real-world dataset for 3d indoor scene understanding using mobile rgb-d data
US11238606B2 (en) Method and system for performing simultaneous localization and mapping using convolutional image transformation
CN112189335B (zh) CMOS-assisted inside-out dynamic vision sensor tracking for low-power mobile platforms
US20240046557A1 (en) Method, device, and non-transitory computer-readable storage medium for reconstructing a three-dimensional model
WO2014117446A1 (zh) Real-time facial animation method based on a single video camera
JP2016537901A (ja) Light field processing method
WO2023015409A1 (zh) Object pose detection method and apparatus, computer device, and storage medium
Wang et al. Instance shadow detection with a single-stage detector
Chen et al. Efficient human pose estimation via 3d event point cloud
CN114766042A (zh) Target detection method and apparatus, terminal device, and medium
CN108229281B (zh) Neural network generation method, face detection method and apparatus, and electronic device
Baudron et al. E3d: event-based 3d shape reconstruction
Shan et al. Discrete spherical image representation for cnn-based inclination estimation
Tian et al. Occlusion and collision aware smartphone AR using time-of-flight camera
CN116245961B (zh) Fusion perception method and system based on multiple types of sensor information
Jiang et al. Evhandpose: Event-based 3d hand pose estimation with sparse supervision
US10783704B2 (en) Dense reconstruction for narrow baseline motion observations
Khan et al. Skeleton based human action recognition using a structured-tree neural network
Zhang et al. 3D Gesture Estimation from RGB Images Based on DB-InterNet
TWI823491B (zh) Optimization method and apparatus for a depth estimation model, electronic device, and storage medium
WO2017173977A1 (zh) Mobile terminal target tracking method and apparatus, and mobile terminal
Qian et al. Multi-Scale tiny region gesture recognition towards 3D object manipulation in industrial design

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22742012

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22742012

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11.12.2023)