US20240153213A1 - Data acquisition and reconstruction method and system for human body three-dimensional modeling based on single mobile phone - Google Patents


Info

Publication number
US20240153213A1
Authority
US
United States
Prior art keywords
implicit
human body
estimation model
data acquisition
human
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US18/542,825
Inventor
Hujun Bao
Jiaming SUN
Yunsheng LUO
Zhiyuan Yu
Hongcheng ZHAO
Xiaowei Zhou
Current Assignee
Image Derivative Inc
Original Assignee
Image Derivative Inc
Priority date
Filing date
Publication date
Application filed by Image Derivative Inc filed Critical Image Derivative Inc
Assigned to IMAGE DERIVATIVE INC. reassignment IMAGE DERIVATIVE INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ZHAO, Hongcheng, LUO, Yunsheng, YU, ZHIYUAN, ZHOU, XIAOWEI, BAO, HUJUN, SUN, JIAMING
Publication of US20240153213A1 publication Critical patent/US20240153213A1/en

Classifications

    • G: PHYSICS; G06: COMPUTING; CALCULATING OR COUNTING; G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/20: Finite element generation, e.g. wire-frame surface description, tesselation
    • G06T 13/40: 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G06T 19/006: Mixed reality
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 7/90: Determination of colour characteristics
    • G06T 2200/08: Indexing scheme involving all processing steps from image acquisition to 3D model generation
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/10024: Color image
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/30196: Human being; Person
    • G06T 2207/30201: Face

Definitions

  • the present application belongs to the field of computer vision, in particular to a data acquisition and reconstruction method and system for human body three-dimensional modeling based on a single mobile phone.
  • Human body reconstruction is the basis of interactive immersive applications such as virtual reality and augmented reality content creation, film and television creation and virtual try-on.
  • High-quality human body reconstruction is the premise of many application scenarios related to digital people in virtual reality and augmented reality.
  • one solution scans the human body with a professional multi-camera acquisition system to acquire data, which is very expensive and occupies a large area, limiting the large-scale use and commercialization of high-precision human body reconstruction.
  • Another method uses a single portable device, such as a smart phone, instead of a professional device for image acquisition, and uses a multi-view stereo reconstruction method for human body reconstruction.
  • This kind of method has a weak ability to deal with the texture-less parts of a human body, and cannot model the tiny movements of the human body in the process of acquisition, which easily leads to low integrity of reconstruction results and cannot meet the requirements of high-precision human body reconstruction.
  • the object of the present application is to provide a method and a system for data acquisition and reconstruction of human body three-dimensional modeling based on a single mobile phone, so as to solve the problems existing in the traditional static human body model reconstruction solution.
  • a first aspect of the present application provides a data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone, including the following steps:
  • S1 data acquisition based on augmented reality technology, including:
  • step S1.1, the subject stands in the center of the scene, keeps a posture that spreads the human body surface and is conducive to reconstruction, and the user captures a 360° view around the subject in one circle with the mobile phone.
  • step S1.1 specifically includes the following steps.
  • the step of determining whether the observation of a single face at a current perspective is effective or not includes the following steps.
  • if the distance is less than a set distance threshold, the face is considered to meet a distance standard for effective observation at the current perspective.
  • step S1.2, if a face meets both the distance standard and the line-of-sight angle standard for effective observation at a certain perspective, the effective observation count of the face is increased by one; if the effective observation count of the face reaches a set number threshold, it is considered that the face has enough observations, the color mark of the face is changed, and the user is informed that acquisition at the position of this face has been completed; the camera is then moved to acquire data in areas that have not been observed enough; when all faces on the human parametric template mesh have changed color, the data acquisition process is completed.
  • the step S2.1 specifically includes the following steps: performing sparse reconstruction for the image sequences by a structure from motion method, wherein an input of the structure from motion method is a series of image frames captured by the mobile phone around the human body, and an output is the camera pose and camera intrinsics corresponding to these images and sparse point clouds reconstructed according to these images.
  • step S2.2 specifically includes the following steps:
  • an input of the implicit spatial deformation field estimation model is a coordinate of a three-dimensional point in the observation frame coordinate system, and an output is a coordinate of the three-dimensional point in a canonical coordinate system
  • an input of the implicit signed distance field estimation model is a coordinate of the three-dimensional point in the canonical space, and the output is signed distance and geometric characteristics of the three-dimensional point
  • an input of the implicit color estimation model is the geometric feature of the three-dimensional point output by the implicit signed distance field estimation model and a vector representing a view direction, and an output is a color of each sampling point along a specific line of sight estimated by the model
  • a density of sampling points is calculated according to a signed distance of each sampling point, and a rendering result is obtained by volume rendering technology according to the density and color of sampling points.
  • the deformation code of each observation frame, the implicit spatial deformation field estimation model, the implicit signed distance field estimation model and the implicit color estimation model are updated by back propagation according to a loss function of image reconstruction and a regularization loss function of the signed distance field.
  • a data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone includes a data acquisition module and a reconstruction module.
  • the data acquisition module is used to virtually place a human parametric template mesh in an acquisition scene by augmented reality technology so that a user acquires video data following a visual guidance on the human parametric template mesh, and to extract image frames from the video data and send them to the reconstruction module.
  • the reconstruction module is used to estimate a camera pose and camera intrinsics corresponding to all image frames, use a deformable implicit neural radiance field to model a human body in three dimensions, and optimize an implicit spatial deformation field estimation model, an implicit signed distance field estimation model and an implicit color estimation model by volume rendering to obtain a three-dimensional human body model.
  • the present application uses only a single smart phone, and guides users by augmented reality technology to acquire high-quality video data as input for the reconstruction algorithm, so as to ensure that the subsequent human body reconstruction algorithm can stably obtain a high-quality three-dimensional human body model.
  • the present application designs a deformable implicit neural radiance field; the use of an implicit spatial deformation field estimation model solves the problem that the subject makes small motions in the process of data acquisition with a single mobile phone; the implicit signed distance field is used to represent human geometry, which has rich expressive ability and improves the accuracy of three-dimensional human model reconstruction.
  • the present application realizes reliable data acquisition and reconstruction for human body high-quality three-dimensional modeling based on a single mobile phone.
  • FIG. 1 is a flowchart of a data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone according to an embodiment of the present application;
  • FIGS. 2(a), 2(b), 2(c) and 2(d) show a flow chart and the effect of a data acquisition part according to an embodiment of the present application;
  • FIGS. 3(a) and 3(b) show the effect of a still human body reconstruction result according to an embodiment of the present application.
  • FIG. 4 is a structural diagram of a data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone according to an embodiment of the present application.
  • the present application provides a high-quality three-dimensional human body model reconstruction method based on a deformable implicit neural radiance field, optimizes the data acquisition process for the specific task of human body reconstruction, provides a data acquisition method for high-quality three-dimensional human body modeling by augmented reality technology, and designs data acquisition applications to guide users to efficiently acquire high-quality data for human body reconstruction.
  • An embodiment of the present application provides a data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone, which mainly includes two parts: data acquisition based on augmented reality technology and high-quality three-dimensional human body model reconstruction based on a deformable implicit neural radiance field.
  • the method flow is shown in FIG. 1 , and the specific implementation steps are as follows:
  • the subject stands in the center of the scene during the video acquisition process and keeps a posture that is conducive to reconstruction, such as an A-shaped posture.
  • the user opens the data acquisition application in this embodiment on a mobile phone and captures a 360-degree view around the subject in one circle.
  • the data acquisition application runs the real-time localization and mapping algorithm and the human keypoint detection algorithm in the background to obtain the camera pose and the human keypoints in the captured images in real time.
  • this embodiment automatically fits a human parametric template mesh according to the body shape and posture of the subject, and uses augmented reality technology to render the human parametric template mesh in the scene where the subject is standing, so as to approximately achieve the visual effect that the human parametric template mesh and the subject are overlapped.
  • the human parametric template mesh can adopt any existing human parametric template mesh model, and the method of fitting the human parametric template mesh can adopt any existing method of fitting the human parametric template mesh from continuous image frames.
  • the human parametric template mesh obtained based on fitting in S1.1 is used to guide the user's data acquisition process, with the purpose of ensuring that every face on the human parametric template mesh is observed enough.
  • when each face on the human parametric template mesh is sufficiently observed, it means that the subject is sufficiently observed by the acquired data.
  • the validity of observation is measured by the distance between the optical center of the camera and the face and the angle between the camera line of sight and the normal vector of the face.
  • the distance between the optical center of the camera and the center point of this face can be calculated.
  • if this distance is less than a set distance threshold (which is set to one meter in this embodiment), it is considered that this face meets the distance standard for effective observation at the current perspective.
  • the connecting line between the optical center of the camera and the center point of the face can be calculated. If the angle between the connecting line and the normal vector of the face is less than a set line-of-sight angle threshold (which is set to 60 degrees in this embodiment), it is considered that the face meets the line-of-sight angle standard for effective observation at the current perspective.
  • the effective observation count of the face is increased by one; if the effective observation count of the face reaches a set number threshold (which is set to 5 in this embodiment), it is considered that there are enough observations of the face, and the color mark of the face is changed.
  • the color of the face is changed from white to green, indicating to the user that acquisition at the location of the face has been completed, and the camera can then be moved to acquire data of areas that have not been sufficiently observed.
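The effective-observation bookkeeping described above can be sketched as follows. This is an illustrative Python sketch, not code from the patent: the function names are invented, and the thresholds simply restate the values of this embodiment (one meter, 60 degrees, 5 observations).

```python
import math

# Thresholds from this embodiment; names are illustrative assumptions.
DIST_THRESHOLD_M = 1.0
ANGLE_THRESHOLD_DEG = 60.0
COUNT_THRESHOLD = 5

def is_effective_observation(cam_center, face_center, face_normal):
    """Return True if the face is effectively observed from this camera pose."""
    # Vector from the face center to the camera optical center.
    view = [c - f for c, f in zip(cam_center, face_center)]
    dist = math.sqrt(sum(v * v for v in view))
    if dist >= DIST_THRESHOLD_M:
        return False  # fails the distance standard
    # Angle between the viewing line and the face normal.
    dot = sum(v * n for v, n in zip(view, face_normal))
    n_norm = math.sqrt(sum(n * n for n in face_normal))
    cos_angle = dot / (dist * n_norm)
    angle_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return angle_deg < ANGLE_THRESHOLD_DEG

def update_face(counts, colors, face_id, effective):
    """On an effective observation, increase the face's count and flip its
    color mark from white to green once the count threshold is reached."""
    if effective:
        counts[face_id] += 1
        if counts[face_id] >= COUNT_THRESHOLD:
            colors[face_id] = "green"
```

Acquisition is complete when every face's color mark has been flipped, mirroring the termination condition of step S1.2.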
  • the video captured in S1.2 is extracted into a series of image sequences captured around the human body, and the camera pose, camera intrinsics and sparse point clouds corresponding to the acquired images are estimated according to the matching relationship of feature points among the images.
  • This step can be based on any existing structure from motion method.
  • This step can take the real-time localization result of the camera obtained in S1 as a prior, and further optimize it based on a structure from motion method.
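For reference, the quantities estimated in this step (camera intrinsics and pose) relate 3D points to image pixels through the standard pinhole camera model. The sketch below is a generic illustration of that relationship with made-up values; it is not code from the patent.

```python
# Pinhole projection: p = K (R X + t), followed by perspective division.
# K holds the camera intrinsics; (R, t) is the camera pose estimated by
# structure from motion. All numeric values below are invented.
def project(point_w, K, R, t):
    """Project a 3D world point into pixel coordinates."""
    # World -> camera coordinates.
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    # Perspective division, then apply focal lengths and principal point.
    u = K[0][0] * xc[0] / xc[2] + K[0][2]
    v = K[1][1] * xc[1] / xc[2] + K[1][2]
    return u, v

# Identity pose: a point one meter ahead on the optical axis lands at the
# principal point (320, 240).
K = [[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]]
R_id = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t0 = [0.0, 0.0, 0.0]
```

Sparse reconstruction finds the (K, R, t) per frame and the sparse point cloud that make these projections agree with the matched feature points.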
  • the human body is modeled with high precision by using a deformable implicit neural radiance field, which includes an implicit spatial deformation field estimation model R, an implicit signed distance field estimation model S_c, and an implicit color estimation model C_c.
  • the input of the implicit spatial deformation field estimation model R is the coordinate of a three-dimensional point in the observation frame coordinate system, and the output is the coordinate of the three-dimensional point in the canonical coordinate system.
  • the input of the implicit signed distance field estimation model S_c is the coordinate of the three-dimensional point in the canonical space, and the output is the signed distance and geometric characteristics of the three-dimensional point, where the signed distance represents the distance from the three-dimensional point to the human surface.
  • the input of the implicit color estimation model C_c is the geometric characteristics of the three-dimensional point output by S_c and a vector representing the line of sight, and the output is the color of each sampling point along a specific line of sight estimated by the model.
  • the implicit spatial deformation field estimation model R, the implicit signed distance field estimation model S_c and the implicit color estimation model C_c can all adopt a common residual neural network model. After the signed distance and color of a three-dimensional point are obtained from the above S_c and C_c, the pixel values of two-dimensional images can be rendered by volume rendering technology. The details are given below:
  • the implicit spatial deformation field estimation model R is used to obtain the coordinate of the three-dimensional point corresponding to the coordinate in the observation frame coordinate system in the canonical coordinate system.
  • $x_c = R(x,\, d_i)$
  • x represents the three-dimensional point in the observation frame coordinate system
  • x c represents the three-dimensional point in the canonical space
  • d_i represents the deformation code specific to the observation frame, for processing the different human motions in each frame; this deformation code can be optimized during the back propagation of the neural network.
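A toy sketch of the deformation field R follows: a small residual network maps an observation-frame point plus that frame's deformation code to a canonical-space point. The layer sizes, the code dimension, and all names are assumptions made only for illustration; the patent's residual network is not specified at this level of detail.

```python
import numpy as np

# Illustrative stand-in for the implicit spatial deformation field R.
# One learnable deformation code d_i per observation frame.
rng = np.random.default_rng(0)
CODE_DIM = 8  # assumed code size
W1 = rng.normal(0.0, 0.1, (64, 3 + CODE_DIM)); b1 = np.zeros(64)
W2 = rng.normal(0.0, 0.1, (3, 64)); b2 = np.zeros(3)
deform_codes = {0: rng.normal(0.0, 0.01, CODE_DIM)}  # frame_id -> d_i

def deformation_field(x, frame_id):
    """x_c = R(x, d_i): map an observation-frame point to canonical space."""
    h = np.concatenate([x, deform_codes[frame_id]])
    h = np.maximum(W1 @ h + b1, 0.0)   # ReLU hidden layer
    return x + W2 @ h + b2             # residual: canonical = x + offset
```

In training, both the weights and each frame's `deform_codes` entry would be updated by back propagation, which is how per-frame small motions are absorbed.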
  • the implicit signed distance field estimation model S_c learns the signed distance d_sdf and geometric characteristic z_geo of the three-dimensional point.
  • $(d_{sdf},\, z_{geo}) = S_c(x_c)$
  • the corresponding density σ(t) can be calculated according to the signed distance d_sdf of each three-dimensional point.
  • $\sigma(t)=\max\left(\dfrac{-\frac{\mathrm{d}\Phi_s}{\mathrm{d}t}\left(S_c(x_c(t))\right)}{\Phi_s\left(S_c(x_c(t))\right)},\ 0\right)$
  • x c (t) represents the coordinate of the three-dimensional point when the sampling step along the line-of-sight direction is t
  • S c (x c (t)) is a signed distance value of the three-dimensional point x c (t)
  • Φ_s(·) is a Sigmoid function.
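The density formula above can be evaluated numerically as in the sketch below, taking Φ_s as a scaled sigmoid and approximating the derivative along the ray by finite differences over the samples. The scale parameter `s`, the sample spacing, and all names are illustrative assumptions.

```python
import math

def phi_s(d, s=10.0):
    """Scaled sigmoid Phi_s applied to a signed distance value."""
    return 1.0 / (1.0 + math.exp(-s * d))

def densities(sdf_along_ray, dt, s=10.0):
    """sigma_i = max(-(Phi(d_{i+1}) - Phi(d_i)) / dt / Phi(d_i), 0).

    sdf_along_ray holds S_c(x_c(t)) at consecutive samples spaced dt apart;
    the finite difference stands in for the time derivative d(Phi_s)/dt.
    """
    out = []
    for d0, d1 in zip(sdf_along_ray, sdf_along_ray[1:]):
        deriv = (phi_s(d1, s) - phi_s(d0, s)) / dt
        out.append(max(-deriv / phi_s(d0, s), 0.0))
    return out
```

A ray entering the surface (signed distance decreasing through zero) gets positive density near the crossing; a ray leaving the surface gets zero density, matching the clamp at 0 in the formula.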
  • this method inputs the line-of-sight direction and the geometric characteristic z_geo output by the implicit signed distance field estimation model S_c into the implicit color estimation model C_c, and outputs the color of each sampling point along the line-of-sight direction v.
  • $\mathrm{RGB} = C_c\left(x_c(t),\, v,\, n,\, z_{geo}\right)$
  • v represents the line-of-sight direction calculated from the camera pose
  • z geo represents the geometric characteristics output by the implicit signed distance field estimation model at x c (t)
  • n represents the normal vector direction at x_c(t), which can be obtained by differentiating the estimated signed distance field
  • RGB represents the three-channel color of the three-dimensional point.
  • a rendering result C(w,h) can be obtained by the following integration method:
  • $C(w,h)=\int_{t_n}^{t_f} T(t)\,\sigma(t)\,C_c\left(v,\,x_c(t)\right)\,\mathrm{d}t$
  • C(w,h) represents the rendered color value at the two-dimensional image (w,h)
  • t f and t n respectively represent the farthest and nearest sampling steps along the line-of-sight direction
  • C c (v,x c (t)) is the color value of x c (t) in the line-of-sight direction v
  • T(t) represents the transmittance at x_c(t), which is obtained from the integration of σ(t): $T(t)=\exp\left(-\int_{t_n}^{t}\sigma(s)\,\mathrm{d}s\right)$
  • W and H represent the width and height of the input image respectively
  • (w,h) represents the pixel coordinate of the image.
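The rendering integral above can be discretized per ray as in the sketch below; this alpha-compositing quadrature is the standard one for volume rendering and is given only as an illustration, with invented names.

```python
import math

def render_ray(sigmas, colors, dt):
    """Accumulate per-sample colors weighted by density and transmittance.

    sigmas and colors are the per-sample sigma(t) and C_c values along one
    line of sight, sampled dt apart; transmittance plays the role of T(t).
    """
    transmittance = 1.0
    rgb = [0.0, 0.0, 0.0]
    for sigma, color in zip(sigmas, colors):
        alpha = 1.0 - math.exp(-sigma * dt)      # opacity of this segment
        weight = transmittance * alpha           # T(t) * sigma(t) dt analogue
        rgb = [r + weight * c for r, c in zip(rgb, color)]
        transmittance *= math.exp(-sigma * dt)   # update T(t)
    return rgb
```

Rendering every pixel (w, h) this way and comparing against the captured image yields the image reconstruction loss used for optimization.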
  • the present application also adds a signed distance field regularization loss function ℒ_reg to constrain the estimated signed distance field to retain the mathematical property that the gradient (normal vector) of the signed distance field has a modulus of 1 at each sampled point:
  • $\mathcal{L}_{reg}=\mathbb{E}_{k,i}\left[\left(\left\|\nabla S_c(\hat{p}_{k,i})\right\|_2-1\right)^2\right]$
  • $\hat{p}_{k,i}$ is the three-dimensional point coordinate of the i-th sampling point of the k-th line of sight traversed
  • $S_c(\hat{p}_{k,i})$ is the signed distance value of the three-dimensional point $\hat{p}_{k,i}$.
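The property this regularization enforces, that a valid signed distance field has a unit-norm gradient, can be checked numerically. The sketch below uses a finite-difference gradient and a unit-sphere SDF as an example; all names and the step size are invented for illustration.

```python
def grad_norm(sdf, p, eps=1e-4):
    """Central finite-difference estimate of ||grad sdf|| at point p."""
    sq = 0.0
    for axis in range(3):
        hi = list(p); hi[axis] += eps
        lo = list(p); lo[axis] -= eps
        sq += ((sdf(hi) - sdf(lo)) / (2 * eps)) ** 2
    return sq ** 0.5

def eikonal_loss(sdf, points):
    """Mean of (||grad sdf(p)|| - 1)^2 over the sampled points."""
    return sum((grad_norm(sdf, p) - 1.0) ** 2 for p in points) / len(points)

# A true SDF: signed distance to a unit sphere centred at the origin.
sphere_sdf = lambda p: (p[0] ** 2 + p[1] ** 2 + p[2] ** 2) ** 0.5 - 1.0
```

A genuine distance field such as `sphere_sdf` gives a loss near zero, while a field whose gradient norm drifts away from 1 is penalized, which is exactly what the regularization term encourages during training.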
  • the loss function value is used to update, by back propagation, the parameters of the neural networks and the deformation code of each observation frame.
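The overall optimization loop implied here, rendering, comparing against the captured pixels, and back-propagating into the models and deformation codes, can be sketched in miniature. The one-parameter toy below only illustrates the loop structure with finite-difference gradients; the real method updates neural network weights with automatic differentiation, and every name here is invented.

```python
def optimise(render, target, param, lr=0.1, steps=200, eps=1e-5):
    """Minimise (render(param) - target)^2 by gradient descent.

    render stands in for the full pipeline (deformation field, signed
    distance field, color model, volume rendering); param stands in for all
    learnable weights and deformation codes.
    """
    for _ in range(steps):
        loss = (render(param) - target) ** 2
        # Forward finite-difference gradient of the loss w.r.t. param.
        grad = ((render(param + eps) - target) ** 2 - loss) / eps
        param -= lr * grad
    return param

# Toy stand-in: "rendering" is an affine map of a single scalar parameter,
# "target" is the observed pixel value.
fitted = optimise(lambda p: 0.5 * p + 0.1, target=0.6, param=0.0)
```

After enough steps the rendered value matches the target, mirroring how the implicit models converge to reproduce the captured images.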
  • the implicit signed distance field of the deformable implicit neural radiance field is post-processed by an isosurface extraction method, and a high-quality explicit three-dimensional human model is obtained.
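Isosurface extraction recovers the zero level set of the signed distance field as an explicit mesh. The one-dimensional sketch below shows only the core interpolation step that methods such as marching cubes perform along every edge of a 3D grid; it is an illustration, not the patent's implementation.

```python
def zero_crossings(values, coords):
    """Return interpolated positions where a sampled field changes sign.

    values holds the signed distance sampled at coords along one axis; each
    sign change is located by linear interpolation, the 1-D analogue of
    placing a mesh vertex on a grid edge.
    """
    out = []
    for (v0, v1), (x0, x1) in zip(zip(values, values[1:]),
                                  zip(coords, coords[1:])):
        if v0 == 0.0:
            out.append(x0)               # sample lies exactly on the surface
        elif v0 * v1 < 0.0:
            t = v0 / (v0 - v1)           # linear interpolation weight
            out.append(x0 + t * (x1 - x0))
    return out
```

Running this logic over all three axes of a dense grid of signed distance samples yields the vertices of the explicit three-dimensional human model.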
  • FIGS. 3 ( a ) and 3 ( b ) are effect diagrams of the reconstruction result of a still human body according to an embodiment of the present application.
  • the present application also provides an embodiment of a data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone, corresponding to the embodiment of a data acquisition and reconstruction method for human body three-dimensional modeling.
  • the data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone includes a data acquisition module and a reconstruction module.
  • the data acquisition module is used to virtually place a human parametric template mesh in an acquisition scene by augmented reality technology so that a user acquires video data following a visual guidance on the human parametric template mesh, and to extract image frames from the video data and send them to the reconstruction module; refer to the above step S1 for the implementation of this module.
  • the reconstruction module is used to estimate a camera pose and camera intrinsics corresponding to all image frames, use a deformable implicit neural radiance field to model a human body in three dimensions, and optimize an implicit spatial deformation field estimation model, an implicit signed distance field estimation model and an implicit color estimation model by volume rendering to obtain a three-dimensional human body model; refer to the above step S2 for the implementation of this module.
  • a computer device which includes a memory and a processor; computer-readable instructions are stored in the memory, and when the computer-readable instructions are executed by the processor, the processor is caused to execute the steps in the data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone in the above embodiment.
  • a storage medium storing computer-readable instructions, and when the computer-readable instructions are executed by one or more processors, the one or more processors execute the steps in the data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone in the above embodiment.
  • the storage medium can be a non-volatile storage medium.
  • the method may be implemented by a program, which can be stored in a computer-readable storage medium; the storage medium can include a read-only memory, a random access memory, a magnetic disk or an optical disk, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A data acquisition and reconstruction method and a data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone. In the aspect of data acquisition, the present application uses only a single smart phone, and uses augmented reality technology to guide users to collect high-quality video data as input for a reconstruction algorithm, so as to ensure that the subsequent human body reconstruction algorithm can stably obtain a high-quality three-dimensional human body model. In the aspect of the reconstruction algorithm, the present application designs a deformable implicit neural radiance field. The use of an implicit spatial deformation field estimation model solves the problem that the subject makes small motions in the process of collecting data with a single mobile phone; the implicit signed distance field is used to represent human geometry, which has rich expressive ability and improves the accuracy of three-dimensional human model reconstruction.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • The present application is a National Phase of International Application No. PCT/CN2022/125581, filed on Oct. 17, 2022, which claims priority to Chinese Application No. 202210788579.7, filed on Jul. 6, 2022, the contents of both of which are incorporated herein by reference in their entireties.
  • TECHNICAL FIELD
  • The present application belongs to the field of computer vision, in particular to a data acquisition and reconstruction method and system for human body three-dimensional modeling based on a single mobile phone.
  • BACKGROUND
  • Human body reconstruction is the basis of interactive immersive applications such as virtual reality and augmented reality content creation, film and television creation and virtual try-on. High-quality human body reconstruction is the premise of many application scenarios related to digital people in virtual reality and augmented reality. At present, there are two kinds of human body acquisition and reconstruction solutions. One scans the human body with a professional multi-camera acquisition system to acquire data, which is very expensive and occupies a large area, limiting the large-scale use and commercialization of high-precision human body reconstruction. The other uses a single portable device, such as a smart phone, instead of a professional device for image acquisition, and uses a multi-view stereo reconstruction method for human body reconstruction. This kind of method has a weak ability to deal with the texture-less parts of a human body, and cannot model the tiny movements of the human body in the process of acquisition, which easily leads to low integrity of reconstruction results and cannot meet the requirements of high-precision human body reconstruction.
  • SUMMARY
  • The object of the present application is to provide a method and a system for data acquisition and reconstruction of human body three-dimensional modeling based on a single mobile phone, so as to solve the problems existing in the traditional static human body model reconstruction solution.
  • The purpose of the present application is achieved through the following technical solution:
  • A first aspect of the present application provides a data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone, including the following steps:
  • S1, data acquisition based on augmented reality technology, including:
  • S1.1, a subject standing in a scene, a user capturing a 360° view around the subject with a mobile phone, fitting a human parametric template mesh according to a body shape and a posture of the subject from multiple view angles, and rendering the human parametric template mesh at a scene position where the subject stands by augmented reality technology, so as to approximately achieve a visual effect that the human parametric template mesh and the subject are overlapped;
  • S1.2, guiding the user for a data acquisition process by using the fitted human parametric template mesh, determining whether the observation of a single face on the human parametric template mesh at a current perspective is effective or not in the data acquisition process, changing a color mark of the face when the single face is effectively observed by a sufficient number of perspectives, indicating to the user that the acquisition at the position of the face has been completed, and moving the perspective of the user to the part of the human parametric template mesh that has not been sufficiently observed;
  • S2, reconstruction of a three-dimensional human model based on a deformable implicit neural radiance field, comprising:
  • S2.1, extracting a video acquired in S1.2 into a series of image sequences captured around a human body, and estimating a camera pose and camera intrinsics corresponding to acquired images according to a matching relationship of feature points among the images;
  • S2.2, using the deformable implicit neural radiance field to model the human body in three dimensions, and optimizing an implicit spatial deformation field estimation model, an implicit signed distance field estimation model and an implicit color estimation model by means of volume rendering to obtain a three-dimensional human body model.
  • Further, in step S1.1, the subject stands in the center of the scene, keeps a posture that spreads the human body surface and is conducive to reconstruction, and the user captures a 360° view around the subject in one circle with the mobile phone.
  • Further, the step S1.1 specifically includes the following steps.
  • Running a localization and mapping algorithm to obtain the camera pose in real time during data acquisition.
  • Running a human keypoint detection algorithm to obtain human keypoint positions on the captured images in real time.
  • Fitting the human parametric template mesh according to the camera pose and the human body keypoint positions, so as to achieve the visual effect that the human parametric template mesh and the subject are visually overlapped, and completing data acquisition according to the guidance of the human parametric template mesh.
  • Further, in the step S1.2, the step of determining whether the observation of a single face at a current perspective is effective or not includes the following steps.
  • Calculating a distance between an optical center of the camera and a center point of the face based on a real-time localization result of the camera.
  • If the distance is less than a set distance threshold, considering that the face meets a distance standard for effective observation at the current perspective.
  • Calculating a connecting line between the optical center of the camera and the center point of the face based on the real-time localization result of the camera, wherein if an angle between the connecting line and a normal vector of the face is less than a predefined angle threshold, it is considered that the face meets the angle standard for effective observation at the current perspective.
  • Further, in the step S1.2, if a face meets both the distance standard and the line-of-sight angle standard for effective observation at a certain perspective, an effective observation count of the face is increased by one; if the effective observation count of the face reaches a set number threshold, it is considered that the face has a sufficient number of observations, the color mark of the face is changed, and the user is informed that acquisition at the position of this face has been completed; the camera is then moved to acquire data in areas that have not been sufficiently observed; when all faces on the human parametric template mesh have changed color, the data acquisition process is completed.
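The per-face validity check and observation counting described above can be sketched as follows. This is an illustrative sketch, not the application's actual implementation: the threshold values follow the embodiment described later (1 m, 60°, 5 observations), and the dictionaries used for counts and color marks are assumptions.

```python
import numpy as np

def is_effective_observation(cam_center, face_center, face_normal,
                             max_dist=1.0, max_angle_deg=60.0):
    """True when the face meets both the distance standard and the
    line-of-sight angle standard at the current camera pose."""
    view_vec = face_center - cam_center
    dist = np.linalg.norm(view_vec)
    if dist >= max_dist:
        return False
    # Angle between the camera-to-face connecting line and the face normal.
    cos_a = np.dot(-view_vec / dist, face_normal / np.linalg.norm(face_normal))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))) < max_angle_deg

def record_observation(counts, colors, face_id, effective, threshold=5):
    """Count an effective observation; flip the color mark (white -> green)
    once the face has been observed enough times."""
    if effective:
        counts[face_id] += 1
        if counts[face_id] >= threshold:
            colors[face_id] = "green"
```

When every face's color mark has flipped, acquisition is complete.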
  • Further, the step S2.1 specifically includes the following steps: performing sparse reconstruction for the image sequences by a structure from motion method, wherein an input of the structure from motion method is a series of image frames captured by the mobile phone around the human body, and an output is the camera pose and camera intrinsics corresponding to these images and sparse point clouds reconstructed according to these images.
  • Further, the step S2.2 specifically includes the following steps:
  • Establishing the implicit signed distance field estimation model for expressing a canonical shape in a canonical space using a neural network;
  • Establishing the implicit color estimation model for observing colors of three-dimensional points from a specific direction in the canonical space by the neural network.
  • Establishing the implicit spatial deformation field estimation model from an observation frame coordinate system corresponding to each image frame to the canonical space by the neural network.
  • Optimizing the implicit spatial deformation field estimation model, the implicit signed distance field estimation model and the implicit color estimation model based on the camera pose and camera intrinsics corresponding to the images obtained in S2.1 by volume rendering on an input image set to obtain an implicit three-dimensional human body model.
  • Post-processing an implicit signed distance field of the deformable implicit neural radiance field by an isosurface extraction method to obtain an explicit three-dimensional human model.
  • Further, in the step S2.2, an input of the implicit spatial deformation field estimation model is a coordinate of a three-dimensional point in the observation frame coordinate system, and an output is a coordinate of the three-dimensional point in a canonical coordinate system; an input of the implicit signed distance field estimation model is a coordinate of the three-dimensional point in the canonical space, and an output is a signed distance and geometric characteristics of the three-dimensional point; an input of the implicit color estimation model is the geometric characteristics of the three-dimensional point output by the implicit signed distance field estimation model and a vector representing a view direction, and an output is a color of each sampling point along a specific line of sight estimated by the model; and a density of each sampling point is calculated according to its signed distance, and a rendering result is obtained by volume rendering technology according to the densities and colors of the sampling points.
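These input/output interfaces can be illustrated with toy stand-in functions; the deformation, signed-distance, and shading rules below are made-up placeholders, not the trained neural networks.

```python
import numpy as np

def deform_field(x_obs, d_code):
    """Observation-frame point (+ per-frame deformation code) -> canonical point."""
    return x_obs + 0.01 * d_code[:3]          # toy deformation

def signed_distance(x_can):
    """Canonical point -> (signed distance, geometric feature)."""
    d_sdf = np.linalg.norm(x_can) - 1.0       # toy canonical shape: unit sphere
    z_geo = np.tanh(x_can)                    # toy geometric feature
    return d_sdf, z_geo

def color_model(view_dir, z_geo, normal):
    """View direction + geometric feature + normal -> RGB in [0, 1]."""
    return np.clip(0.5 * (normal + 1.0), 0.0, 1.0)   # toy normal shading

# Chain the three models for one sampling point:
x = np.array([0.0, 0.0, 1.5])                 # observation-frame point
x_c = deform_field(x, np.zeros(8))            # canonical coordinates
d_sdf, z_geo = signed_distance(x_c)
rgb = color_model(np.array([0.0, 0.0, -1.0]), z_geo,
                  x_c / np.linalg.norm(x_c))
```

The real models replace each toy function with a neural network, but the data flow (deform, then signed distance, then color) is the same.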
  • Further, in the step S2.2, the deformation code of each observation frame and the implicit spatial deformation field estimation model, the implicit signed distance field estimation model and the implicit color estimation model are updated by back propagation according to a loss function of image reconstruction and a regularization loss function of the signed distance field.
  • According to a second aspect of the present application, there is provided a data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone, including a data acquisition module and a reconstruction module.
  • The data acquisition module is used to virtually place a human parametric template mesh in an acquisition scene by augmented reality technology so that a user acquires video data following a visual guidance on the human parametric template mesh, and to extract image frames from the video data and send them to the reconstruction module.
  • The reconstruction module is used to estimate a camera pose and camera intrinsics corresponding to all image frames, use a deformable implicit neural radiance field to model a human body in three dimensions, and optimize an implicit spatial deformation field estimation model, an implicit signed distance field estimation model and an implicit color estimation model by volume rendering to obtain a three-dimensional human body model.
  • The present application has the following beneficial effects:
  • In terms of data acquisition, the present application requires only a single smartphone, and uses augmented reality technology to guide users to acquire high-quality video data as input for the reconstruction algorithm, so as to ensure that the subsequent human body reconstruction algorithm can stably obtain a high-quality three-dimensional human body model.
  • In terms of reconstruction algorithms, the present application designs a deformable implicit neural radiance field; the use of an implicit spatial deformation field estimation model solves the problem that the subject has small motion in the process of data acquisition with a single mobile phone; the implicit signed distance field is used to represent human geometry, which has rich expressive ability and improves the accuracy of three-dimensional human model reconstruction.
  • By combining data acquisition and reconstruction algorithm, the present application realizes reliable data acquisition and reconstruction for human body high-quality three-dimensional modeling based on a single mobile phone.
  • BRIEF DESCRIPTION OF DRAWINGS
  • FIG. 1 is a flowchart of a data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone according to an embodiment of the present application;
  • FIGS. 2(a), 2(b), 2(c) and 2(d) show a flow chart and effect of a data acquisition part according to an embodiment of the present application;
  • FIGS. 3(a) and 3(b) show effect of a still human body reconstruction result according to an embodiment of the present application; and
  • FIG. 4 is a structural diagram of a data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone according to an embodiment of the present application.
  • DESCRIPTION OF EMBODIMENTS
  • The technical solution in the embodiment of the present application will be described below clearly and completely with reference to the attached drawings. Obviously, the described embodiments are only part of, not all of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by those skilled in the art without creative labor shall belong to the protection scope of the present application.
  • In the field of three-dimensional mannequin reconstruction, traditional image-based methods either need complex acquisition equipment and environment construction, or are limited by the reconstruction ability of traditional multi-view geometric methods, and cannot reconstruct high-quality three-dimensional mannequins with only a single portable device. The present application provides a high-quality three-dimensional human body model reconstruction method based on a deformable implicit neural radiance field, optimizes the data acquisition process for the specific task of human body reconstruction, provides a data acquisition method for high-quality three-dimensional human body modeling by augmented reality technology, and designs data acquisition applications to guide users to efficiently acquire high-quality data for human body reconstruction.
  • An embodiment of the present application provides a data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone, which mainly includes two parts: data acquisition based on augmented reality technology and high-quality three-dimensional human body model reconstruction based on a deformable implicit neural radiance field. The method flow is shown in FIG. 1 , and the specific implementation steps are as follows:
  • S1, data are acquired based on augmented reality technology, and the process is shown in FIGS. 2(a), 2(b), 2(c) and 2(d).
  • S1.1, the subject stands in the center of the scene during the video acquisition process and keeps a posture that is conducive to reconstruction, such as an A-shaped posture. The user opens the data acquisition application in this embodiment on a mobile phone and captures a 360-degree view of the subject in one circle. During this process, the data acquisition application runs the real-time localization and mapping algorithm and the human keypoint detection algorithm in the background to obtain the camera pose and the human keypoints in the captured images in real time. According to the camera localization results and the human body keypoints, this embodiment automatically fits a human parametric template mesh to the body shape and posture of the subject, and uses augmented reality technology to render the human parametric template mesh in the scene where the subject is standing, so as to approximately achieve the visual effect that the human parametric template mesh and the subject overlap.
  • In an embodiment, the human parametric template mesh can adopt any existing human parametric template mesh model, and the method of fitting the human parametric template mesh can adopt any existing method of fitting the human parametric template mesh from continuous image frames.
  • S1.2, the human parametric template mesh obtained based on fitting in S1.1 is used to guide the user's data acquisition process, with the purpose of ensuring that every face on the human parametric template mesh is observed enough.
  • In the case that the human parametric template mesh and the subject approximately overlap, each face on the human parametric template mesh being sufficiently observed means that the subject is sufficiently observed by the acquired data. The validity of an observation is measured by the distance between the optical center of the camera and the face, and by the angle between the camera line of sight and the normal vector of the face. The specific standards and practical methods are detailed below.
  • For a single face, based on the real-time localization result of the camera, the distance between the optical center of the camera and the center point of this face can be calculated. When this distance is less than a set distance threshold (which is set to one meter in this embodiment), it is considered that this face meets the distance standard for effective observation at the current perspective.
  • For a single face, based on the real-time localization result of the camera, the connecting line between the optical center of the camera and the center point of the face can be calculated. If the angle between the connecting line and the normal vector of the face is less than a set line-of-sight angle threshold (which is set to 60° in this embodiment), it is considered that the face meets the line-of-sight angle standard for effective observation at the current perspective.
  • If a face meets both the distance standard and the line-of-sight angle standard for effective observation at a certain perspective, the effective observation count of the face is increased by one; if the effective observation count of the face reaches a set number threshold (which is set to 5 in this embodiment), it is considered that there are enough observations of the face, and the color mark of the face is changed. In this embodiment, the color of the face is changed from white to green, indicating to the user that acquisition at the location of the face has been completed, and the mobile phone camera can then acquire data in areas that have not been sufficiently observed. When all the faces on the human parametric template mesh have turned green, the data acquisition process is completed, and the video is automatically exported to the subsequent reconstruction process.
  • S2, a high-quality three-dimensional human model reconstruction is carried out based on a deformable implicit neural radiance field.
  • S2.1, the video captured in S1.2 is extracted into a series of image sequences captured around the human body, and the camera pose, camera intrinsics and sparse point clouds corresponding to the acquired images are estimated according to the matching relationship of feature points among the images.
  • This step can be based on any existing structure from motion method, and can take the real-time camera localization results obtained in S1 as a prior to be further optimized by the structure from motion method.
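For orientation, the quantities that structure from motion recovers (camera pose and camera intrinsics) relate 3D points to pixels through the standard pinhole camera model. A minimal sketch, with made-up parameter values rather than values estimated from real data:

```python
import numpy as np

def project(K, R, t, X_world):
    """Project a world-space 3D point to pixel coordinates."""
    X_cam = R @ X_world + t        # world -> camera coordinates (pose)
    uvw = K @ X_cam                # camera coordinates -> homogeneous pixel (intrinsics)
    return uvw[:2] / uvw[2]        # perspective division

K = np.array([[1000.0,    0.0, 640.0],
              [   0.0, 1000.0, 360.0],
              [   0.0,    0.0,   1.0]])   # focal 1000 px, principal point (640, 360)
R_pose = np.eye(3)                        # identity rotation (assumed)
t_pose = np.array([0.0, 0.0, 2.0])        # camera 2 m from the origin (assumed)
pixel = project(K, R_pose, t_pose, np.array([0.0, 0.0, 0.0]))
# A point at the world origin projects to the principal point.
```

Structure from motion estimates K, R and t for every frame by matching feature points across images, which is exactly what the subsequent volume rendering needs to cast rays.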
  • S2.2, the human body is modeled with high precision by using a deformable implicit neural radiance field, which includes an implicit spatial deformation field estimation model R, an implicit signed distance field estimation model Sc, and an implicit color estimation model Cc.
  • In an embodiment, the input of the implicit spatial deformation field estimation model R is the coordinate of a three-dimensional point in the observation frame coordinate system, and the output is the coordinate of the three-dimensional point in the canonical coordinate system. The input of the implicit signed distance field estimation model Sc is the coordinate of the three-dimensional point in the canonical space, and the output is the signed distance, which represents the distance from the three-dimensional point to the human surface, together with the geometric characteristics of the three-dimensional point. The input of the implicit color estimation model Cc is the geometric characteristics of the three-dimensional point output by Sc and a vector representing the line of sight, and the output is the color of each sampling point along a specific line of sight estimated by the model. The implicit spatial deformation field estimation model R, the implicit signed distance field estimation model Sc and the implicit color estimation model Cc can all adopt a common residual neural network model. After the signed distance and color of a three-dimensional point are obtained from Sc and Cc, the pixel values of two-dimensional images can be rendered by volume rendering technology. The details are given below:
  • The volume rendering technology requires sampling Nc three-dimensional points x (Nc=64 in this embodiment) along each line of sight in the observation frame coordinate system. First, the implicit spatial deformation field estimation model R is used to map the coordinate of each three-dimensional point from the observation frame coordinate system to the canonical coordinate system:

  • $R: (x, d_I) \rightarrow x_c$
  • where x represents the three-dimensional point in the observation frame coordinate system, xc represents the three-dimensional point in the canonical space, and dI represents the deformation code specific to the observation frame, used to handle the different human motions in each frame; this deformation code can be optimized during the back propagation of the neural network.
  • The implicit signed distance field estimation model Sc learns the signed distance dsdf and geometrical characteristic zgeo of the three-dimensional point.

  • $S_c: x_c \rightarrow \{d_{sdf},\, z_{geo}\}$
  • The corresponding density ρ(t) can be calculated according to the signed distance dsdf of each three-dimensional point.
  • $\rho(t)=\max\left(\dfrac{-\frac{\mathrm{d}\Phi_s}{\mathrm{d}t}\left(S_c\left(x_c(t)\right)\right)}{\Phi_s\left(S_c\left(x_c(t)\right)\right)},\ 0\right)$
  • where t is a sampling step along the line-of-sight direction, xc(t) represents the coordinate of the three-dimensional point when the sampling step along the line-of-sight direction is t, Sc(xc(t)) is the signed distance value of the three-dimensional point xc(t), and Φs(·) is a Sigmoid function.
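A numeric sketch of the density formula above, approximating the derivative of Φs(Sc(xc(t))) along the ray with finite differences; the unit-sphere SDF and the sigmoid sharpness s are illustrative assumptions standing in for the learned field.

```python
import numpy as np

def phi_s(x, s=50.0):
    """Sigmoid of the signed distance with sharpness s (assumed value)."""
    return 1.0 / (1.0 + np.exp(-s * x))

def density_along_ray(origin, direction, ts, sdf):
    """rho(t): clamped negative derivative of phi_s(sdf) divided by phi_s."""
    vals = phi_s(np.array([sdf(origin + t * direction) for t in ts]))
    dphi_dt = np.gradient(vals, ts)          # d/dt of Phi_s(Sc(xc(t)))
    return np.maximum(-dphi_dt / vals, 0.0)  # clamp at zero

sphere_sdf = lambda p: np.linalg.norm(p) - 1.0   # toy canonical shape
ts = np.linspace(0.0, 4.0, 256)
rho = density_along_ray(np.array([0.0, 0.0, -2.0]),
                        np.array([0.0, 0.0, 1.0]), ts, sphere_sdf)
# rho peaks where the ray first crosses the surface (near t = 1)
# and stays near zero elsewhere.
```

This construction concentrates density around the zero level set of the signed distance field, which is what makes volume rendering of an SDF sharp at the surface.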
  • Then, this method inputs the line-of-sight direction v and the geometric characteristic zgeo output by the implicit signed distance field estimation model Sc into the implicit color estimation model Cc, which outputs the color of each sampling point along the line-of-sight direction v.

  • $C_c\left(v, x_c(t)\right)=C_c\left(v, z_{geo}, n\right)\rightarrow RGB$
  • where v represents the line-of-sight direction calculated from the camera pose, zgeo represents the geometric characteristics output by the implicit signed distance field estimation model at xc(t), n represents the normal vector direction at xc(t), which can be obtained by differentiating the estimated signed distance field, and RGB represents the three-channel color of the three-dimensional point.
  • After obtaining the estimated density and color at each sampling point, a rendering result C(w,h) can be obtained by the following integration method:

  • $C(w,h)=\int_{t_n}^{t_f} T(t)\,\rho(t)\,C_c\left(v, x_c(t)\right)\mathrm{d}t$
  • where C(w,h) represents the rendered color value at pixel (w,h) of the two-dimensional image, tf and tn respectively represent the farthest and nearest sampling steps along the line-of-sight direction, Cc(v,xc(t)) is the color value of xc(t) in the line-of-sight direction v, and T(t) represents the transmittance at xc(t), which is obtained by integrating ρ(t):

  • $T(t)=\exp\left(-\int_{t_n}^{t}\rho(u)\,\mathrm{d}u\right)$
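The two integrals above can be discretized in the usual way, with the transmittance as a running product of per-sample transparencies and the pixel color as a density-weighted sum. A sketch with made-up sample values:

```python
import numpy as np

def volume_render(rho, colors, ts):
    """Discretized volume rendering of one ray from per-sample density and color."""
    deltas = np.diff(ts, append=ts[-1] + (ts[-1] - ts[-2]))   # sample spacings
    alpha = 1.0 - np.exp(-rho * deltas)                        # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], (1.0 - alpha)[:-1]]))  # T(t)
    weights = trans * alpha
    return (weights[:, None] * colors).sum(axis=0)             # rendered color

ts = np.linspace(0.0, 1.0, 8)
rho = np.array([0.0, 0.0, 0.0, 50.0, 50.0, 0.0, 0.0, 0.0])  # opaque slab mid-ray
colors = np.tile(np.array([1.0, 0.0, 0.0]), (8, 1))          # all samples red
pixel = volume_render(rho, colors, ts)
# Nearly all rendering weight lands on the slab, so the pixel is saturated red.
```

Because every operation is differentiable, gradients of a loss on the rendered pixel flow back to the density and color of each sample, which is what allows the three implicit models to be optimized by back propagation.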
  • Finally, the image C rendered by the deformable implicit neural radiance field and the original image I are used together to calculate the image authenticity loss function $\mathcal{L}_{photo}$:

  • $\mathcal{L}_{photo}=\sum_{w=0}^{W}\sum_{h=0}^{H}\left|I(w,h)-C(w,h)\right|$
  • where W and H represent the width and height of the input image respectively, and (w,h) represents the pixel coordinate of the image.
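A minimal sketch of this image authenticity loss, using toy 2x2 grayscale images with made-up values:

```python
import numpy as np

def photo_loss(I, C):
    """L1 sum over all pixel coordinates between captured and rendered images."""
    return np.abs(I - C).sum()

I = np.array([[0.2, 0.4],
              [0.6, 0.8]])   # captured image (toy values)
C = np.array([[0.2, 0.5],
              [0.6, 0.6]])   # rendered image (toy values)
loss = photo_loss(I, C)      # |0| + |0.1| + |0| + |0.2|
```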
  • In addition to the image authenticity loss function, the present application also adds a signed distance field regularization loss function $\mathcal{L}_{reg}$ to constrain the estimated signed distance field to retain the mathematical property that the normal vector (gradient) modulus at every point of the signed distance field is 1:

  • $\mathcal{L}_{reg}=\frac{1}{aN_c}\sum_{k,i}\left(\left\|\nabla S_c(\hat{p}_{k,i})\right\|-1\right)^2$
  • where a is the number of lines of sight and Nc is the number of sampling points on a single line of sight; this formula constrains the normal vector modulus at all sampling points to be 1. p̂k,i is the three-dimensional coordinate of the ith sampling point on the kth line of sight, and ∇Sc(p̂k,i) is the gradient of the signed distance field at p̂k,i.
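This regularization (the eikonal constraint) can be sketched by estimating the SDF gradient with central finite differences; a network implementation would use automatic differentiation instead. The unit-sphere SDF below is an illustrative assumption whose true gradient norm is exactly 1, so the loss should be near zero.

```python
import numpy as np

def sdf_gradient(sdf, p, eps=1e-4):
    """Central finite-difference gradient of the SDF at point p."""
    g = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = eps
        g[i] = (sdf(p + step) - sdf(p - step)) / (2.0 * eps)
    return g

def eikonal_loss(sdf, points):
    """Mean squared deviation of the gradient norm from 1."""
    norms = np.array([np.linalg.norm(sdf_gradient(sdf, p)) for p in points])
    return ((norms - 1.0) ** 2).mean()

sphere_sdf = lambda p: np.linalg.norm(p) - 1.0   # a valid SDF
points = np.random.default_rng(0).normal(size=(16, 3))
loss = eikonal_loss(sphere_sdf, points)
```

A field that violates the unit-gradient property (for example, 2 * sphere_sdf) would incur a large loss, which is how the regularizer keeps the learned field a proper signed distance field.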
  • By combining the image authenticity loss function $\mathcal{L}_{photo}$ and the signed distance field regularization loss function $\mathcal{L}_{reg}$, the complete loss function $\mathcal{L}$ is obtained:

  • $\mathcal{L}=\mathcal{L}_{photo}+\mathcal{L}_{reg}$
  • The loss function value is used to update, by back propagation, the parameters of the neural networks and the deformation code of each observation frame.
  • S2.3, the implicit signed distance field of the deformable implicit neural radiance field is post-processed by an isosurface extraction method, and a high-quality explicit three-dimensional human model is obtained.
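The isosurface extraction step can be illustrated crudely by sampling the signed distance field on a voxel grid and collecting the voxels near the zero level set. A production system would run marching cubes to produce a triangle mesh, which this sketch does not implement; the unit-sphere SDF stands in for the learned field.

```python
import numpy as np

def near_surface_voxels(sdf, lo=-1.5, hi=1.5, n=32):
    """Return indices of voxel centers within roughly one voxel of the surface."""
    xs = np.linspace(lo, hi, n)
    grid = np.array([[[sdf(np.array([x, y, z])) for z in xs]
                      for y in xs] for x in xs])
    near = np.abs(grid) < (hi - lo) / n   # |SDF| below one voxel spacing
    return np.argwhere(near), xs

sphere_sdf = lambda p: np.linalg.norm(p) - 1.0   # toy learned field
voxels, xs = near_surface_voxels(sphere_sdf)
centers = xs[voxels]                             # voxel centers near the surface
```

For the toy sphere, every selected voxel center lies close to radius 1; marching cubes would then interpolate exact zero crossings inside these voxels and triangulate them into the explicit mesh.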
  • FIGS. 3(a) and 3(b) are effect diagrams of the reconstruction result of a still human body according to an embodiment of the present application.
  • The present application also provides an embodiment of a data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone, corresponding to the embodiment of a data acquisition and reconstruction method for human body three-dimensional modeling.
  • Referring to FIG. 4 , the data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone according to the embodiment of the present application includes a data acquisition module and a reconstruction module.
  • The data acquisition module is used to virtually place a human parametric template mesh in an acquisition scene by augmented reality technology so that a user acquires video data following a visual guidance on the human parametric template mesh, and to extract image frames from the video data and send them to the reconstruction module; refer to the above step S1 for the implementation of this module.
  • The reconstruction module is used to estimate a camera pose and camera intrinsics corresponding to all image frames, use a deformable implicit neural radiance field to model a human body in three dimensions, and optimize an implicit spatial deformation field estimation model, an implicit signed distance field estimation model and an implicit color estimation model by volume rendering to obtain a three-dimensional human body model; refer to the above step S2 for the implementation of this module.
  • In one embodiment, a computer device is proposed, which includes a memory and a processor; computer-readable instructions are stored in the memory, and when the computer-readable instructions are executed by the processor, the processor is caused to execute the steps in the data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone in the above embodiment.
  • In one embodiment, a storage medium storing computer-readable instructions is proposed, and when the computer-readable instructions are executed by one or more processors, the one or more processors execute the steps in the data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone in the above embodiment. The storage medium can be a non-volatile storage medium.
  • Those skilled in the art can appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing relevant hardware; the program may be stored in a computer-readable storage medium, which may include a read-only memory, a random access memory, a magnetic disk or an optical disk, etc.
  • It should also be noted that the terms “including”, “include” or any other variation thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or equipment including a series of elements includes not only those elements, but also other elements not explicitly listed, or elements inherent to such process, method, commodity or equipment. Without more restrictions, an element defined by the phrase “including one” does not exclude the existence of other identical elements in the process, method, commodity or equipment including the element.
  • The above is only a preferred embodiment of one or more embodiments of this specification, and it is not intended to limit one or more embodiments of this specification. Any modification, equivalent substitution, improvement and the like made within the spirit and principle of one or more embodiments of this description shall be included in the scope of protection of one or more embodiments of this description.

Claims (6)

What is claimed is:
1. A data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone, comprising:
step S1, data acquisition based on augmented reality technology, comprising:
step S1.1, a subject standing in a scene, keeping a posture with spread human body surface conducive to reconstruction, a user capturing 360° view on the subject via a mobile phone, fitting a human parametric template mesh according to a body shape and a posture of the subject from multiple view angles, and rendering the human parametric template mesh in a scene position where the subject stands by augmented reality technology, so as to approach a visual effect that the human parametric template mesh and the subject are overlapped; and
step S1.2, during the data acquisition, guiding the user for a data acquisition process by using the fitted human parametric template mesh, determining whether a single face on the human parametric template mesh at a current perspective is effectively observed, wherein when a face meets both the distance standard and the line-of-sight angle standard for effective observation at a certain perspective, an effective observation count of the face being increased by one; and
wherein when the effective observation count of the face reaches a set number threshold, the face has a sufficient number of observations, the color mark of the face is changed, and the user is informed that acquisition at a position of the face has been completed; the camera is moved to acquire data in areas that have not been sufficiently observed; when all faces on the human parametric template mesh have changed color, the data acquisition process is completed;
step S2, reconstruction of a three-dimensional human model based on a deformable implicit neural radiance field, comprising:
step S2.1, extracting a video acquired in S1.2 into a series of image sequences captured around a human body, and estimating a camera pose and camera intrinsics corresponding to captured images according to a matching relationship of feature points among the images; and
step S2.2, modelling a human body in three dimensions using the deformable implicit neural radiance field, wherein the deformable implicit neural radiance field comprises an implicit spatial deformation field estimation model, an implicit signed distance field estimation model and an implicit color estimation model;
establishing the implicit spatial deformation field estimation model from an observation frame coordinate system corresponding to each image frame to a canonical space using a neural network, wherein an input of the implicit spatial deformation field estimation model is a coordinate of a three-dimensional point in the observation frame coordinate system, and an output of the implicit spatial deformation field estimation model is a coordinate of the three-dimensional point in a canonical coordinate system;
establishing the implicit signed distance field estimation model for expressing a canonical shape in the canonical space using the neural network, wherein an input of the implicit signed distance field estimation model is a coordinate of the three-dimensional point in the canonical space, and an output of the implicit signed distance field estimation model is a signed distance and geometric characteristics of the three-dimensional point;
establishing the implicit color estimation model for observing colors of the three-dimensional point from a specific direction in the canonical space using the neural network, wherein an input of the implicit color estimation model is the geometric characteristics of the three-dimensional point output by the implicit signed distance field estimation model and a vector representing a line of sight, and an output is a color of each sampling point along a specific line of sight estimated by the model;
optimizing the implicit spatial deformation field estimation model, the implicit signed distance field estimation model and the implicit color estimation model based on the camera pose and the camera intrinsics corresponding to the images obtained in S2.1 by volume rendering on an input image set to obtain an implicit three-dimensional human body model; and
post-processing an implicit signed distance field of the deformable implicit neural radiance field by an isosurface extraction method to obtain an explicit three-dimensional human model.
2. The data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone according to claim 1, wherein the step S1.1 comprises:
running a localization and mapping algorithm to obtain the camera pose in real time during data acquisition;
running a human keypoint detection algorithm to obtain human keypoint positions on the captured images in real time; and
fitting the human parametric template mesh to captured positions according to the camera pose and the human body keypoint positions, so as to achieve the visual effect that the human parametric template mesh and the subject are overlapped visually, and completing data acquisition according to guidance of the human parametric template mesh.
3. The data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone according to claim 1, wherein the step S1.2 further comprises:
calculating a distance between an optical center of a camera and a center point of the single face based on a real-time localization result of the camera, wherein when the distance is less than a set distance threshold, the face meets a distance standard for effective observation at the current perspective; and
calculating a connecting line between the optical center of the camera and the center point of the face based on the real-time localization result of the camera, wherein when an angle between the connecting line and a normal vector of the face is less than a set line-of-sight angle threshold, the face meets a view direction standard for effective observation at the current perspective.
4. The data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone according to claim 1, wherein the step S2.1 further comprises: performing sparse reconstruction for the image sequences by a structure from motion method, wherein an input of the structure from motion method is a series of image frames captured by the mobile phone around the human body, and an output is the camera pose and camera intrinsics corresponding to the captured images and sparse point clouds reconstructed according to the captured images.
5. The data acquisition and reconstruction method for human body three-dimensional modeling based on a single mobile phone according to claim 1, wherein in the step S2.2, a deformation coding of each observation frame and the implicit spatial deformation field estimation model, the implicit signed distance field estimation model and the implicit color estimation model are updated by back propagation according to a loss function of image authenticity and a regularization loss function of a signed distance field.
6. A data acquisition and reconstruction system for human body three-dimensional modeling based on a single mobile phone, for implementing the method according to claim 1, comprising a data acquisition module and a reconstruction module;
wherein the data acquisition module is configured to virtually place a human parametric template mesh in an acquisition scene by augmented reality technology so that a user acquires video data following a visual guidance on the human parametric template mesh, and to extract image frames from the video data and send them to the reconstruction module; and
wherein the reconstruction module is configured to estimate a camera pose and camera intrinsics corresponding to all image frames, use a deformable implicit neural radiance field to model a human body in three dimensions, and optimize an implicit spatial deformation field estimation model, an implicit signed distance field estimation model and an implicit color estimation model by volume rendering to obtain a three-dimensional human body model.
US18/542,825 2022-07-06 2023-12-18 Data acquisition and reconstruction method and system for human body three-dimensional modeling based on single mobile phone Pending US20240153213A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN202210788579.7 2022-07-06
CN202210788579.7A CN114863037B (en) 2022-07-06 2022-07-06 Single-mobile-phone-based human body three-dimensional modeling data acquisition and reconstruction method and system
PCT/CN2022/125581 WO2024007478A1 (en) 2022-07-06 2022-10-17 Three-dimensional human body modeling data collection and reconstruction method and system based on single mobile phone

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125581 Continuation WO2024007478A1 (en) 2022-07-06 2022-10-17 Three-dimensional human body modeling data collection and reconstruction method and system based on single mobile phone

Publications (1)

Publication Number Publication Date
US20240153213A1 true US20240153213A1 (en) 2024-05-09

Family

ID=82626064

Family Applications (1)

Application Number Title Priority Date Filing Date
US18/542,825 Pending US20240153213A1 (en) 2022-07-06 2023-12-18 Data acquisition and reconstruction method and system for human body three-dimensional modeling based on single mobile phone

Country Status (3)

Country Link
US (1) US20240153213A1 (en)
CN (1) CN114863037B (en)
WO (1) WO2024007478A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114863037B (en) * 2022-07-06 2022-10-11 杭州像衍科技有限公司 Single-mobile-phone-based human body three-dimensional modeling data acquisition and reconstruction method and system
CN116703995B (en) * 2022-10-31 2024-05-14 荣耀终端有限公司 Video blurring processing method and device
CN116468767B (en) * 2023-03-28 2023-10-13 南京航空航天大学 Airplane surface reconstruction method based on local geometric features and implicit distance field
CN117333637B (en) * 2023-12-01 2024-03-08 北京渲光科技有限公司 Modeling and rendering method, device and equipment for three-dimensional scene
CN117765187B (en) * 2024-02-22 2024-04-26 成都信息工程大学 Monocular saphenous nerve mapping method based on multi-modal depth estimation guidance
CN117953544A (en) * 2024-03-26 2024-04-30 安徽农业大学 Target behavior monitoring method and system

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ITTO20111150A1 (en) * 2011-12-14 2013-06-15 Univ Degli Studi Genova PERFECT THREE-DIMENSIONAL STEREOSCOPIC REPRESENTATION OF VIRTUAL ITEMS FOR A MOVING OBSERVER
US10540817B2 (en) * 2017-03-03 2020-01-21 Augray Pvt. Ltd. System and method for creating a full head 3D morphable model
CN112446961A (en) * 2019-08-30 2021-03-05 中兴通讯股份有限公司 Scene reconstruction system and method
CN114245000A (en) * 2020-09-09 2022-03-25 北京小米移动软件有限公司 Shooting method and device, electronic equipment and storage medium
KR20230079177A (en) * 2020-09-30 2023-06-05 스냅 인코포레이티드 Procedurally generated augmented reality content creators
US20240005590A1 (en) * 2020-11-16 2024-01-04 Google Llc Deformable neural radiance fields
CN112465955B (en) * 2020-12-10 2023-04-07 浙江大学 Dynamic human body three-dimensional reconstruction and visual angle synthesis method
CN113421328B (en) * 2021-05-27 2022-03-11 中国人民解放军军事科学院国防科技创新研究院 Three-dimensional human body virtual reconstruction method and device
CN114118367B (en) * 2021-11-16 2024-03-29 上海脉衍人工智能科技有限公司 Method and equipment for constructing incremental nerve radiation field
CN114241113A (en) * 2021-11-26 2022-03-25 浙江大学 Efficient nerve radiation field rendering method based on depth-guided sampling
CN114004941B (en) * 2022-01-04 2022-08-16 苏州浪潮智能科技有限公司 Indoor scene three-dimensional reconstruction system and method based on nerve radiation field
CN114119839B (en) * 2022-01-24 2022-07-01 阿里巴巴(中国)有限公司 Three-dimensional model reconstruction and image generation method, equipment and storage medium
CN114581571A (en) * 2022-03-04 2022-06-03 杭州像衍科技有限公司 Monocular human body reconstruction method and device based on IMU and forward deformation field
CN114648613B (en) * 2022-05-18 2022-08-23 杭州像衍科技有限公司 Three-dimensional head model reconstruction method and device based on deformable nerve radiation field
CN114863037B (en) * 2022-07-06 2022-10-11 杭州像衍科技有限公司 Single-mobile-phone-based human body three-dimensional modeling data acquisition and reconstruction method and system

Also Published As

Publication number Publication date
WO2024007478A1 (en) 2024-01-11
CN114863037A (en) 2022-08-05
CN114863037B (en) 2022-10-11

Similar Documents

Publication Publication Date Title
US20240153213A1 (en) Data acquisition and reconstruction method and system for human body three-dimensional modeling based on single mobile phone
CN111598998B (en) Three-dimensional virtual model reconstruction method, three-dimensional virtual model reconstruction device, computer equipment and storage medium
US10679046B1 (en) Machine learning systems and methods of estimating body shape from images
WO2021175050A1 (en) Three-dimensional reconstruction method and three-dimensional reconstruction device
CN110276317B (en) Object size detection method, object size detection device and mobile terminal
CN111932678B (en) Multi-view real-time human motion, gesture, expression and texture reconstruction system
WO2020134818A1 (en) Image processing method and related product
CN113674400A (en) Spectrum three-dimensional reconstruction method and system based on repositioning technology and storage medium
CN114913552B (en) Three-dimensional human body density corresponding estimation method based on single-view-point cloud sequence
Zhang et al. Data-driven flower petal modeling with botany priors
CN112598780A (en) Instance object model construction method and device, readable medium and electronic equipment
CN111325828B (en) Three-dimensional face acquisition method and device based on three-dimensional camera
WO2022110877A1 (en) Depth detection method and apparatus, electronic device, storage medium and program
CN115039137A (en) Method for rendering virtual objects based on luminance estimation, method for training a neural network, and related product
CN112102504A (en) Three-dimensional scene and two-dimensional image mixing method based on mixed reality
CN109166176B (en) Three-dimensional face image generation method and device
CN114913287B (en) Three-dimensional human body model reconstruction method and system
CN113808256B (en) High-precision holographic human body reconstruction method combined with identity recognition
CN112435345B (en) Human body three-dimensional measurement method and system based on deep learning
CN112613357B (en) Face measurement method, device, electronic equipment and medium
CN112711984B (en) Fixation point positioning method and device and electronic equipment
WO2022011560A1 (en) Image cropping method and apparatus, electronic device, and storage medium
CN108566545A (en) The method that three-dimensional modeling is carried out to large scene by mobile terminal and ball curtain camera
CN113011250A (en) Hand three-dimensional image recognition method and system
Agus et al. PEEP: Perceptually Enhanced Exploration of Pictures.

Legal Events

Date Code Title Description
AS Assignment

Owner name: IMAGE DERIVATIVE INC., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BAO, HUJUN;SUN, JIAMING;LUO, YUNSHENG;AND OTHERS;SIGNING DATES FROM 20231128 TO 20231130;REEL/FRAME:065989/0545

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED