WO2019219013A1 - Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model - Google Patents

Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model

Info

Publication number
WO2019219013A1
WO2019219013A1 (PCT/CN2019/086890)
Authority
WO
WIPO (PCT)
Prior art keywords
model
human body
vertex
rigid motion
motion
Prior art date
Application number
PCT/CN2019/086890
Other languages
English (en)
French (fr)
Inventor
Liu Yebin
Dai Qionghai
Fang Lu
Xu Feng
Original Assignee
Tsinghua University
Graduate School at Shenzhen, Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University and Graduate School at Shenzhen, Tsinghua University
Publication of WO2019219013A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85Stereo camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person

Definitions

  • The present invention relates to the technical field of computer vision and computer graphics, and in particular to a three-dimensional reconstruction method and system for jointly optimizing a human body posture and appearance model.
  • Dynamic three-dimensional reconstruction of the human body is a key problem in the fields of computer graphics and computer vision.
  • High-quality dynamic three-dimensional models of human-related subjects, such as human bodies, animals, faces and hands, have broad application prospects and important application value in fields such as film and entertainment, sports and gaming, and virtual reality.
  • However, the acquisition of high-quality 3D models usually relies on expensive laser scanners or multi-camera array systems.
  • Although the accuracy is high, there are also some shortcomings: first, the subject is required to remain absolutely still during scanning, and even slight movement leads to obvious errors in the scanning result; second, the equipment is costly and hard to bring into the daily lives of ordinary people, so it is mostly used by large companies or national statistical agencies; third, the process is slow, often taking at least ten minutes to several hours to reconstruct a single 3D model, and reconstructing a dynamic model sequence is even more expensive.
  • In the related art, dynamic human body reconstruction focuses on three lines of research. The first is reconstructing the dynamic appearance surface of the object; because appearance surfaces are rich and diverse, this generally requires complex capture setups, such as acquisition and reconstruction with a multi-camera array.
  • The second is reconstructing human body shape and pose; by parameterizing shape and pose, the number of variables to be reconstructed is greatly reduced.
  • Such prior-art techniques can reconstruct in real time with a single depth camera, but they cannot obtain a three-dimensional model of the object's surface appearance.
  • The third, frame-by-frame dynamic surface fusion, can achieve template-free dynamic three-dimensional reconstruction, but because it uses only non-rigid surface deformation, its tracking and reconstruction robustness is low.
  • The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
  • To this end, an object of the present invention is to propose a three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model, which effectively improves the real-time performance, robustness and accuracy of reconstruction, is highly scalable, and is simple to implement.
  • Another object of the present invention is to propose a three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model.
  • To this end, an embodiment of one aspect of the present invention provides a three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model, including the following steps: capturing a depth map of a human body to obtain a single depth image; converting the single depth image into a three-dimensional point cloud, and obtaining matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model; establishing an energy function from the matching point pairs, and jointly solving the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model; solving the energy function, and aligning the reconstructed model with the three-dimensional point cloud according to the solution result; and updating and completing the aligned model with the depth map to perform real-time dynamic three-dimensional reconstruction of the human body.
  • In the three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to the embodiments of the present invention, the three-dimensional information of the dynamic object surface is fused frame by frame by a real-time non-rigid alignment method, achieving robust tracking and robust real-time dynamic three-dimensional reconstruction of human objects without a three-dimensional template from a first key frame, which effectively improves the real-time performance, robustness and accuracy of reconstruction, and is highly scalable and easy to implement.
  • the three-dimensional reconstruction method for jointly optimizing the human body posture and appearance model according to the above embodiment of the present invention may further have the following additional technical features:
  • the energy function is:
  • E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
  • where E_mot is the total energy term of the motion solve;
  • E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model;
  • E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion;
  • E_reg is a local rigid motion constraint term, which acts on the non-rigid motion data term;
  • E_prior is a regularization term on body posture motion, used to constrain the plausibility of the solved human pose; and
  • λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
  • Here u represents the position coordinates of the three-dimensional point cloud point in a matching point pair;
  • P represents the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
  • the model vertex coordinates and their normals after being driven by the non-rigid motion and by the body posture parameters also enter these terms (their symbols appear as formula images in the description); τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection;
  • and i denotes the i-th vertex on the model.
  • the depth map projection formula projects each pixel into three-dimensional space using the depth camera intrinsic matrix, where
  • u, v are pixel coordinates, and
  • d(u, v) is the depth value at pixel (u, v) of the depth image.
  • model vertices are driven according to the non-rigid motion and the body posture model parameters, where, in the calculation formula,
  • the deformation matrix acting on vertex v_i includes a rotation part and a translation part; the rotation part of this deformation matrix and the set of bones that drive vertex v_i also appear in the formula; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself, and rot(T_bj) is the rotation part of that deformation matrix.
  • To this end, an embodiment of another aspect of the present invention provides a three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model, including: a depth camera, configured to capture a depth map of a human body to obtain a single depth image; a matching module, configured to convert the single depth image into a three-dimensional point cloud and obtain matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model; a motion solving module, configured to establish an energy function from the matching point pairs and jointly solve the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model; a solving module, configured to solve the energy function and align the reconstructed model with the three-dimensional point cloud according to the solution result; and a model updating module, configured to update and complete the aligned model with the depth map to perform real-time dynamic three-dimensional reconstruction of the human body.
  • In the three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to the embodiments of the present invention, the three-dimensional information of the dynamic object surface is fused frame by frame by a real-time non-rigid alignment method, achieving robust tracking and robust real-time dynamic three-dimensional reconstruction of human objects without a three-dimensional template from a first key frame, which effectively improves the real-time performance, robustness and accuracy of reconstruction, and is highly scalable and easy to implement.
  • the three-dimensional reconstruction system for jointly optimizing the human body posture and appearance model according to the above embodiment of the present invention may further have the following additional technical features:
  • the energy function is:
  • E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
  • where E_mot is the total energy term of the motion solve;
  • E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model;
  • E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion;
  • E_reg is a local rigid motion constraint term, which acts on the non-rigid motion data term;
  • E_prior is a regularization term on body posture motion, used to constrain the plausibility of the solved human pose; and
  • λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
  • Here u represents the position coordinates of the three-dimensional point cloud point in a matching point pair;
  • P represents the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
  • the model vertex coordinates and their normals after being driven by the non-rigid motion and by the body posture parameters also enter these terms (their symbols appear as formula images in the description); τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection;
  • and i denotes the i-th vertex on the model.
  • the depth map projection formula projects each pixel into three-dimensional space using the depth camera intrinsic matrix, where
  • u, v are pixel coordinates, and
  • d(u, v) is the depth value at pixel (u, v) of the depth image.
  • model vertices are driven according to the non-rigid motion and the body posture model parameters, where, in the calculation formula,
  • the deformation matrix acting on vertex v_i includes a rotation part and a translation part; the rotation part of this deformation matrix and the set of bones that drive vertex v_i also appear in the formula; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself, and rot(T_bj) is the rotation part of that deformation matrix.
  • FIG. 1 is a flow chart of a three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram showing the structure of a three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to an embodiment of the present invention.
  • A three-dimensional reconstruction method and system for jointly optimizing a human body posture and appearance model according to embodiments of the present invention will be described below with reference to the accompanying drawings; the three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model is described first.
  • FIG. 1 is a flow chart of a three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to an embodiment of the present invention.
  • the method for jointly optimizing the three-dimensional reconstruction of the human body posture and appearance model includes the following steps:
  • In step S101, a depth map of the human body is captured to obtain a single depth image.
  • For example, an embodiment of the present invention may use a single depth camera to capture the target person and obtain a dynamic depth image sequence at video frame rate. That is, the moving human body is captured with a depth camera to obtain a continuous sequence of single depth images.
  • In step S102, the single depth image is converted into a three-dimensional point cloud, and matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model are obtained.
  • It can be understood that the depth camera intrinsic matrix is obtained, and the depth map is projected into three-dimensional space according to the intrinsic matrix to generate a set of three-dimensional points.
  • The depth map projection formula maps each pixel into three-dimensional space, where
  • u, v are pixel coordinates, and
  • d(u, v) is the depth value at pixel (u, v) of the depth image.
  • Specifically, the intrinsic matrix of the depth camera is obtained, and the depth map is projected into three-dimensional space according to the intrinsic matrix and converted into a set of three-dimensional points.
  • In the conversion formula, u, v are pixel coordinates, d(u, v) is the depth value at pixel (u, v) of the depth image, and the matrix used is the depth camera intrinsic matrix.
  • As for obtaining the matching point pairs, the vertices of the three-dimensional model are projected onto the depth image using the camera projection formula to obtain the matching point pairs; a sketch of both operations follows.
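  • The following is a minimal sketch of this step, not the patented implementation: it assumes a pinhole intrinsic matrix K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]], a depth map in metric units, and simple nearest-pixel projective association; function names and array shapes are illustrative only.

      import numpy as np

      def depth_to_point_cloud(depth, K):
          """Back-project the depth map d(u, v) into a set of 3D points
          using the depth camera intrinsic matrix K (pinhole model)."""
          fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
          v, u = np.nonzero(depth > 0)              # valid pixels only
          d = depth[v, u]
          x = (u - cx) / fx * d
          y = (v - cy) / fy * d
          return np.stack([x, y, d], axis=1)        # (N, 3) point cloud

      def match_model_to_depth(vertices, depth, K):
          """Projective data association: project model vertices into the
          depth image and pair each vertex with the point observed there."""
          fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
          h, w = depth.shape
          pairs = []
          for i, (x, y, z) in enumerate(vertices):
              if z <= 0:
                  continue
              u = int(round(fx * x / z + cx))
              v = int(round(fy * y / z + cy))
              if 0 <= u < w and 0 <= v < h and depth[v, u] > 0:
                  d = depth[v, u]
                  target = np.array([(u - cx) / fx * d, (v - cy) / fy * d, d])
                  pairs.append((i, target))         # (vertex index, observed point u)
          return pairs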
  • In step S103, an energy function is established from the matching point pairs, and the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model are jointly solved.
  • It can be understood that the non-rigid motion and the parametric body posture information of the human body are obtained by constructing and solving the energy function from the matching point pairs.
  • the energy function is:
  • E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
  • where E_mot is the total energy term of the motion solve;
  • E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model;
  • E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion;
  • E_reg is a local rigid motion constraint term, which acts on the non-rigid motion data term;
  • E_prior is a regularization term on body posture motion, used to constrain the plausibility of the solved human pose; and
  • λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
  • Here u represents the position coordinates of the three-dimensional point cloud point in the same matching point pair;
  • P represents the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
  • the model vertex coordinates and their normals after being driven by the body posture parameters also enter the data term (their symbols appear as formula images in the description);
  • τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection;
  • i represents the i-th vertex on the model in the local rigid motion constraint term;
  • and the set of neighboring vertices around the i-th vertex on the model, the driving effect of the known non-rigid motion on the model surface vertices x_i and x_j, and the position transformation effect on x_j of the non-rigid motions acting on x_i and x_j likewise appear there; in the body posture and non-rigid motion consistency constraint term, the model vertex coordinates driven by the non-rigid motion appear; and in the body pose motion regularization term, μ_j and δ_j represent the Gaussian mixture weights and the means and variances of the Gaussian models, respectively.
  • Specifically, u represents the position coordinates of the three-dimensional point cloud point in the same matching point pair, and P represents the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation.
  • The data term E_data ensures that the reconstructed model driven by the non-rigid motion and the parametric human body posture model (i.e., the model vertex coordinates and normals driven by the body posture parameters) are aligned as well as possible with the three-dimensional point cloud obtained from the depth map.
  • The local rigid motion constraint term E_reg makes the model as a whole move under local rigidity constraints while still allowing large yet plausible non-rigid motion to be solved well, so that the model aligns with the three-dimensional point cloud more accurately; the body posture model and non-rigid motion consistency constraint term E_bind ensures that the solved body posture model and the non-rigid motion are as consistent as possible, so that the finally solved non-rigid motion both conforms to the human skeleton kinematic model and is fully aligned with the three-dimensional point cloud obtained from the depth map; and the body pose motion regularization term E_prior uses a Gaussian mixture model to constrain the correctness of the human pose, so that an abnormal solved pose makes this energy term large until the pose is solved correctly. A sketch of how the four weighted terms combine is given below.
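  • Purely as an illustration of the structure of E_mot, the sketch below assembles the total energy as the weighted sum described above; the individual term implementations are passed in as callables and the weight values shown are hypothetical placeholders, not values disclosed in the patent. Energies of this form are typically minimized with an iterative Gauss-Newton style solver, which is also an assumption here.

      def total_energy(params, terms, weights):
          """E_mot = λ_data·E_data + λ_bind·E_bind + λ_reg·E_reg + λ_prior·E_prior.
          `terms` maps a term name to a callable returning that term's scalar energy."""
          return sum(weights[name] * terms[name](params) for name in weights)

      # Hypothetical weights for illustration only; the patent does not disclose them.
      weights = {"data": 1.0, "bind": 1.0, "reg": 5.0, "prior": 0.01}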
  • In the local rigid motion constraint term E_reg, i represents the i-th vertex on the model, and
  • ρ(|W_i − W_j|²) is a robust penalty function, where
  • W_i and W_j respectively represent the driving effects of the parametric human body posture model on the model surface vertices x_i and x_j.
  • When the driving effects of the parametric body posture model on two adjacent surface vertices differ greatly, the value of the robust penalty function is small;
  • when the driving effects on the two adjacent vertices differ little, the value of the robust penalty function is larger. Through this robust penalty, the model as a whole is constrained by local rigidity while large yet plausible non-rigid motion can still be solved.
  • The body pose motion regularization term E_prior uses a Gaussian mixture model to constrain the correctness of the human pose.
  • An abnormal solved pose causes this energy term to be large until the pose is solved correctly; a sketch of such a prior follows.
  • μ_j and δ_j represent the Gaussian mixture weights and the means and variances of the Gaussian models, respectively.
  • In step S104, the energy function is solved, and the reconstructed model is aligned with the three-dimensional point cloud according to the solution result.
  • Further, the model vertices are driven according to the non-rigid motion and the body posture model parameters, where, in the calculation formula,
  • the deformation matrix acting on vertex v_i includes a rotation part and a translation part; the rotation part of this deformation matrix and the set of bones that drive vertex v_i also appear in the formula; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself, and rot(T_bj) is the rotation part of that deformation matrix. A generic sketch of this kind of vertex driving follows.
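  • The sketch below shows the general form of such skinning-style vertex driving: each vertex is transformed by a blend of the bone motion matrices T_bj weighted by α_i,j, and its normal by the corresponding rotation parts. It is a generic linear-blend-skinning sketch under that reading, not the exact patented formula (whose image is not reproduced in this text); normal renormalization is omitted.

      import numpy as np

      def drive_vertices(vertices, normals, bone_transforms, weights):
          """Drive model vertices and normals by bone motions.
          bone_transforms: (B, 4, 4) per-bone motion matrices T_bj.
          weights: (N, B) driving weights α_ij (assumed to sum to 1 per row)."""
          driven_v = np.empty_like(vertices)
          driven_n = np.empty_like(normals)
          for i, (v, n) in enumerate(zip(vertices, normals)):
              # Blend the bone transforms acting on vertex i.
              T = sum(weights[i, j] * bone_transforms[j] for j in range(len(bone_transforms)))
              driven_v[i] = T[:3, :3] @ v + T[:3, 3]   # rotation + translation on position
              driven_n[i] = T[:3, :3] @ n              # rotation part only on the normal
          return driven_v, driven_n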
  • Specifically, the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the body posture parameters are solved jointly.
  • The information obtained from the final solve is the transformation matrix of every three-dimensional model vertex and the body posture parameters, that is, an individual transformation matrix for each bone.
  • To meet the requirement of a fast linear solve, the method of the embodiment of the present invention approximates the deformation equation using the exponential map:
  • the cumulative transformation matrix of the model vertex v_i up to the previous frame is a known quantity;
  • I is the 4×4 identity matrix; and
  • for each vertex the unknown parameters to be solved are the six-dimensional transformation parameters x = (v_1, v_2, v_3, w_x, w_y, w_z)^T. The linearization of the bone motion is the same as that of the non-rigid motion. A sketch of this linearization is given below.
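  • A minimal sketch of this linearization, assuming the standard small-motion exponential-map approximation: the per-vertex (or per-bone) update is the 4×4 matrix I plus the twist matrix built from x = (v1, v2, v3, wx, wy, wz), applied on top of the known accumulated transform from the previous frame. Details such as solver ordering are assumptions.

      import numpy as np

      def twist_matrix(x):
          """Build the 4x4 update matrix (I + ξ̂) for a small motion
          x = (v1, v2, v3, wx, wy, wz): translation v, rotation w."""
          v1, v2, v3, wx, wy, wz = x
          xi_hat = np.array([[0.0, -wz,  wy, v1],
                             [ wz, 0.0, -wx, v2],
                             [-wy,  wx, 0.0, v3],
                             [0.0, 0.0, 0.0, 0.0]])
          return np.eye(4) + xi_hat

      def apply_update(T_prev, x, vertex):
          """Apply the linearized update to a vertex already transformed by the
          accumulated matrix T_prev from the previous frame (a known quantity)."""
          v_h = np.append(vertex, 1.0)                 # homogeneous coordinates
          return (twist_matrix(x) @ (T_prev @ v_h))[:3]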
  • In step S105, the aligned model is updated and completed using the depth map to perform real-time dynamic three-dimensional reconstruction of the human body.
  • It can be understood that the depth map is used to update and complete the aligned model and to further optimize the body shape parameters of the parametric human body template.
  • Specifically, the depth image is used to update and complete the aligned 3D model: the newly obtained depth information is fused into the 3D model, and surface vertex positions are updated or new vertices are added so that the model better matches the expression of the current depth image. Since the updated model incorporates the new information, the new model is used to solve the parametric human body shape more accurately. A simplified sketch of such an update step follows.
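  • A highly simplified sketch of the update step, only to illustrate the idea of fusing new depth observations into the aligned model: matched vertices are pulled toward their observed points with a running weight, and unexplained observations become new vertices. The patented fusion scheme (its data structures, weighting and thresholds) is not specified in this text, so everything below is an assumption.

      import numpy as np

      def update_model(vertices, vertex_weights, matches, new_points):
          """Fuse new depth observations into the aligned model.
          matches: list of (vertex_index, observed_point) pairs for this frame.
          new_points: (M, 3) observed points with no nearby model vertex."""
          for i, p in matches:
              w = vertex_weights[i]
              # Weighted running average nudges the vertex toward the observation.
              vertices[i] = (w * vertices[i] + p) / (w + 1.0)
              vertex_weights[i] = w + 1.0
          # Observations not explained by the current surface become new vertices.
          vertices = np.vstack([vertices, new_points])
          vertex_weights = np.concatenate([vertex_weights, np.ones(len(new_points))])
          return vertices, vertex_weights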
  • In summary, with a single depth camera the embodiments of the present invention can simultaneously reconstruct the dynamic human appearance surface model (for example, clothing, hats, backpacks, etc.) and the dynamic inner body posture model; the method reconstructs in real time and requires only a single depth camera as input.
  • The system uses simple equipment and has advantages such as easy deployment and extensibility.
  • The required input information is very easy to collect, and the dynamic 3D model can be obtained in real time.
  • The method is accurate and robust, simple to implement, runs in real time, has broad application prospects, and can be quickly implemented on hardware systems such as PCs or workstations.
  • According to the three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model proposed in the embodiments of the present invention, the three-dimensional information of the dynamic object surface is fused frame by frame by a real-time non-rigid alignment method, achieving robust tracking and robust real-time dynamic three-dimensional reconstruction of human objects without a three-dimensional template from a first key frame, which effectively improves the real-time performance, robustness and accuracy of reconstruction, and is highly scalable and easy to implement.
  • FIG. 2 is a schematic structural view of a three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to an embodiment of the present invention.
  • the three-dimensional reconstruction system 10 for jointly optimizing a human body posture and appearance model includes a depth camera 100, a matching module 200, a motion solving module 300, a solution module 400, and a model updating module 500.
  • the depth camera 100 is configured to perform depth map shooting on a human body to obtain a single depth image.
  • the matching module 200 is configured to transform the single depth image into a three-dimensional point cloud, and acquire a matching point pair between the three-dimensional point cloud and the reconstructed model vertex and the parametric human body model vertex.
  • the motion solution module 300 is configured to establish an energy function according to the pair of matching points, and jointly solve the non-rigid motion position transformation parameters and the parameterized human body model parameters of each vertex on the reconstruction model.
  • the solving module 400 is configured to solve the energy function and align the reconstructed model with the three-dimensional point cloud according to the solution result.
  • the model update module 500 is configured to update and complement the aligned model through the depth map to perform real-time human body dynamic three-dimensional reconstruction.
  • the system 10 of the embodiment of the present invention can effectively improve the real-time performance, robustness and accuracy of the reconstruction, and has strong scalability and is easy to implement.
  • the energy function is:
  • E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
  • where E_mot is the total energy term of the motion solve;
  • E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model;
  • E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion;
  • E_reg is a local rigid motion constraint term, which acts on the non-rigid motion data term;
  • E_prior is a regularization term on body posture motion, used to constrain the plausibility of the solved human pose; and
  • λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
  • Here u represents the position coordinates of the three-dimensional point cloud point in the same matching point pair;
  • P represents the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
  • the model vertex coordinates and their normals after being driven by the body posture parameters also enter the data term (their symbols appear as formula images in the description);
  • τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection;
  • i represents the i-th vertex on the model in the local rigid motion constraint term;
  • and the set of neighboring vertices around the i-th vertex on the model, the driving effect of the known non-rigid motion on the model surface vertices x_i and x_j, and the position transformation effect on x_j of the non-rigid motions acting on x_i and x_j likewise appear there; in the body posture and non-rigid motion consistency constraint term, the model vertex coordinates driven by the non-rigid motion appear; and in the body pose motion regularization term, μ_j and δ_j represent the Gaussian mixture weights and the means and variances of the Gaussian models, respectively.
  • the depth map projection formula projects each pixel into three-dimensional space using the depth camera intrinsic matrix, where
  • u, v are pixel coordinates, and
  • d(u, v) is the depth value at pixel (u, v) of the depth image.
  • model vertices are driven according to the non-rigid motion and the body posture model parameters, where, in the calculation formula,
  • the deformation matrix acting on vertex v_i includes a rotation part and a translation part; the rotation part of this deformation matrix and the set of bones that drive vertex v_i also appear in the formula; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself, and rot(T_bj) is the rotation part of that deformation matrix.
  • According to the three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model proposed in the embodiments of the present invention, the three-dimensional information of the dynamic object surface is fused frame by frame by a real-time non-rigid alignment method, achieving robust tracking and robust real-time dynamic three-dimensional reconstruction of human objects without a three-dimensional template from a first key frame, which effectively improves the real-time performance, robustness and accuracy of reconstruction, and is highly scalable and easy to implement.
  • first and second are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated.
  • features defining “first” or “second” may include at least one of the features, either explicitly or implicitly.
  • the meaning of "a plurality” is at least two, such as two, three, etc., unless specifically defined otherwise.
  • Unless otherwise explicitly specified and defined, the terms "install", "connect", "connected", "fix" and the like shall be understood broadly; for example, a connection may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection or an indirect connection through an intermediate medium; and it may be an internal communication between two elements or an interaction between two elements, unless otherwise explicitly limited.
  • For those of ordinary skill in the art, the specific meanings of the above terms in the present invention can be understood on a case-by-case basis.
  • Unless otherwise explicitly stated and defined, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that the first and second features are in indirect contact through an intermediate medium.
  • Moreover, a first feature being "on", "above" or "over" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the level of the first feature is higher than that of the second feature.
  • A first feature being "under", "below" or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the level of the first feature is lower than that of the second feature.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A three-dimensional reconstruction method and system for jointly optimizing a human body posture and appearance model, the method including the following steps: capturing a depth map of a human body to obtain a single depth image (S101); converting the single depth image into a three-dimensional point cloud, and obtaining matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model (S102); establishing an energy function from the matching point pairs, and jointly solving the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model (S103); solving the energy function, and aligning the reconstructed model with the three-dimensional point cloud according to the solution result (S104); and updating and completing the aligned model with the depth map to perform real-time dynamic three-dimensional reconstruction of the human body (S105). The method effectively improves the real-time performance, robustness and accuracy of reconstruction, is highly scalable, and is simple to implement.

Description

Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority to Chinese Patent Application No. 201810460079.4, entitled "Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model", filed by Tsinghua University on May 15, 2018.
TECHNICAL FIELD
The present invention relates to the technical field of computer vision and computer graphics, and in particular to a three-dimensional reconstruction method and system for jointly optimizing a human body posture and appearance model.
BACKGROUND
Dynamic three-dimensional reconstruction of the human body is a key problem in computer graphics and computer vision. High-quality dynamic three-dimensional models of human-related subjects, such as human bodies, animals, faces and hands, have broad application prospects and important application value in fields such as film and entertainment, sports and gaming, and virtual reality. However, acquiring high-quality three-dimensional models usually relies on expensive laser scanners or multi-camera array systems; although the accuracy is high, such systems have notable shortcomings. First, the subject is required to remain absolutely still during scanning, and even slight movement causes obvious errors in the scanning result. Second, the cost is high, making the technology hard to bring into the daily lives of ordinary people, so it is mostly used by large companies or national statistical agencies. Third, the process is slow: reconstructing a single three-dimensional model often takes at least ten minutes to several hours, and reconstructing a dynamic model sequence is even more costly.
In the related art, dynamic human body reconstruction focuses on three lines of research. The first is reconstructing the dynamic appearance surface of the object; because appearance surfaces are rich and diverse, this generally requires complex capture setups, such as acquisition and reconstruction with a multi-camera array. The second is reconstructing human body shape and pose; by parameterizing shape and pose, the number of variables to be reconstructed is greatly reduced, and existing techniques can reconstruct in real time with a single depth camera, but such methods cannot obtain a three-dimensional model of the object's surface appearance. The third is frame-by-frame dynamic surface fusion, which can achieve template-free dynamic three-dimensional reconstruction, but because it uses only non-rigid surface deformation, its tracking and reconstruction robustness is low.
SUMMARY
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, an object of the present invention is to propose a three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model, which effectively improves the real-time performance, robustness and accuracy of reconstruction, is highly scalable, and is simple to implement.
Another object of the present invention is to propose a three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model.
To achieve the above objects, an embodiment of one aspect of the present invention provides a three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model, including the following steps: capturing a depth map of a human body to obtain a single depth image; converting the single depth image into a three-dimensional point cloud, and obtaining matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model; establishing an energy function from the matching point pairs, and jointly solving the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model; solving the energy function, and aligning the reconstructed model with the three-dimensional point cloud according to the solution result; and updating and completing the aligned model with the depth map to perform real-time dynamic three-dimensional reconstruction of the human body.
According to the three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model of the embodiments of the present invention, the three-dimensional information of the dynamic object surface is fused frame by frame by a real-time non-rigid alignment method, achieving robust tracking and robust real-time dynamic three-dimensional reconstruction of human objects without a three-dimensional template from a first key frame, which effectively improves the real-time performance, robustness and accuracy of reconstruction, is highly scalable, and is simple to implement.
In addition, the three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to the above embodiments of the present invention may further have the following additional technical features:
Further, in an embodiment of the present invention, the energy function is:
E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
where E_mot is the total energy term of the motion solve; E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model; E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion; E_reg is the local rigid motion constraint term, which acts on the non-rigid motion data term; E_prior is the regularization term of body posture motion, used to constrain the plausibility of the solved human pose; and λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
Further, in an embodiment of the present invention, wherein
Figure PCTCN2019086890-appb-000001
Figure PCTCN2019086890-appb-000002
Figure PCTCN2019086890-appb-000003
Figure PCTCN2019086890-appb-000004
where
Figure PCTCN2019086890-appb-000005
Figure PCTCN2019086890-appb-000006
respectively denote the model vertex coordinates and their normals after being driven by the non-rigid motion; u denotes the position coordinates of the three-dimensional point cloud point in the same matching point pair; P denotes the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
Figure PCTCN2019086890-appb-000007
Figure PCTCN2019086890-appb-000008
respectively denote the model vertex coordinates and their normals after being driven by the body posture parameters; τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection; in the local rigid motion constraint term, i denotes the i-th vertex on the model,
Figure PCTCN2019086890-appb-000009
denotes the set of neighboring vertices around the i-th vertex on the model,
Figure PCTCN2019086890-appb-000010
Figure PCTCN2019086890-appb-000011
respectively denote the driving effect of the known non-rigid motion on the model surface vertices x_i and x_j,
Figure PCTCN2019086890-appb-000012
Figure PCTCN2019086890-appb-000013
denote the position transformation effect on x_j of the non-rigid motions acting on x_i and x_j; in the body posture and non-rigid motion consistency constraint term,
Figure PCTCN2019086890-appb-000014
Figure PCTCN2019086890-appb-000015
respectively denote the model vertex coordinates after being driven by the non-rigid motion; in the body pose motion regularization term,
Figure PCTCN2019086890-appb-000016
μ_j and δ_j respectively denote the Gaussian mixture weights and the means and variances of the Gaussian models.
Further, in an embodiment of the present invention, the depth map projection formula is:
Figure PCTCN2019086890-appb-000017
where u, v are pixel coordinates, d(u, v) is the depth value at pixel (u, v) of the depth image, and
Figure PCTCN2019086890-appb-000018
is the depth camera intrinsic matrix.
Further, in an embodiment of the present invention, the model vertices are driven according to the non-rigid motion and the body posture model parameters, where the calculation formulas are:
Figure PCTCN2019086890-appb-000019
Figure PCTCN2019086890-appb-000020
where
Figure PCTCN2019086890-appb-000021
is the deformation matrix acting on vertex v_i, including a rotation part and a translation part;
Figure PCTCN2019086890-appb-000022
is the rotation part of this deformation matrix;
Figure PCTCN2019086890-appb-000023
is the set of bones that drive vertex v_i; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself; and rot(T_bj) is the rotation part of that deformation matrix.
To achieve the above objects, an embodiment of another aspect of the present invention provides a three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model, including: a depth camera, configured to capture a depth map of a human body to obtain a single depth image; a matching module, configured to convert the single depth image into a three-dimensional point cloud and obtain matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model; a motion solving module, configured to establish an energy function from the matching point pairs and jointly solve the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model; a solving module, configured to solve the energy function and align the reconstructed model with the three-dimensional point cloud according to the solution result; and a model updating module, configured to update and complete the aligned model with the depth map to perform real-time dynamic three-dimensional reconstruction of the human body.
According to the three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model of the embodiments of the present invention, the three-dimensional information of the dynamic object surface is fused frame by frame by a real-time non-rigid alignment method, achieving robust tracking and robust real-time dynamic three-dimensional reconstruction of human objects without a three-dimensional template from a first key frame, which effectively improves the real-time performance, robustness and accuracy of reconstruction, is highly scalable, and is simple to implement.
In addition, the three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to the above embodiments of the present invention may further have the following additional technical features:
Further, in an embodiment of the present invention, the energy function is:
E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
where E_mot is the total energy term of the motion solve; E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model; E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion; E_reg is the local rigid motion constraint term, which acts on the non-rigid motion data term; E_prior is the regularization term of body posture motion, used to constrain the plausibility of the solved human pose; and λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
Further, in an embodiment of the present invention, wherein
Figure PCTCN2019086890-appb-000024
Figure PCTCN2019086890-appb-000025
Figure PCTCN2019086890-appb-000026
Figure PCTCN2019086890-appb-000027
where
Figure PCTCN2019086890-appb-000028
Figure PCTCN2019086890-appb-000029
respectively denote the model vertex coordinates and their normals after being driven by the non-rigid motion; u denotes the position coordinates of the three-dimensional point cloud point in the same matching point pair; P denotes the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
Figure PCTCN2019086890-appb-000030
Figure PCTCN2019086890-appb-000031
respectively denote the model vertex coordinates and their normals after being driven by the body posture parameters; τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection; in the local rigid motion constraint term, i denotes the i-th vertex on the model,
Figure PCTCN2019086890-appb-000032
denotes the set of neighboring vertices around the i-th vertex on the model,
Figure PCTCN2019086890-appb-000033
Figure PCTCN2019086890-appb-000034
respectively denote the driving effect of the known non-rigid motion on the model surface vertices x_i and x_j,
Figure PCTCN2019086890-appb-000035
Figure PCTCN2019086890-appb-000036
denote the position transformation effect on x_j of the non-rigid motions acting on x_i and x_j; in the body posture and non-rigid motion consistency constraint term,
Figure PCTCN2019086890-appb-000037
Figure PCTCN2019086890-appb-000038
respectively denote the model vertex coordinates after being driven by the non-rigid motion; in the body pose motion regularization term,
Figure PCTCN2019086890-appb-000039
μ_j and δ_j respectively denote the Gaussian mixture weights and the means and variances of the Gaussian models.
Further, in an embodiment of the present invention, the depth map projection formula is:
Figure PCTCN2019086890-appb-000040
where u, v are pixel coordinates, d(u, v) is the depth value at pixel (u, v) of the depth image, and
Figure PCTCN2019086890-appb-000041
is the depth camera intrinsic matrix.
Further, in an embodiment of the present invention, the model vertices are driven according to the non-rigid motion and the body posture model parameters, where the calculation formulas are:
Figure PCTCN2019086890-appb-000042
Figure PCTCN2019086890-appb-000043
where
Figure PCTCN2019086890-appb-000044
is the deformation matrix acting on vertex v_i, including a rotation part and a translation part;
Figure PCTCN2019086890-appb-000045
is the rotation part of this deformation matrix;
Figure PCTCN2019086890-appb-000046
is the set of bones that drive vertex v_i; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself; and rot(T_bj) is the rotation part of that deformation matrix.
Additional aspects and advantages of the present invention will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of a three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to an embodiment of the present invention.
DETAILED DESCRIPTION
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended to explain the present invention, and should not be construed as limiting the present invention.
A three-dimensional reconstruction method and system for jointly optimizing a human body posture and appearance model proposed according to embodiments of the present invention are described below with reference to the accompanying drawings; the three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model is described first.
FIG. 1 is a flow chart of a three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to an embodiment of the present invention.
As shown in FIG. 1, the three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model includes the following steps:
In step S101, a depth map of the human body is captured to obtain a single depth image.
For example, an embodiment of the present invention may use a single depth camera to capture the target person to obtain a dynamic depth image sequence at video frame rate. That is, the moving human body is captured with a depth camera to obtain a continuous sequence of single depth images.
In step S102, the single depth image is converted into a three-dimensional point cloud, and matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model are obtained.
It can be understood that the depth camera intrinsic matrix is obtained, and the depth map is projected into three-dimensional space according to the intrinsic matrix to generate a set of three-dimensional points.
Further, in an embodiment of the present invention, the depth map projection formula is:
Figure PCTCN2019086890-appb-000047
where u, v are pixel coordinates, d(u, v) is the depth value at pixel (u, v) of the depth image, and
Figure PCTCN2019086890-appb-000048
is the depth camera intrinsic matrix.
Specifically, the intrinsic matrix of the depth camera is obtained, and the depth map is projected into three-dimensional space according to the intrinsic matrix and converted into a set of three-dimensional points. The conversion formula is:
Figure PCTCN2019086890-appb-000049
where u, v are pixel coordinates, d(u, v) is the depth value at pixel (u, v) of the depth image, and
Figure PCTCN2019086890-appb-000050
is the depth camera intrinsic matrix. As for obtaining the matching point pairs, the vertices of the three-dimensional model are projected onto the depth image using the camera projection formula to obtain the matching point pairs.
In step S103, an energy function is established from the matching point pairs, and the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model are jointly solved.
It can be understood that the non-rigid motion and the parametric body posture information of the human body are obtained by constructing and solving the energy function from the matching point pairs.
Further, in an embodiment of the present invention, the energy function is:
E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
where E_mot is the total energy term of the motion solve; E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model; E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion; E_reg is the local rigid motion constraint term, which acts on the non-rigid motion data term; E_prior is the regularization term of body posture motion, used to constrain the plausibility of the solved human pose; and λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
Further, in an embodiment of the present invention, wherein
Figure PCTCN2019086890-appb-000051
Figure PCTCN2019086890-appb-000052
Figure PCTCN2019086890-appb-000053
Figure PCTCN2019086890-appb-000054
where
Figure PCTCN2019086890-appb-000055
Figure PCTCN2019086890-appb-000056
respectively denote the model vertex coordinates and their normals after being driven by the non-rigid motion; u denotes the position coordinates of the three-dimensional point cloud point in the same matching point pair; P denotes the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
Figure PCTCN2019086890-appb-000057
Figure PCTCN2019086890-appb-000058
respectively denote the model vertex coordinates and their normals after being driven by the body posture parameters; τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection; in the local rigid motion constraint term, i denotes the i-th vertex on the model,
Figure PCTCN2019086890-appb-000059
denotes the set of neighboring vertices around the i-th vertex on the model,
Figure PCTCN2019086890-appb-000060
Figure PCTCN2019086890-appb-000061
respectively denote the driving effect of the known non-rigid motion on the model surface vertices x_i and x_j,
Figure PCTCN2019086890-appb-000062
Figure PCTCN2019086890-appb-000063
denote the position transformation effect on x_j of the non-rigid motions acting on x_i and x_j; in the body posture and non-rigid motion consistency constraint term,
Figure PCTCN2019086890-appb-000064
Figure PCTCN2019086890-appb-000065
respectively denote the model vertex coordinates after being driven by the non-rigid motion; in the body pose motion regularization term,
Figure PCTCN2019086890-appb-000066
μ_j and δ_j respectively denote the Gaussian mixture weights and the means and variances of the Gaussian models.
Specifically,
Figure PCTCN2019086890-appb-000067
Figure PCTCN2019086890-appb-000068
respectively denote the model vertex coordinates and their normals after being driven by the non-rigid motion; u denotes the position coordinates of the three-dimensional point cloud point in the same matching point pair; and P denotes the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation. The data term E_data ensures that the reconstructed model driven by the non-rigid motion and the parametric human body posture model are aligned as well as possible with the three-dimensional point cloud obtained from the depth map;
Figure PCTCN2019086890-appb-000069
Figure PCTCN2019086890-appb-000070
respectively denote the model vertex coordinates and their normals after being driven by the body posture parameters. The three indicator functions τ_1, τ_2 and τ_3 determine the mode of correspondence selection: τ_1 = 1 when the correspondence comes from the reconstructed model, τ_2 = 1 when the correspondence comes from the parametric human body model, and τ_3 = 1 when the correspondence comes from the reconstructed model and that vertex is driven by the parametric model.
Here, the data term E_data ensures that the reconstructed model driven by the non-rigid motion and the body posture model driven by the human pose are aligned as well as possible with the three-dimensional point cloud obtained from the depth map; the local rigid motion constraint term E_reg makes the model as a whole move under local rigidity constraints while still allowing large yet plausible non-rigid motion to be solved well, so that the model aligns with the three-dimensional point cloud more accurately; the body posture model and non-rigid motion consistency constraint term E_bind ensures that the solved body posture model and the non-rigid motion are as consistent as possible, so that the finally solved non-rigid motion both conforms to the human skeleton kinematic model and is fully aligned with the three-dimensional point cloud obtained from the depth map; and the body pose motion regularization term E_prior uses a Gaussian mixture model to constrain the correctness of the human pose, so that an abnormal solved pose makes this energy term large until the pose is solved correctly.
Further, (1) in the local rigid motion constraint term E_reg, i denotes the i-th vertex on the model,
Figure PCTCN2019086890-appb-000071
denotes the set of neighboring vertices around the i-th vertex on the model,
Figure PCTCN2019086890-appb-000072
Figure PCTCN2019086890-appb-000073
respectively denote the driving effect of the known non-rigid motion on the model surface vertices x_i and x_j,
Figure PCTCN2019086890-appb-000074
Figure PCTCN2019086890-appb-000075
denote the position transformation effect on x_j of the non-rigid motions acting on x_i and x_j; that is, the non-rigid driving effects of neighboring vertices on the model are required to be as consistent as possible. ρ(|W_i − W_j|²) is a robust penalty function, and W_i and W_j respectively denote the driving effect of the parametric human body posture model on the model surface vertices x_i and x_j. When the driving effects of the parametric body posture model on two neighboring surface vertices differ greatly, the value of the robust penalty function is small; when the driving effects on the two neighboring vertices differ little, the value of the robust penalty function is large. Through this robust penalty function, the model as a whole is constrained by local rigidity while large yet plausible non-rigid motion can still be solved well, so that the model aligns with the three-dimensional point cloud more accurately;
(2) in the body posture and non-rigid motion consistency constraint term E_bind,
Figure PCTCN2019086890-appb-000076
Figure PCTCN2019086890-appb-000077
respectively denote the model vertex coordinates after being driven by the non-rigid motion, where
Figure PCTCN2019086890-appb-000078
This constraint term ensures that the solved parametric body posture model and the surface non-rigid motion are as consistent as possible, so that the finally solved non-rigid motion both conforms to the human skeleton kinematic model and is fully aligned with the three-dimensional point cloud obtained from the depth map.
(3) The body pose motion regularization term E_prior uses a Gaussian mixture model to constrain the correctness of the body posture; an abnormal solved pose makes this energy term large until the pose is solved correctly, where
Figure PCTCN2019086890-appb-000079
μ_j and δ_j respectively denote the Gaussian mixture weights and the means and variances of the Gaussian models.
In step S104, the energy function is solved, and the reconstructed model is aligned with the three-dimensional point cloud according to the solution result.
Further, in an embodiment of the present invention, the model vertices are driven according to the non-rigid motion and the body posture model parameters, where the calculation formulas are:
Figure PCTCN2019086890-appb-000080
Figure PCTCN2019086890-appb-000081
where
Figure PCTCN2019086890-appb-000082
is the deformation matrix acting on vertex v_i, including a rotation part and a translation part;
Figure PCTCN2019086890-appb-000083
is the rotation part of this deformation matrix;
Figure PCTCN2019086890-appb-000084
is the set of bones that drive vertex v_i; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself; and rot(T_bj) is the rotation part of that deformation matrix.
Specifically, the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the body posture parameters are solved jointly. The information obtained from the final solve is the transformation matrix of every three-dimensional model vertex and the body posture parameters, that is, an individual transformation matrix for each bone. To meet the requirement of a fast linear solve, the method of the embodiment of the present invention uses the exponential map to approximate the deformation equation as follows:
Figure PCTCN2019086890-appb-000085
where
Figure PCTCN2019086890-appb-000086
is the cumulative transformation matrix of the model vertex v_i up to the previous frame and is a known quantity; I is the 4×4 identity matrix;
where
Figure PCTCN2019086890-appb-000087
Figure PCTCN2019086890-appb-000088
that is, the model vertex after the transformation of the previous frame; after the transformation we then have:
Figure PCTCN2019086890-appb-000089
For each vertex, the unknown parameters to be solved are therefore the six-dimensional transformation parameters x = (v_1, v_2, v_3, w_x, w_y, w_z)^T. The bone motion is linearized in the same way as the non-rigid motion.
In step S105, the aligned model is updated and completed using the depth map to perform real-time dynamic three-dimensional reconstruction of the human body.
It can be understood that the depth map is used to update and complete the aligned model, and to further optimize the body shape parameters of the parametric human body template.
Specifically, the depth image is used to update and complete the aligned three-dimensional model: the newly obtained depth information is fused into the three-dimensional model, and surface vertex positions of the three-dimensional model are updated or new vertices are added, so that the model better matches the expression of the current depth image. Since the updated model incorporates the new information, the new model is used to solve the parametric human body shape more accurately.
In summary, with a single depth camera, the embodiments of the present invention can simultaneously reconstruct the dynamic human appearance surface model (for example, clothing, hats, backpacks, etc.) and the dynamic inner body posture model of the human body. It is a real-time reconstruction method that only requires the input of a single depth camera; the system uses simple equipment and has advantages such as easy deployment and extensibility; the required input information is very easy to collect; and the dynamic three-dimensional model can be obtained in real time. The method is accurate and robust, simple to implement, runs in real time, has broad application prospects, and can be quickly implemented on hardware systems such as PCs or workstations.
According to the three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model proposed in the embodiments of the present invention, the three-dimensional information of the dynamic object surface is fused frame by frame by a real-time non-rigid alignment method, achieving robust tracking and robust real-time dynamic three-dimensional reconstruction of human objects without a three-dimensional template from a first key frame, which effectively improves the real-time performance, robustness and accuracy of reconstruction, is highly scalable, and is simple to implement.
Next, a three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model proposed according to an embodiment of the present invention is described with reference to the accompanying drawings.
FIG. 2 is a schematic structural diagram of a three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to an embodiment of the present invention.
As shown in FIG. 2, the three-dimensional reconstruction system 10 for jointly optimizing a human body posture and appearance model includes: a depth camera 100, a matching module 200, a motion solving module 300, a solving module 400 and a model updating module 500.
The depth camera 100 is configured to capture a depth map of the human body to obtain a single depth image. The matching module 200 is configured to convert the single depth image into a three-dimensional point cloud and obtain matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model. The motion solving module 300 is configured to establish an energy function from the matching point pairs and jointly solve the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model. The solving module 400 is configured to solve the energy function and align the reconstructed model with the three-dimensional point cloud according to the solution result. The model updating module 500 is configured to update and complete the aligned model with the depth map to perform real-time dynamic three-dimensional reconstruction of the human body. The system 10 of the embodiment of the present invention can effectively improve the real-time performance, robustness and accuracy of reconstruction, is highly scalable, and is simple to implement.
Further, in an embodiment of the present invention, the energy function is:
E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
where E_mot is the total energy term of the motion solve; E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model; E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion; E_reg is the local rigid motion constraint term, which acts on the non-rigid motion data term; E_prior is the regularization term of body posture motion, used to constrain the plausibility of the solved human pose; and λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
Further, in an embodiment of the present invention, wherein
Figure PCTCN2019086890-appb-000090
Figure PCTCN2019086890-appb-000091
Figure PCTCN2019086890-appb-000092
Figure PCTCN2019086890-appb-000093
where
Figure PCTCN2019086890-appb-000094
Figure PCTCN2019086890-appb-000095
respectively denote the model vertex coordinates and their normals after being driven by the non-rigid motion; u denotes the position coordinates of the three-dimensional point cloud point in the same matching point pair; P denotes the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
Figure PCTCN2019086890-appb-000096
Figure PCTCN2019086890-appb-000097
respectively denote the model vertex coordinates and their normals after being driven by the body posture parameters; τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection; in the local rigid motion constraint term, i denotes the i-th vertex on the model,
Figure PCTCN2019086890-appb-000098
denotes the set of neighboring vertices around the i-th vertex on the model,
Figure PCTCN2019086890-appb-000099
Figure PCTCN2019086890-appb-000100
respectively denote the driving effect of the known non-rigid motion on the model surface vertices x_i and x_j,
Figure PCTCN2019086890-appb-000101
Figure PCTCN2019086890-appb-000102
denote the position transformation effect on x_j of the non-rigid motions acting on x_i and x_j; in the body posture and non-rigid motion consistency constraint term,
Figure PCTCN2019086890-appb-000103
Figure PCTCN2019086890-appb-000104
respectively denote the model vertex coordinates after being driven by the non-rigid motion; in the body pose motion regularization term,
Figure PCTCN2019086890-appb-000105
μ_j and δ_j respectively denote the Gaussian mixture weights and the means and variances of the Gaussian models.
Further, in an embodiment of the present invention, the depth map projection formula is:
Figure PCTCN2019086890-appb-000106
where u, v are pixel coordinates, d(u, v) is the depth value at pixel (u, v) of the depth image, and
Figure PCTCN2019086890-appb-000107
is the depth camera intrinsic matrix.
Further, in an embodiment of the present invention, the model vertices are driven according to the non-rigid motion and the body posture model parameters, where the calculation formulas are:
Figure PCTCN2019086890-appb-000108
Figure PCTCN2019086890-appb-000109
where
Figure PCTCN2019086890-appb-000110
is the deformation matrix acting on vertex v_i, including a rotation part and a translation part;
Figure PCTCN2019086890-appb-000111
is the rotation part of this deformation matrix;
Figure PCTCN2019086890-appb-000112
is the set of bones that drive vertex v_i; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself; and rot(T_bj) is the rotation part of that deformation matrix.
It should be noted that the foregoing explanation of the embodiments of the three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model also applies to the three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model of this embodiment, and details are not repeated here.
According to the three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model proposed in the embodiments of the present invention, the three-dimensional information of the dynamic object surface is fused frame by frame by a real-time non-rigid alignment method, achieving robust tracking and robust real-time dynamic three-dimensional reconstruction of human objects without a three-dimensional template from a first key frame, which effectively improves the real-time performance, robustness and accuracy of reconstruction, is highly scalable, and is simple to implement.
In the description of the present invention, it should be understood that the orientation or positional relationships indicated by the terms "center", "longitudinal", "transverse", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", "circumferential" and the like are based on the orientation or positional relationships shown in the drawings, are only for the convenience of describing the present invention and simplifying the description, and do not indicate or imply that the system or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they therefore should not be construed as limiting the present invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, for example two or three, unless specifically defined otherwise.
In the present invention, unless otherwise explicitly specified and defined, the terms "install", "connect", "connected", "fix" and the like should be understood broadly; for example, a connection may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection or an indirect connection through an intermediate medium; and it may be an internal communication between two elements or an interaction relationship between two elements, unless otherwise explicitly defined. For those of ordinary skill in the art, the specific meanings of the above terms in the present invention can be understood according to the specific circumstances.
In the present invention, unless otherwise explicitly specified and defined, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that the first and second features are in indirect contact through an intermediate medium. Moreover, a first feature being "on", "above" or "over" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the level of the first feature is higher than that of the second feature. A first feature being "under", "below" or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the level of the first feature is lower than that of the second feature.
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", "some examples" and the like mean that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine different embodiments or examples described in this specification, and features of different embodiments or examples, provided they do not contradict each other.
Although the embodiments of the present invention have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and those of ordinary skill in the art may make changes, modifications, replacements and variations to the above embodiments within the scope of the present invention.

Claims (10)

  1. A three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model, characterized by comprising the following steps:
    capturing a depth map of a human body to obtain a single depth image;
    converting the single depth image into a three-dimensional point cloud, and obtaining matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model;
    establishing an energy function from the matching point pairs, and jointly solving the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model;
    solving the energy function, and aligning the reconstructed model with the three-dimensional point cloud according to the solution result; and
    updating and completing the aligned model with the depth map to perform real-time dynamic three-dimensional reconstruction of the human body.
  2. The three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to claim 1, characterized in that the energy function is:
    E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
    where E_mot is the total energy term of the motion solve; E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model; E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion; E_reg is the local rigid motion constraint term, which acts on the non-rigid motion data term; E_prior is the regularization term of body posture motion, used to constrain the plausibility of the solved human pose; and λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
  3. The three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to claim 2, characterized in that,
    Figure PCTCN2019086890-appb-100001
    Figure PCTCN2019086890-appb-100002
    Figure PCTCN2019086890-appb-100003
    Figure PCTCN2019086890-appb-100004
    where
    Figure PCTCN2019086890-appb-100005
    Figure PCTCN2019086890-appb-100006
    respectively denote the model vertex coordinates and their normals after being driven by the non-rigid motion; u denotes the position coordinates of the three-dimensional point cloud point in the same matching point pair; P denotes the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
    Figure PCTCN2019086890-appb-100007
    Figure PCTCN2019086890-appb-100008
    respectively denote the model vertex coordinates and their normals after being driven by the body posture parameters; τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection; in the local rigid motion constraint term, i denotes the i-th vertex on the model,
    Figure PCTCN2019086890-appb-100009
    denotes the set of neighboring vertices around the i-th vertex on the model,
    Figure PCTCN2019086890-appb-100010
    Figure PCTCN2019086890-appb-100011
    respectively denote the driving effect of the known non-rigid motion on the model surface vertices x_i and x_j,
    Figure PCTCN2019086890-appb-100012
    Figure PCTCN2019086890-appb-100013
    denote the position transformation effect on x_j of the non-rigid motions acting on x_i and x_j; in the body posture and non-rigid motion consistency constraint term,
    Figure PCTCN2019086890-appb-100014
    Figure PCTCN2019086890-appb-100015
    respectively denote the model vertex coordinates after being driven by the non-rigid motion; in the body pose motion regularization term,
    Figure PCTCN2019086890-appb-100016
    μ_j and δ_j respectively denote the Gaussian mixture weights and the means and variances of the Gaussian models.
  4. The three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to claim 1, characterized in that the depth map projection formula is:
    Figure PCTCN2019086890-appb-100017
    where u, v are pixel coordinates, d(u, v) is the depth value at pixel (u, v) of the depth image, and
    Figure PCTCN2019086890-appb-100018
    is the depth camera intrinsic matrix.
  5. The three-dimensional reconstruction method for jointly optimizing a human body posture and appearance model according to any one of claims 1 to 4, characterized in that the model vertices are driven according to the non-rigid motion and the body posture model parameters, where the calculation formulas are:
    Figure PCTCN2019086890-appb-100019
    Figure PCTCN2019086890-appb-100020
    where
    Figure PCTCN2019086890-appb-100021
    is the deformation matrix acting on vertex v_i, including a rotation part and a translation part;
    Figure PCTCN2019086890-appb-100022
    is the rotation part of this deformation matrix;
    Figure PCTCN2019086890-appb-100023
    is the set of bones that drive vertex v_i; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself; and rot(T_bj) is the rotation part of that deformation matrix.
  6. A three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model, characterized by comprising:
    a depth camera, configured to capture a depth map of a human body to obtain a single depth image;
    a matching module, configured to convert the single depth image into a three-dimensional point cloud and obtain matching point pairs between the three-dimensional point cloud and the vertices of the reconstructed model and of the parametric human body model;
    a motion solving module, configured to establish an energy function from the matching point pairs and jointly solve the non-rigid motion position transformation parameters of every vertex of the reconstructed model and the parameters of the parametric human body posture model;
    a solving module, configured to solve the energy function and align the reconstructed model with the three-dimensional point cloud according to the solution result; and
    a model updating module, configured to update and complete the aligned model with the depth map to perform real-time dynamic three-dimensional reconstruction of the human body.
  7. The three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to claim 6, characterized in that the energy function is:
    E_mot = λ_data E_data + λ_bind E_bind + λ_reg E_reg + λ_prior E_prior,
    where E_mot is the total energy term of the motion solve; E_data is the data term, which contains the data term of non-rigid motion tracking and the data term of the parametric human body posture model; E_bind is the consistency constraint term between the parametric body posture and the non-rigid motion; E_reg is the local rigid motion constraint term, which acts on the non-rigid motion data term; E_prior is the regularization term of body posture motion, used to constrain the plausibility of the solved human pose; and λ_data, λ_reg, λ_bind and λ_prior are the weight coefficients of the corresponding constraint terms.
  8. The three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to claim 7, characterized in that,
    Figure PCTCN2019086890-appb-100024
    Figure PCTCN2019086890-appb-100025
    Figure PCTCN2019086890-appb-100026
    Figure PCTCN2019086890-appb-100027
    where
    Figure PCTCN2019086890-appb-100028
    Figure PCTCN2019086890-appb-100029
    respectively denote the model vertex coordinates and their normals after being driven by the non-rigid motion; u denotes the position coordinates of the three-dimensional point cloud point in the same matching point pair; P denotes the set of correspondences from the reconstructed model and the parametric human body posture model to the three-dimensional point cloud observation;
    Figure PCTCN2019086890-appb-100030
    Figure PCTCN2019086890-appb-100031
    respectively denote the model vertex coordinates and their normals after being driven by the body posture parameters; τ_1, τ_2 and τ_3 are all indicator functions that determine the mode of correspondence selection; in the local rigid motion constraint term, i denotes the i-th vertex on the model,
    Figure PCTCN2019086890-appb-100032
    denotes the set of neighboring vertices around the i-th vertex on the model,
    Figure PCTCN2019086890-appb-100033
    Figure PCTCN2019086890-appb-100034
    respectively denote the driving effect of the known non-rigid motion on the model surface vertices x_i and x_j,
    Figure PCTCN2019086890-appb-100035
    Figure PCTCN2019086890-appb-100036
    denote the position transformation effect on x_j of the non-rigid motions acting on x_i and x_j; in the body posture and non-rigid motion consistency constraint term,
    Figure PCTCN2019086890-appb-100037
    Figure PCTCN2019086890-appb-100038
    respectively denote the model vertex coordinates after being driven by the non-rigid motion; in the body pose motion regularization term,
    Figure PCTCN2019086890-appb-100039
    μ_j and δ_j respectively denote the Gaussian mixture weights and the means and variances of the Gaussian models.
  9. The three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to claim 6, characterized in that the depth map projection formula is:
    Figure PCTCN2019086890-appb-100040
    where u, v are pixel coordinates, d(u, v) is the depth value at pixel (u, v) of the depth image, and
    Figure PCTCN2019086890-appb-100041
    is the depth camera intrinsic matrix.
  10. The three-dimensional reconstruction system for jointly optimizing a human body posture and appearance model according to any one of claims 6 to 9, characterized in that the model vertices are driven according to the non-rigid motion and the body posture model parameters, where the calculation formulas are:
    Figure PCTCN2019086890-appb-100042
    Figure PCTCN2019086890-appb-100043
    where
    Figure PCTCN2019086890-appb-100044
    is the deformation matrix acting on vertex v_i, including a rotation part and a translation part;
    Figure PCTCN2019086890-appb-100045
    is the rotation part of this deformation matrix;
    Figure PCTCN2019086890-appb-100046
    is the set of bones that drive vertex v_i; α_i,j is the weight of the driving effect of the j-th bone on the i-th model vertex, indicating how strongly that bone drives the vertex; T_bj is the motion deformation matrix of the j-th bone itself; and rot(T_bj) is the rotation part of that deformation matrix.
PCT/CN2019/086890 2018-05-15 2019-05-14 Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model WO2019219013A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810460079.4 2018-05-15
CN201810460079.4A CN108665537B (zh) 2018-05-15 2018-05-15 Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model

Publications (1)

Publication Number Publication Date
WO2019219013A1 true WO2019219013A1 (zh) 2019-11-21

Family

ID=63779452

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/086890 WO2019219013A1 (zh) 2018-05-15 2019-05-14 Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model

Country Status (2)

Country Link
CN (1) CN108665537B (zh)
WO (1) WO2019219013A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837406A (zh) * 2021-01-11 2021-05-25 聚好看科技股份有限公司 Three-dimensional reconstruction method, device and system

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108665537B (zh) * 2018-05-15 2020-09-25 Tsinghua University Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model
CN109523635B (zh) * 2018-11-01 2023-07-21 深圳蒜泥科技投资管理合伙企业(有限合伙) Non-rigid reconstruction and measurement method and device for three-dimensional human body scanning
CN109376791B (zh) * 2018-11-05 2020-11-24 北京旷视科技有限公司 Depth algorithm accuracy calculation method and apparatus, electronic device, and readable storage medium
CN110175897A (zh) * 2019-06-03 2019-08-27 广东元一科技实业有限公司 3D synthetic garment fitting method and system
CN110619681B (zh) * 2019-07-05 2022-04-05 杭州同绘科技有限公司 Human body geometric reconstruction method based on Euler field deformation constraints
CN110415336B (zh) * 2019-07-12 2021-12-14 Tsinghua University High-precision human body posture reconstruction method and system
CN110599535A (zh) * 2019-08-05 2019-12-20 Tsinghua University Hash-table-based high-resolution real-time dynamic human body reconstruction method and device
CN111462302B (zh) * 2020-03-05 2022-06-03 Tsinghua University Multi-view dynamic human body three-dimensional reconstruction method and system based on a depth encoding network
CN111612887B (zh) * 2020-04-30 2021-11-09 北京的卢深视科技有限公司 Human body measurement method and device
CN111627101B (zh) * 2020-05-22 2023-05-26 Beijing University of Technology Three-dimensional human body reconstruction method based on graph convolution
CN111932670B (zh) * 2020-08-13 2021-09-28 北京未澜科技有限公司 Three-dimensional human self-portrait reconstruction method and system based on a single RGBD camera
CN112446919A (zh) * 2020-12-01 2021-03-05 平安科技(深圳)有限公司 Object pose estimation method and apparatus, electronic device and computer storage medium
CN113689539B (zh) * 2021-07-06 2024-04-19 Tsinghua University Real-time three-dimensional reconstruction method for dynamic scenes based on an implicit optical flow field

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102842148A (zh) * 2012-07-10 2012-12-26 Tsinghua University Markerless motion capture and scene reconstruction method and device
CN103198523A (zh) * 2013-04-26 2013-07-10 Tsinghua University Non-rigid three-dimensional reconstruction method and system based on multiple depth maps
US20170330375A1 (en) * 2015-02-04 2017-11-16 Huawei Technologies Co., Ltd. Data Processing Method and Apparatus
CN107833270A (zh) * 2017-09-28 2018-03-23 Zhejiang University Real-time object three-dimensional reconstruction method based on a depth camera
CN108665537A (zh) * 2018-05-15 2018-10-16 Tsinghua University Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105046743A (zh) * 2015-07-01 2015-11-11 Zhejiang University Ultra-high-resolution three-dimensional reconstruction method based on a global variational technique
CN106683181A (zh) * 2017-01-06 2017-05-17 Xiamen University Method for reconstructing a dense surface motion field of a three-dimensional human body
CN107845134B (zh) * 2017-11-10 2020-12-29 Zhejiang University Three-dimensional reconstruction method for a single object based on a color depth camera

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102842148A (zh) * 2012-07-10 2012-12-26 Tsinghua University Markerless motion capture and scene reconstruction method and device
CN103198523A (zh) * 2013-04-26 2013-07-10 Tsinghua University Non-rigid three-dimensional reconstruction method and system based on multiple depth maps
US20170330375A1 (en) * 2015-02-04 2017-11-16 Huawei Technologies Co., Ltd. Data Processing Method and Apparatus
CN107833270A (zh) * 2017-09-28 2018-03-23 Zhejiang University Real-time object three-dimensional reconstruction method based on a depth camera
CN108665537A (zh) * 2018-05-15 2018-10-16 Tsinghua University Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837406A (zh) * 2021-01-11 2021-05-25 聚好看科技股份有限公司 Three-dimensional reconstruction method, device and system
CN112837406B (zh) * 2021-01-11 2023-03-14 聚好看科技股份有限公司 Three-dimensional reconstruction method, device and system

Also Published As

Publication number Publication date
CN108665537B (zh) 2020-09-25
CN108665537A (zh) 2018-10-16

Similar Documents

Publication Publication Date Title
WO2019219013A1 (zh) Three-dimensional reconstruction method and system for jointly optimizing human body posture and appearance model
WO2019219012A1 (zh) Three-dimensional reconstruction method and device combining rigid motion and non-rigid deformation
CN108154550B (zh) Real-time three-dimensional face reconstruction method based on an RGBD camera
CN108629831B (zh) Three-dimensional human body reconstruction method and system based on a parametric human body template and inertial measurement
WO2019219014A1 (zh) Three-dimensional geometry and intrinsic component reconstruction method and device based on light-and-shadow optimization
Ye et al. Accurate 3d pose estimation from a single depth image
US9942535B2 (en) Method for 3D scene structure modeling and camera registration from single image
CN111414798A (zh) Head pose detection method and system based on RGB-D images
Zhang et al. A UAV-based panoramic oblique photogrammetry (POP) approach using spherical projection
CN109166149A (zh) Positioning and three-dimensional wireframe structure reconstruction method and system fusing a binocular camera and an IMU
US20170330375A1 (en) Data Processing Method and Apparatus
CN110189399B (zh) Method and system for indoor three-dimensional layout reconstruction
CN105225269A (zh) Three-dimensional object modeling system based on a motion mechanism
CN108053437B (zh) Posture-based three-dimensional model acquisition method and device
CN109345581B (zh) Augmented reality method, device and system based on multiple cameras
CN113077519B (zh) Automatic multi-camera extrinsic parameter calibration method based on human skeleton extraction
CN111489392B (zh) Method and system for capturing the motion pose of a single target human body in a multi-person environment
CN114863061A (zh) Three-dimensional reconstruction method and system for remote-monitoring medical image processing
CN108898550A (zh) Image stitching method based on spatial triangular patch fitting
CN110490973B (zh) Model-driven multi-view three-dimensional reconstruction method for shoe models
Zhu et al. Mvp-human dataset for 3d human avatar reconstruction from unconstrained frames
CN111292411A (zh) Real-time dynamic three-dimensional human body reconstruction method based on inward-looking multiple RGBD cameras
CN116797733A (zh) Real-time dynamic reconstruction method for three-dimensional objects
CN114399547B (zh) Multi-frame-based robust initialization method for monocular SLAM
WO2021042961A1 (zh) Method and device for automatically generating a customized facial blendshape expression model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19803262

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19803262

Country of ref document: EP

Kind code of ref document: A1