CN113256789B - Three-dimensional real-time human body posture reconstruction method - Google Patents

Info

Publication number: CN113256789B
Authority: CN (China)
Application number: CN202110521606.XA
Other versions: CN113256789A
Other languages: Chinese (zh)
Prior art keywords: human body, dimensional, geometric model, depth, body posture
Inventor: 苏乐
Current Assignee: Civil Aviation University of China
Original Assignee: Civil Aviation University of China
Legal status: Active
Priority: CN202110521606.XA

Abstract

The invention relates to the field of computer technology, and in particular to a three-dimensional real-time human body posture reconstruction method that reconstructs a fine, individualized human body geometric model accurately, online, in real time and at low cost, and that uses this geometric model to capture three-dimensional human motion posture sequences of different people accurately, online and in real time. The method comprises the following steps: S1, 4 depth cameras capture depth images, where a pixel of a depth image is denoted x, and the depth value and the three-dimensional point corresponding to that pixel are denoted d(x) and p respectively; S2, the 4 depth cameras are connected to a computer through a PCI data interface, and the computer drives the 4 depth cameras synchronously to acquire data; S3, the 4 depth cameras are placed at the 4 corners of a square human motion capture scene, and every 2 adjacent depth cameras are calibrated with a checkerboard-based stereo vision calibration method; S4, the raw three-dimensional points of the depth images captured by the 4 depth cameras are preprocessed and denoised.

Description

Three-dimensional real-time human body posture reconstruction method
Technical Field
The invention relates to the technical field of computers, in particular to a three-dimensional real-time human body posture reconstruction method.
Background
The capture and tracking of human motion is a hot topic in computer vision and graphics. It mainly studies how to reconstruct accurate human body geometric models and human motion sequences quickly from an input depth data stream. Existing human body posture reconstruction methods fall broadly into two categories: model-based methods and non-model-based methods.
Non-model-based methods generally recognize the human body posture in an image by feature point detection, without using prior information about the human body. Their drawback is that they ignore the influence of the previous moment on the posture at the current moment, that is, the fact that human motion is by nature a process that varies continuously in space and time.
Model-based methods (also called data-driven methods) require a pre-scanned three-dimensional model and a pre-built motion posture prior. Three-dimensional scanners are costly, the scanned data take a long time to process, and these methods suffer from error accumulation and cannot track long-term motion. Data-driven methods obtain accurate posture reconstruction results from captured motion data. Typical examples include the following. One existing method estimates the body posture in each image by detecting feature points in the depth image; because the models in its database are standard three-dimensional body models, it cannot always produce reasonable results when the participant's body shape differs greatly from the standard model. Another existing method retrieves the best-matching three-dimensional human pose from a database based on a three-dimensional point cloud captured by multiple depth cameras. Another uses a physics-based motion reconstruction algorithm that combines depth data from 3 depth cameras, foot pressure data from wearable pressure sensors and detailed whole-body geometry to reconstruct the complete whole-body motion offline. Another reconstructs human motion by combining Bayesian-estimation-based detection of semantic limb features with inverse kinematics optimization under constraints such as joint limit avoidance; because it assumes that the person's head in the image is always above the waist, it cannot handle postures that violate this condition. Yet another solves the point cloud matching problem with a maximum a posteriori probability algorithm and captures whole-body motion data quickly and automatically with a monocular depth camera.
Data-driven methods can reconstruct the human body geometric model and the motion posture simultaneously, but they achieve accurate three-dimensional posture estimation under large motion changes only if exact point correspondences between the non-rigid model and the target three-dimensional point cloud are determined manually in advance, so they cannot meet practical requirements.
Disclosure of Invention
To solve the above technical problems, the invention provides a three-dimensional real-time human body posture reconstruction method that reconstructs a fine, individualized human body geometric model accurately, online, in real time and at low cost, and that, on the basis of this geometric model, accurately captures three-dimensional human motion posture sequences of different people online and in real time.
The three-dimensional real-time human body posture reconstruction method of the invention comprises the following steps:
S1, capturing depth images with 4 depth cameras, where a pixel of a depth image is denoted x, and the depth value and the three-dimensional point corresponding to that pixel are denoted d(x) and p respectively;
S2, connecting the 4 depth cameras to a computer through a PCI data interface, the computer driving the 4 depth cameras synchronously to acquire data;
S3, placing the 4 depth cameras at the 4 corners of a square human motion capture scene, where every 2 adjacent depth cameras are calibrated with a checkerboard-based stereo vision calibration method;
S4, preprocessing and denoising the raw three-dimensional points of the depth images captured by the 4 depth cameras;
S5, selecting motion sequences from an open-source motion capture database to build a three-dimensional human body posture database Q, skeleton normalization being performed with a motion retargeting technique while the database Q is built;
S6, capturing an A-pose image of a real human body with the 4 depth cameras, deforming the human body geometric model in the pose dimension with a skeleton-driven method, and then formulating the automatic estimation of the fine individualized human body geometric model as a nonlinear optimization problem that is solved by iterative optimization;
The human body geometric model is represented by a long vector $S_i$ formed from the set of model mesh vertices, and the human body geometric model database is denoted $S = \{S_i,\ i = 1, \dots, N\}$. A global linear prior model of the human body geometric model is built with principal component analysis, and the human body geometric model is formulated as:
$$S(\beta) = P_{\beta,k}\,\beta + \bar{S} \tag{1}$$
where $\beta$ is the low-dimensional parameter vector of the human body geometric model, $P_{\beta,k}$ is the matrix formed by the first k principal component vectors, and $\bar{S}$ is the mean vector of the human body geometric models in the database;
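As an illustration of the linear prior model described above, the following minimal Python sketch decodes a shape vector from a low-dimensional parameter vector. The function name `decode_shape`, the row-wise storage of the principal component vectors and the toy numbers are assumptions made for this example only; they do not come from the patent.

```python
from typing import List

Vector = List[float]


def decode_shape(beta: Vector, components: List[Vector], mean: Vector) -> Vector:
    """Return mean + sum_k beta[k] * components[k].

    `components` holds one principal component vector per entry
    (hypothetical storage layout chosen for this sketch).
    """
    if len(beta) != len(components):
        raise ValueError("one coefficient per principal component is required")
    shape = list(mean)
    for coeff, comp in zip(beta, components):
        for idx, value in enumerate(comp):
            shape[idx] += coeff * value
    return shape


# Toy model: 2 vertices (6 coordinates) and k = 2 principal components.
mean_shape = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
components = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # moves vertex 2 along x
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],  # moves both vertices along y
]
shape = decode_shape([0.5, -0.2], components, mean_shape)
```

Keeping only k coefficients is what makes the later optimization over the body shape low-dimensional.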
The coordinates of each joint center $J_i$ of the human skeleton are expressed as a weighted linear combination of vertex coordinates of the human body geometric model, with the vertex weights w computed on the geometric template model:
$$J_i = \sum_{j=1}^{|V_i|} w_{i,j}\, v_{i,j} \tag{2}$$
where $V_i$ is the set of model vertices neighboring embedded skeleton joint i, $w_{i,j}$ is the weight of joint i with respect to the j-th vertex in its neighboring vertex set, $v_{i,j}$ is the coordinate of the j-th vertex in the neighboring vertex set of joint i, and $J_i$ is the coordinate of embedded skeleton joint i;
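The weighted linear combination described above can be sketched in a few lines of Python. The function name `joint_center` and the sample vertices and weights are hypothetical and serve only to illustrate the computation.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def joint_center(vertices: Sequence[Point], weights: Sequence[float]) -> Point:
    """Joint coordinate as the weighted sum of its neighboring vertex coordinates."""
    if len(vertices) != len(weights):
        raise ValueError("one weight per neighboring vertex is required")
    x = sum(w * v[0] for w, v in zip(weights, vertices))
    y = sum(w * v[1] for w, v in zip(weights, vertices))
    z = sum(w * v[2] for w, v in zip(weights, vertices))
    return (x, y, z)


# Four neighboring vertices of one joint and their (illustrative) weights.
neighbours = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
weights = [0.0, 0.5, 0.25, 0.25]
joint = joint_center(neighbours, weights)
```

Because the joint is a function of the vertices, the skeleton follows automatically whenever the body shape changes.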
The set of neighboring model vertices of each joint center of the embedded human skeleton is obtained by defining a spatial bounding box;
The vertex weights w are estimated from the human body geometric template model and the skeleton embedded in it, and the estimation is formulated, for each joint i, as a constrained linear least-squares problem:
$$\min_{w}\ \Big\| \sum_{j=1}^{|\hat{V}_i|} w_{i,j}\, \hat{v}_{i,j} - \hat{J}_i \Big\|^2 \quad \text{s.t.}\ w_{i,j} \ge 0 \tag{3}$$
where $\hat{V}_i$ is the set of template model vertices neighboring embedded skeleton joint i of the template model, $w_{i,j}$ is the weight of joint i of the template model with respect to the j-th vertex in its neighboring template vertex set, $\hat{v}_{i,j}$ is the coordinate of the j-th vertex in that set, and $\hat{J}_i$ is the coordinate of embedded skeleton joint i of the template model; $\hat{v}_{i,j}$ and $\hat{J}_i$ are known, and $w_{i,j}$ is the variable to be solved; the vertex weights w are solved with a non-negative least-squares algorithm;
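The non-negative least-squares step can be illustrated with a small solver. The sketch below uses projected gradient descent, which is a simple substitute chosen for brevity and is not necessarily the algorithm used in the patent; the function name, the iteration count and the sample data are assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def solve_nonnegative_weights(vertices: Sequence[Point], joint: Point,
                              iterations: int = 500) -> List[float]:
    """Minimize ||sum_j w_j * v_j - joint||^2 subject to w_j >= 0.

    Projected gradient descent: take a gradient step, then clamp
    negative weights to zero.
    """
    n = len(vertices)
    # The squared Frobenius norm of the vertex matrix bounds the largest
    # eigenvalue of A^T A, so 1 / bound is a safe step size.
    bound = sum(c * c for v in vertices for c in v)
    if bound == 0.0:
        return [0.0] * n
    step = 1.0 / bound
    w = [0.0] * n
    for _ in range(iterations):
        residual = [sum(w[j] * vertices[j][a] for j in range(n)) - joint[a]
                    for a in range(3)]
        w = [max(0.0, w[j] - step * sum(vertices[j][a] * residual[a] for a in range(3)))
             for j in range(n)]
    return w


template_vertices = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
template_joint = (1.0, 0.5, 0.5)
fitted = solve_nonnegative_weights(template_vertices, template_joint)
```

In practice a library routine such as an active-set non-negative least-squares solver would normally be preferred over this hand-written loop.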
According to formula (2), given the vertex coordinates of the individualized human body geometric model and the vertex weights $w_{i,j}$, the joint coordinates $J_i$ of the skeleton embedded in the new human body geometric model can be obtained;
From the three-dimensional depth point cloud P, the human body geometric model database S and the human body posture database Q, an individualized human body geometric model is obtained. Its parameters are the human body posture parameter vector $q^*$ and the human body geometric model parameter vector $\beta^*$, expressed as:
$$(q^*, \beta^*) = \arg\min_{q,\beta}\ \lambda_1 E_1 + \lambda_2 E_2 + \lambda_3 E_3 + \lambda_4 E_4 + \lambda_5 E_5 \tag{4}$$
where $E_1$ and $E_2$ are iterative closest point energy terms, $E_3$ is the skeleton length symmetry energy term, $E_4$ is the human body geometric model prior energy term, and $E_5$ is the human body posture prior energy term; the variables $\lambda_1, \dots, \lambda_5$ are the weights of the energy terms;
The "point-to-point" distance is the distance between a vertex $v_i(q, \beta)$ of the human body geometric model and the captured three-dimensional depth point closest to it in Euclidean distance; the "point-to-point" distance energy term $E_1$ is expressed as:
The "point-to-plane" distance is the distance from a vertex $v_i(q, \beta)$ of the human body geometric model, measured along the vertex normal, to the tangent plane at its intersection point with the captured three-dimensional depth point cloud; the "point-to-plane" distance energy term $E_2$ is expressed as:
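To make the two iterative closest point terms concrete, the following Python sketch evaluates both for a tiny example. It assumes squared distances, brute-force nearest-neighbor matching, and one unit normal per model vertex; these choices, the function names and the sample data are assumptions for illustration and are not taken from the patent's formulas.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def closest_index(query: Point, cloud: Sequence[Point]) -> int:
    """Index of the cloud point nearest to `query` (brute-force search)."""
    return min(range(len(cloud)), key=lambda k: math.dist(query, cloud[k]))


def icp_energies(vertices: Sequence[Point], normals: Sequence[Point],
                 cloud: Sequence[Point]) -> Tuple[float, float]:
    """Return (point-to-point energy, point-to-plane energy)."""
    e_point = 0.0
    e_plane = 0.0
    for v, n in zip(vertices, normals):
        p = cloud[closest_index(v, cloud)]
        diff = (v[0] - p[0], v[1] - p[1], v[2] - p[2])
        e_point += sum(d * d for d in diff)
        # Offset between vertex and matched point, projected onto the normal.
        along_normal = sum(d * c for d, c in zip(diff, n))
        e_plane += along_normal * along_normal
    return e_point, e_plane


model_vertices = [(0.0, 0.0, 1.0), (1.0, 0.5, 2.0)]
model_normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
depth_points = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
e1, e2 = icp_energies(model_vertices, model_normals, depth_points)
```

The point-to-plane term ignores sliding along the surface, which is why the two values differ for the second vertex. A real-time system would replace the brute-force search with a spatial index or a GPU lookup.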
To keep the reconstructed human body geometric model plausible, a skeleton length symmetry energy term is introduced:
$$E_3 = \sum_{(m,n) \in \{(M,N)\}} \| l_m - l_n \|^2 \tag{7}$$
where $\{(M, N)\}$ is the set of symmetric skeleton segment pairs, and $l_m$ and $l_n$ are the lengths of the symmetric left and right skeleton segments respectively;
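The symmetry term can be evaluated directly from joint coordinates, as in the following Python sketch. The joint names, the dictionary layout and the helper names are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]
Segment = Tuple[str, str]  # (parent joint name, child joint name)


def bone_length(joints: Dict[str, Point], segment: Segment) -> float:
    """Euclidean length of the skeleton segment between two named joints."""
    return math.dist(joints[segment[0]], joints[segment[1]])


def symmetry_energy(joints: Dict[str, Point],
                    pairs: Sequence[Tuple[Segment, Segment]]) -> float:
    """Sum of squared length differences over symmetric left/right segment pairs."""
    return sum((bone_length(joints, left) - bone_length(joints, right)) ** 2
               for left, right in pairs)


skeleton = {
    "l_shoulder": (1.0, 0.0, 0.0), "l_elbow": (1.0, -3.0, 0.0),
    "r_shoulder": (-1.0, 0.0, 0.0), "r_elbow": (-1.0, -2.5, 0.0),
}
symmetric_pairs = [(("l_shoulder", "l_elbow"), ("r_shoulder", "r_elbow"))]
e3 = symmetry_energy(skeleton, symmetric_pairs)
```

The energy is zero only when every left segment has the same length as its right counterpart.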
Assuming that the human body geometric model database forms a multidimensional Gaussian distribution in the global space, the prior constraint term of the human body geometric model maximizes the following conditional probability:
where $\Lambda_\beta$ is the matrix formed from the first k principal components of the covariance matrix of the human body geometric model database; formula (8) is converted into the following energy minimization form:
where $\beta$ is the low-dimensional parameter vector of the human body geometric model to be solved, and $P_{\beta,k}$ and $\bar{S}$ are respectively the matrix formed by the previously computed first k principal component vectors of the human body geometric model and its mean vector;
Let the human body posture database be $Q = \{Q_i,\ i = 1, \dots, H\}$. A global linear prior model of the human body posture is built with principal component analysis:
$$q = P_{q,b}\,\zeta + \bar{q} \tag{10}$$
where $\zeta$ is the low-dimensional parameter vector of the human body posture, $P_{q,b}$ is the matrix formed by the first b principal component vectors, and $\bar{q}$ is the mean vector of the human body postures in the database;
Assuming that the human body posture data form a multidimensional Gaussian distribution in the global space, the human body posture prior constraint term maximizes the following conditional probability:
where q is the human body posture vector to be solved and $\delta_5$ is a constant;
The energy minimization form of formula (11) is:
Combining formulas (5), (6), (7), (9) and (12), formula (4) is expressed as:
where $\|\cdot\|_p$ is the p-norm. The captured three-dimensional depth point cloud contains a certain amount of data noise, so vertices of the human body geometric model may be paired with noise points of the captured depth point cloud as spatial correspondences, which degrades the accuracy of the human body geometric model estimate; the variables $\lambda_1, \dots, \lambda_5$ are the weights of the energy terms;
S7, tracking the three-dimensional human motion posture based on the fine individualized human body geometric model, the three-dimensional posture database and the multi-view depth cameras;
The optimal three-dimensional human motion posture $q^*$ of the current frame is solved from the three-dimensional depth point cloud P captured in the current frame, the fine individualized human body geometric model $S^*$, the human body posture database Q and the three-dimensional human body postures reconstructed in the previous two frames;
Using the three-dimensional human body postures reconstructed in the previous 2 frames, K posture neighbors are retrieved from the heterogeneous three-dimensional human body database: the first Z neighbor postures of the posture reconstructed two frames earlier are retrieved from the database as the set $Q_{-2}$; the first Z neighbor postures of the posture reconstructed in the previous frame are retrieved as the set $Q_{-1}$; the two neighbor sets are merged and redundant postures are removed, giving the set $Q_{-2,-1}$ of K neighbor postures; the three-dimensional human body posture distance metric $d_{query}$ used in the retrieval is defined on the set of three-dimensional joint center coordinates.
The three-dimensional real-time human body posture reconstruction method of the invention further comprises:
Synchronously capturing several groups of infrared image pairs containing a checkerboard with 2 adjacent depth cameras, extracting the checkerboard corner set from the captured infrared images with a corner detection algorithm, and estimating the intrinsic and extrinsic parameter matrices of the infrared cameras of the 2 depth cameras with a stereo vision algorithm;
Taking the coordinate system of 1 depth camera as the world coordinate system and its optical center as the world origin, and aligning the other 3 depth cameras to the world coordinate system using the extrinsic parameter matrices obtained from calibration;
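The relation between a depth pixel x, its depth value d(x) and its three-dimensional point p, followed by alignment to the world coordinate system, can be sketched as follows. The sketch assumes an ideal pinhole camera without lens distortion; the intrinsic parameter names `fx`, `fy`, `cx`, `cy`, the function names and all numeric values are illustrative assumptions, not calibration results from the patent.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def back_project(u: float, v: float, depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point:
    """Pinhole back-projection of pixel (u, v) with depth d(x) into camera space."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def to_world(point: Point, rotation: Sequence[Sequence[float]], translation: Point) -> Point:
    """Apply the extrinsic transform: world = R * point + t."""
    x, y, z = (sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
               for r in range(3))
    return (x, y, z)


# Hypothetical intrinsics and a camera rotated 90 degrees about the y axis.
p_cam = back_project(420.0, 240.0, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
R = [[0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0],
     [-1.0, 0.0, 0.0]]
t = (1.0, 0.0, 0.0)
p_world = to_world(p_cam, R, t)
```

Applying each camera's own extrinsic transform in this way places the four point clouds in one shared coordinate system.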
Estimating the floor plane in the scene with a three-dimensional plane fitting technique and automatically removing background pixels with a three-dimensional cylindrical bounding box whose base is parallel to the estimated floor plane and whose base center is the center of the capture scene; then, taking the center of the distribution of the remaining foreground three-dimensional depth point cloud as the starting point and setting a suitable distance threshold between three-dimensional depth points, obtaining the largest connected region in three-dimensional space, which completes the denoising of the three-dimensional depth point cloud.
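The background removal and denoising described above can be sketched in Python as a cylinder crop followed by region growing. The sketch assumes that the point cloud has already been transformed so that the fitted floor is the plane z = 0 with z pointing up; the function names, the radius, height and threshold values, and the quadratic-time neighbor search are assumptions chosen for clarity rather than speed.

```python
import math
from collections import deque
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def crop_cylinder(points: Sequence[Point], center_xy: Tuple[float, float],
                  radius: float, height: float) -> List[Point]:
    """Keep points inside a vertical cylinder standing on the floor plane z = 0."""
    return [p for p in points
            if math.hypot(p[0] - center_xy[0], p[1] - center_xy[1]) <= radius
            and 0.0 <= p[2] <= height]


def grow_region(points: Sequence[Point], threshold: float) -> List[Point]:
    """Collect the points connected to the point nearest the centroid.

    Two points are connected when their distance is at most `threshold`.
    """
    if not points:
        return []
    n = len(points)
    centroid = tuple(sum(p[a] for p in points) / n for a in range(3))
    seed = min(range(n), key=lambda k: math.dist(points[k], centroid))
    visited = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        for k in range(n):
            if k not in visited and math.dist(points[current], points[k]) <= threshold:
                visited.add(k)
                queue.append(k)
    return [points[k] for k in sorted(visited)]


raw = [(0.0, 0.0, 0.1), (0.0, 0.0, 0.2), (0.0, 0.0, 0.3), (0.05, 0.0, 0.4),
       (0.9, 0.0, 1.5),   # isolated noise point inside the cylinder
       (3.0, 0.0, 1.0)]   # background point outside the cylinder
foreground = crop_cylinder(raw, center_xy=(0.0, 0.0), radius=1.0, height=2.0)
body = grow_region(foreground, threshold=0.15)
```

The crop removes the background point, and the region growing then discards the isolated noise point because it is farther than the threshold from every body point.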
In the three-dimensional real-time human body posture reconstruction method of the invention, in S7,
the optimal three-dimensional human motion posture $q^*$ of the current frame is solved from the three-dimensional depth point cloud P captured in the current frame, the fine individualized human body geometric model $S^*$, the human body posture database Q and the three-dimensional human body postures reconstructed in the previous two frames, expressed as:
$$q^* = \arg\min_{q}\ \alpha_1 G_1 + \alpha_2 G_2 + \alpha_3 G_3 + \alpha_4 G_4 + \alpha_5 G_5 \tag{14}$$
where $G_1$ and $G_2$ are iterative closest point energy terms, $G_3$ is the three-dimensional human body posture prior energy term, $G_4$ is the joint angle range limit energy term of the three-dimensional human body posture, and $G_5$ is the smooth-change constraint energy term of the three-dimensional human body posture; the variables $\alpha_1, \dots, \alpha_5$ are the weights of the energy terms;
The "point-to-point" distance is the distance between a vertex $v_i(q, S^*)$ of the human body geometric model $S^*$ driven by posture q and the captured three-dimensional depth point closest to it in Euclidean distance; the "point-to-point" distance energy term $G_1$ is expressed as:
The "point-to-plane" distance is the distance from a vertex $v_i(q, S^*)$ of the human body geometric model $S^*$ driven by posture q, measured along the vertex normal, to the tangent plane at its intersection point with the captured three-dimensional depth point cloud; the "point-to-plane" distance energy term $G_2$ is expressed as:
Using the three-dimensional human body postures reconstructed in the previous 2 frames, K posture neighbors are retrieved from the heterogeneous three-dimensional human body database: the first Z neighbor postures of the posture reconstructed two frames earlier are retrieved from the database as the set $Q_{-2}$; the first Z neighbor postures of the posture reconstructed in the previous frame are retrieved as the set $Q_{-1}$; the two neighbor sets are merged and redundant postures are removed, giving the set $Q_{-2,-1}$ of K neighbor postures; the three-dimensional human body posture distance metric $d_{query}$ used in the retrieval is defined on the set of three-dimensional joint center coordinates and expressed as:
$$d_{query}(q, q_n) = \| J(T(q)) - J(q_n) \|^2 \tag{17}$$
where $J(\cdot)$ denotes the set of three-dimensional joint center coordinates obtained by forward kinematics from a three-dimensional human body posture, and T is the spatial transformation matrix that aligns the query posture q to the posture $q_n$ in the database;
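The neighbor retrieval described above can be illustrated with a short Python sketch. It makes two simplifying assumptions that are not from the patent: each posture is already given as its list of joint center coordinates (so forward kinematics is skipped), and the alignment T is reduced to a translation that makes the root joints (index 0) coincide. All function names and data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Pose = Sequence[Point]  # joint center coordinates, root joint first


def pose_distance(query: Pose, candidate: Pose) -> float:
    """Squared joint distance after translating the query root onto the candidate root."""
    shift = tuple(candidate[0][a] - query[0][a] for a in range(3))
    return sum((qj[a] + shift[a] - cj[a]) ** 2
               for qj, cj in zip(query, candidate) for a in range(3))


def nearest(query: Pose, database: Sequence[Pose], z: int) -> List[int]:
    """Indices of the z database postures closest to the query."""
    order = sorted(range(len(database)), key=lambda k: pose_distance(query, database[k]))
    return order[:z]


def merged_neighbours(pose_prev2: Pose, pose_prev1: Pose,
                      database: Sequence[Pose], z: int) -> List[int]:
    """Merge the two top-z neighbor lists and drop duplicate entries."""
    merged: List[int] = []
    for idx in nearest(pose_prev2, database, z) + nearest(pose_prev1, database, z):
        if idx not in merged:
            merged.append(idx)
    return merged


database = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(5.0, 0.0, 0.0), (5.0, 0.9, 0.1)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.9, 0.1, 0.0)],
]
prev2 = [(1.0, 1.0, 1.0), (1.0, 2.0, 1.0)]
prev1 = [(0.0, 0.0, 0.0), (0.7, 0.7, 0.0)]
neighbour_ids = merged_neighbours(prev2, prev1, database, z=2)
```

Here K, the size of the merged set, is at most 2Z and shrinks when the two neighbor lists overlap, as they do for database entry 1 in this example.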
With the human body posture database $Q_{-2,-1}$ likewise known, a global linear prior model of the human body posture is built with principal component analysis:
$$q = P_{q,b}\,\zeta + \bar{q} \tag{18}$$
where $\zeta$ is the low-dimensional parameter vector of the human body posture, $P_{q,b}$ is the matrix formed by the first b principal component vectors, and $\bar{q}$ is the mean vector of the human body postures in the database;
The prior term penalizes how poorly the reconstructed three-dimensional human body posture q fits the local-space posture probability distribution formed by the K posture neighbors $Q_K = \{q_1, \dots, q_K\}$ retrieved online; assuming that the K posture neighbors in the local space follow a multidimensional Gaussian distribution, the human body posture prior term is expressed as:
where q is the human body posture vector to be solved and $\epsilon$ is a constant;
The probability maximization problem in formula (19) is converted, as is usual, into the following energy minimization problem:
The joint angle range limit energy term keeps the joint angles of the three-dimensional human body posture within a reasonable numerical range during iterative optimization; without it, the optimization could leave that range and produce implausible three-dimensional human body posture reconstruction results:
where $\chi(i)$ is a binary indicator function that equals 1 if the i-th joint angle $q_i$ is below its lower limit and 0 otherwise, and the corresponding upper-limit indicator function equals 1 if the i-th joint angle exceeds its upper limit and 0 otherwise;
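The behavior of the two indicator functions can be illustrated with a small Python sketch. It assumes a quadratic penalty on the amount by which a joint angle leaves its allowed range; this penalty form, the function name and the sample limits are assumptions for illustration and are not a reproduction of the patent's formula.

```python
from typing import Sequence


def joint_limit_energy(angles: Sequence[float], lower: Sequence[float],
                       upper: Sequence[float]) -> float:
    """Quadratic penalty that is active only for joint angles outside [lower, upper]."""
    energy = 0.0
    for q, lo, hi in zip(angles, lower, upper):
        below = 1.0 if q < lo else 0.0   # lower-limit indicator
        above = 1.0 if q > hi else 0.0   # upper-limit indicator
        energy += below * (lo - q) ** 2 + above * (q - hi) ** 2
    return energy


# Three joint angles in radians: inside the range, below it, above it.
g4 = joint_limit_energy(angles=[0.5, -2.0, 3.5],
                        lower=[-1.0, -1.5, 0.0],
                        upper=[1.0, 1.5, 3.0])
```

Joint angles inside their range contribute nothing, so the term does not bias plausible postures.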
The smooth-change constraint energy term of the three-dimensional human body posture penalizes a lack of smoothness in the rate of change between the three-dimensional human body posture q reconstructed for the current frame and the postures reconstructed in the previous 2 frames, expressed as:
where the expression inside the norm on the right-hand side of the equation is the difference between the rate of change from the posture reconstructed in the previous frame to the current-frame posture q and the rate of change from the posture reconstructed two frames earlier to the posture reconstructed in the previous frame;
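The difference of rates of change described above amounts to a second-order finite difference over three consecutive frames. The following Python sketch evaluates it under the assumption of a squared Euclidean norm and a constant frame interval; the function name and the sample posture vectors are hypothetical.

```python
from typing import Sequence


def smoothness_energy(q: Sequence[float], q_prev1: Sequence[float],
                      q_prev2: Sequence[float]) -> float:
    """Squared norm of (q - q_prev1) - (q_prev1 - q_prev2)."""
    return sum(((cur - p1) - (p1 - p2)) ** 2
               for cur, p1, p2 in zip(q, q_prev1, q_prev2))


# The first joint angle moves at constant speed; the second one accelerates.
g5 = smoothness_energy(q=[0.2, 0.5], q_prev1=[0.1, 0.2], q_prev2=[0.0, 0.0])
```

Only the accelerating component contributes, so motion at constant velocity is not penalized.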
Using formulas (15), (16), (20), (21) and (22), the energy function of formula (14) is expressed as:
where $\|\cdot\|_p$ is the p-norm.
The beneficial effects of the invention are as follows:
1. The method runs online in real time, with an average frame rate of up to 20 frames per second, at low cost and with high efficiency;
2. The modeling process is simple and fast and requires no post-processing;
3. The posture solution space is constrained, which improves the accuracy of the reconstructed human motion posture.
Drawings
FIG. 1 is a flow chart of the system algorithm of the invention;
FIG. 2 is an example of the placement and calibration of the 4 depth cameras;
FIG. 3 shows example results before and after three-dimensional point cloud alignment of the 4 depth cameras;
FIG. 4 is an example of three-dimensional multi-view depth point cloud preprocessing;
FIG. 5 is a flow chart of the human body geometric model estimation algorithm;
FIG. 6 is an example of the parameterization of the embedded skeleton joints of the human body geometric model;
FIG. 7 is an example of K-nearest-neighbor retrieval for the current-frame posture based on the postures reconstructed in the previous 2 frames;
FIG. 8 is a flow chart of the GPU-based three-dimensional human body posture reconstruction algorithm.
Detailed Description
The following describes in further detail the embodiments of the present invention with reference to the drawings and examples. The following examples are illustrative of the invention and are not intended to limit the scope of the invention.
Example 1
The three-dimensional real-time human body posture reconstruction method of the invention comprises the following steps:
S1, capturing depth images with 4 depth cameras, where a pixel of a depth image is denoted x, and the depth value and the three-dimensional point corresponding to that pixel are denoted d(x) and p respectively;
S2, connecting the 4 depth cameras to a computer through a PCI data interface, the computer driving the 4 depth cameras synchronously to acquire data;
S3, placing the 4 depth cameras at the 4 corners of a square human motion capture scene, where every 2 adjacent depth cameras are calibrated with a checkerboard-based stereo vision calibration method;
S4, preprocessing and denoising the raw three-dimensional points of the depth images captured by the 4 depth cameras;
S5, selecting motion sequences from an open-source motion capture database to build a three-dimensional human body posture database Q, skeleton normalization being performed with a motion retargeting technique while the database Q is built;
S6, capturing an A-pose image of a real human body with the 4 depth cameras, deforming the human body geometric model in the pose dimension with a skeleton-driven method, and then formulating the automatic estimation of the fine individualized human body geometric model as a nonlinear optimization problem that is solved by iterative optimization;
S7, tracking the three-dimensional human motion posture based on the fine individualized human body geometric model, the three-dimensional posture database and the multi-view depth cameras.
Example 2
The three-dimensional real-time human body posture reconstruction method of the invention comprises the following steps:
S1, capturing depth images with 4 depth cameras, where a pixel of a depth image is denoted x, and the depth value and the three-dimensional point corresponding to that pixel are denoted d(x) and p respectively;
S2, connecting the 4 depth cameras to a computer through a PCI data interface, the computer driving the 4 depth cameras synchronously to acquire data;
S3, placing the 4 depth cameras at the 4 corners of a square human motion capture scene, where every 2 adjacent depth cameras are calibrated with a checkerboard-based stereo vision calibration method;
S4, preprocessing and denoising the raw three-dimensional points of the depth images captured by the 4 depth cameras;
S5, selecting motion sequences from an open-source motion capture database to build a three-dimensional human body posture database Q, skeleton normalization being performed with a motion retargeting technique while the database Q is built;
Synchronously capturing several groups of infrared image pairs containing a checkerboard with 2 adjacent depth cameras, extracting the checkerboard corner set from the captured infrared images with a corner detection algorithm, and estimating the intrinsic and extrinsic parameter matrices of the infrared cameras of the 2 depth cameras with a stereo vision algorithm;
Taking the coordinate system of 1 depth camera as the world coordinate system and its optical center as the world origin, and aligning the other 3 depth cameras to the world coordinate system using the extrinsic parameter matrices obtained from calibration;
Estimating the floor plane in the scene with a three-dimensional plane fitting technique and automatically removing background pixels with a three-dimensional cylindrical bounding box whose base is parallel to the estimated floor plane and whose base center is the center of the capture scene; then, taking the center of the distribution of the remaining foreground three-dimensional depth point cloud as the starting point and setting a suitable distance threshold between three-dimensional depth points, obtaining the largest connected region in three-dimensional space, which completes the denoising of the three-dimensional depth point cloud;
S6, capturing an A-pose image of a real human body with the 4 depth cameras, deforming the human body geometric model in the pose dimension with a skeleton-driven method, and then formulating the automatic estimation of the fine individualized human body geometric model as a nonlinear optimization problem that is solved by iterative optimization;
S7, tracking the three-dimensional human motion posture based on the fine individualized human body geometric model, the three-dimensional posture database and the multi-view depth cameras.
Example 3
The three-dimensional real-time human body posture reconstruction method of the invention comprises the following steps:
S1, capturing depth images with 4 depth cameras, where a pixel of a depth image is denoted x, and the depth value and the three-dimensional point corresponding to that pixel are denoted d(x) and p respectively;
S2, connecting the 4 depth cameras to a computer through a PCI data interface, the computer driving the 4 depth cameras synchronously to acquire data;
S3, placing the 4 depth cameras at the 4 corners of a square human motion capture scene, where every 2 adjacent depth cameras are calibrated with a checkerboard-based stereo vision calibration method;
S4, preprocessing and denoising the raw three-dimensional points of the depth images captured by the 4 depth cameras;
S5, selecting motion sequences from an open-source motion capture database to build a three-dimensional human body posture database Q, skeleton normalization being performed with a motion retargeting technique while the database Q is built;
S6, capturing an A-pose image of a real human body with the 4 depth cameras, deforming the human body geometric model in the pose dimension with a skeleton-driven method, and then formulating the automatic estimation of the fine individualized human body geometric model as a nonlinear optimization problem that is solved by iterative optimization;
In step S6 above,
The human body geometric model is represented by a long vector $S_i$ formed from the set of model mesh vertices, and the human body geometric model database is denoted $S = \{S_i,\ i = 1, \dots, N\}$. A global linear prior model of the human body geometric model is built with principal component analysis, and the human body geometric model is formulated as:
$$S(\beta) = P_{\beta,k}\,\beta + \bar{S} \tag{1}$$
where $\beta$ is the low-dimensional parameter vector of the human body geometric model, $P_{\beta,k}$ is the matrix formed by the first k principal component vectors, and $\bar{S}$ is the mean vector of the human body geometric models in the database;
The coordinates of each joint center $J_i$ of the human skeleton are expressed as a weighted linear combination of vertex coordinates of the human body geometric model, with the vertex weights w computed on the geometric template model:
$$J_i = \sum_{j=1}^{|V_i|} w_{i,j}\, v_{i,j} \tag{2}$$
where $V_i$ is the set of model vertices neighboring embedded skeleton joint i, $w_{i,j}$ is the weight of joint i with respect to the j-th vertex in its neighboring vertex set, $v_{i,j}$ is the coordinate of the j-th vertex in the neighboring vertex set of joint i, and $J_i$ is the coordinate of embedded skeleton joint i;
The set of neighboring model vertices of each joint center of the embedded human skeleton is obtained by defining a spatial bounding box;
The vertex weights w are estimated from the human body geometric template model and the skeleton embedded in it, and the estimation is formulated, for each joint i, as a constrained linear least-squares problem:
$$\min_{w}\ \Big\| \sum_{j=1}^{|\hat{V}_i|} w_{i,j}\, \hat{v}_{i,j} - \hat{J}_i \Big\|^2 \quad \text{s.t.}\ w_{i,j} \ge 0 \tag{3}$$
where $\hat{V}_i$ is the set of template model vertices neighboring embedded skeleton joint i of the template model, $w_{i,j}$ is the weight of joint i of the template model with respect to the j-th vertex in its neighboring template vertex set, $\hat{v}_{i,j}$ is the coordinate of the j-th vertex in that set, and $\hat{J}_i$ is the coordinate of embedded skeleton joint i of the template model; $\hat{v}_{i,j}$ and $\hat{J}_i$ are known, and $w_{i,j}$ is the variable to be solved; the vertex weights w are solved with a non-negative least-squares algorithm;
According to formula (2), given the vertex coordinates of the individualized human body geometric model and the vertex weights $w_{i,j}$, the joint coordinates $J_i$ of the skeleton embedded in the new human body geometric model can be obtained;
An individualized human body geometric model is obtained from the three-dimensional depth point cloud P, the human body geometric model database S and the human body posture database Q; its parameters are the human body posture parameter vector and the human body geometric model parameter vector, expressed as:
where $E_1$ and $E_2$ are iterative closest point energy terms, $E_3$ is the human skeleton length symmetry energy term, $E_4$ is the human body geometric model prior energy term, and $E_5$ is the human body posture prior energy term; the variables $\lambda_1, \ldots, \lambda_5$ are the weights of the respective energy terms;
where the "point-to-point" distance is the distance between a human body geometric model vertex $v_i(q, \beta)$ and the captured three-dimensional depth point nearest to it in Euclidean space; the "point-to-point" distance energy term $E_1$ is expressed as:
where the "point-to-plane" distance is the distance from the human body geometric model vertex $v_i(q, \beta)$ to the tangent plane at the point where the vertex normal intersects the captured three-dimensional depth point cloud; the "point-to-plane" distance energy term $E_2$ is expressed as:
To keep the reconstructed human body geometric model plausible, a human skeleton length symmetry energy term is introduced, expressed as:
$E_3 = \sum_{(m,n) \in \{(M,N)\}} \lVert l_m - l_n \rVert^2$ (7);
where $\{(M,N)\}$ is the set of symmetric skeleton segment pairs, and $l_m$ and $l_n$ are the lengths of a symmetric pair of left and right skeleton segments, respectively;
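The symmetry term of formula (7) can be illustrated with a short Python sketch. The joint names, the toy coordinates and the helper functions are hypothetical and chosen only for illustration; the squared difference follows the reading of formula (7) as a sum of squared length differences.

```python
import math

def segment_length(joint_a, joint_b):
    """Euclidean length of the skeleton segment between two joint centers."""
    return math.dist(joint_a, joint_b)

def symmetry_energy(joints, symmetric_pairs):
    """Sum of squared length differences over symmetric segment pairs.

    joints: dict mapping a (hypothetical) joint name to (x, y, z)
    symmetric_pairs: list of ((a, b), (c, d)); segment a-b mirrors segment c-d
    """
    energy = 0.0
    for (a, b), (c, d) in symmetric_pairs:
        l_m = segment_length(joints[a], joints[b])
        l_n = segment_length(joints[c], joints[d])
        energy += (l_m - l_n) ** 2
    return energy

# Toy skeleton: the right upper arm is 0.1 m longer than the left one.
joints = {
    "l_shoulder": (0.2, 1.4, 0.0), "l_elbow": (0.2, 1.1, 0.0),
    "r_shoulder": (-0.2, 1.4, 0.0), "r_elbow": (-0.2, 1.0, 0.0),
}
pairs = [(("l_shoulder", "l_elbow"), ("r_shoulder", "r_elbow"))]
e3 = symmetry_energy(joints, pairs)
```

The energy is zero for a perfectly symmetric skeleton and grows quadratically with the left/right length mismatch, which is what lets the optimizer trade it off against the data terms.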
Assuming the human body geometric model database forms a multidimensional Gaussian distribution in the global space, the human body geometric model prior constraint term maximizes the following conditional probability, expressed as:
where $\Lambda_\beta$ is the matrix formed from the first k principal components of the covariance matrix of the human body geometric model database; formula (8) is converted into an energy-minimization form, expressed as follows:
where $\beta$ is the low-dimensional parameter vector of the human body geometric model to be solved, and $P_{\beta,k}$ and the mean term are, respectively, the matrix of the first k principal component vectors and the mean vector computed from the human body geometric model database;
Let the human body posture database be $Q = \{q_i, i = 1, \ldots, h\}$; a global linear prior model of the human body posture is built using principal component analysis, expressed as:
where $\zeta$ is the low-dimensional parameter vector of the human body posture, $P_{q,b}$ is the matrix formed by the first b principal component vectors, and the mean term is the mean vector of the human body postures in the database;
Assuming the human body posture data form a multidimensional Gaussian distribution in the global space, the human body posture prior constraint term maximizes the following conditional probability, expressed as:
where q is the human body posture vector to be solved, and $\delta_5$ is a constant;
Formula (11) is converted into an energy-minimization form, expressed as:
Combining formula (5), formula (6), formula (7), formula (9) and formula (12), formula (4) is expressed as follows:
where $\lVert \cdot \rVert_p$ is the p-norm; because the captured three-dimensional depth point cloud contains some data noise, human body geometric model vertices can form spatial matching point pairs with noise points of the captured depth point cloud, which degrades the accuracy of the estimated human body geometric model; the variables $\lambda_1, \ldots, \lambda_5$ are the weights of the respective energy terms;
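The two iterative closest point terms $E_1$ and $E_2$ described above can be sketched in Python as follows. This is only an illustration under stated assumptions: squared Euclidean residuals (p = 2), brute-force nearest-point matching for both terms (the text matches the point-to-plane term along the vertex normal instead), and a unit normal supplied for every captured point. All function names are hypothetical.

```python
import math

def nearest_index(vertex, cloud):
    """Index of the captured depth point closest to a model vertex (brute force)."""
    return min(range(len(cloud)), key=lambda k: math.dist(vertex, cloud[k]))

def point_to_point_energy(vertices, cloud):
    """Sum of squared distances from each vertex to its nearest captured point."""
    return sum(math.dist(v, cloud[nearest_index(v, cloud)]) ** 2 for v in vertices)

def point_to_plane_energy(vertices, cloud, cloud_normals):
    """Sum of squared distances from each vertex to the tangent plane
    (defined by the matched point and its unit normal)."""
    energy = 0.0
    for v in vertices:
        k = nearest_index(v, cloud)
        p, n = cloud[k], cloud_normals[k]
        signed = sum((vi - pi) * ni for vi, pi, ni in zip(v, p, n))
        energy += signed ** 2
    return energy

# Toy data: a flat patch of captured points with normals along +z.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
vertices = [(0.3, 0.0, 0.1)]

e1 = point_to_point_energy(vertices, cloud)
e2 = point_to_plane_energy(vertices, cloud, normals)
```

In this toy case the vertex is offset sideways from its matched point, so the point-to-point term is larger than the point-to-plane term, which only measures the offset along the normal. This is one reason the point-to-plane term tolerates sliding along the surface.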
S7, the three-dimensional human motion posture is tracked using the fine individualized human body geometric model, the three-dimensional posture database and the multi-view depth cameras.
Example 4
The invention relates to a three-dimensional real-time human body posture reconstruction method, which comprises the following steps:
S1, 4 depth cameras are used to capture depth images; a pixel of a depth image is denoted x, and the depth value and three-dimensional point corresponding to that pixel are denoted d(x) and p, respectively;
S2, the 4 depth cameras are connected to a computer through a PCI data interface, and the computer drives the 4 depth cameras synchronously to complete data acquisition;
S3, the 4 depth cameras are positioned at the 4 corners of a square human body motion capture scene, and every 2 adjacent depth cameras are calibrated using a checkerboard and a stereo vision calibration method;
S4, the original three-dimensional points of the depth images captured by the 4 depth cameras are preprocessed and denoised;
S5, motion sequences are selected from an open-source motion capture database to establish a three-dimensional human body posture database Q, with skeleton normalization performed by a motion retargeting technique when the database is built;
S6, an A-pose image of a real human body is captured with the 4 depth cameras, the human body geometric model for the A-pose image of the real human body is deformed in the posture dimension by a skeleton-driven method, and the automatic estimation of the fine individualized human body geometric model is then formulated as a nonlinear optimization problem and solved by iterative optimization;
S7, the three-dimensional human motion posture is tracked using the fine individualized human body geometric model, the three-dimensional posture database and the multi-view depth cameras;
In step S7,
the optimal three-dimensional human motion posture $q^*$ of the current frame is solved from the three-dimensional depth point cloud P captured in the current frame, the fine individualized human body geometric model $S^*$, the human body posture database Q and the three-dimensional human body postures reconstructed in the previous two frames, expressed as:
where $G_1$ and $G_2$ are iterative closest point energy terms, $G_3$ is the three-dimensional human body posture prior energy term, $G_4$ is the three-dimensional human body posture joint angle range limiting energy term, and $G_5$ is the three-dimensional human body posture smooth change constraint energy term; the variables $\alpha_1, \ldots, \alpha_5$ are the weights of the respective energy terms;
The "point-to-point" distance is the distance between a vertex $v_i(q, S^*)$ of the human body geometric model $S^*$ driven by posture q and the captured three-dimensional depth point nearest to it in Euclidean space; the "point-to-point" distance energy term $G_1$ is expressed as:
The "point-to-plane" distance is the distance from the vertex $v_i(q, S^*)$ of the human body geometric model $S^*$ driven by posture q to the tangent plane at the point where the vertex normal intersects the captured three-dimensional depth point cloud; the "point-to-plane" distance energy term $G_2$ is expressed as:
Using the three-dimensional human body postures reconstructed in the previous 2 frames, K posture neighbors are retrieved from the heterogeneous three-dimensional human body posture database: the first Z nearest-neighbor postures of the posture reconstructed two frames earlier are retrieved from the database as the set $Q_{-2}$; the first Z nearest-neighbor postures of the posture reconstructed in the previous frame are retrieved as the set $Q_{-1}$; the two neighbor sets are merged and redundant postures removed to obtain the set of K neighbor postures $Q_{-2,-1}$; the three-dimensional human body posture distance metric $d_{query}$ used during retrieval is defined on three-dimensional joint center coordinate sets, expressed as:
$d_{query}(q, q_n) = \lVert J(T(q)) - J(q_n) \rVert^2$ (17);
where $J(\cdot)$ denotes the set of three-dimensional joint center coordinates obtained by forward kinematics for a given three-dimensional human body posture, and T is the spatial transformation matrix that aligns the query posture q to the posture $q_n$ in the database;
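The neighbor retrieval described above (top-Z search for each of the two previous postures, followed by merging and duplicate removal) can be sketched in Python. The sketch assumes that forward kinematics and the alignment T have already been applied, so each posture is given directly as a list of 3D joint centers; the function names and the tiny one-joint database are hypothetical.

```python
import math

def pose_distance(joints_a, joints_b):
    """Squared distance between two aligned sets of 3D joint centers."""
    return sum(math.dist(a, b) ** 2 for a, b in zip(joints_a, joints_b))

def top_z_neighbors(query_joints, database, z):
    """Indices of the z database postures closest to the query posture."""
    ranked = sorted(range(len(database)),
                    key=lambda i: pose_distance(query_joints, database[i]))
    return ranked[:z]

def merged_neighbors(pose_prev2, pose_prev1, database, z):
    """Union of both top-z sets, duplicates removed, first occurrence kept."""
    merged = []
    candidates = (top_z_neighbors(pose_prev2, database, z)
                  + top_z_neighbors(pose_prev1, database, z))
    for idx in candidates:
        if idx not in merged:
            merged.append(idx)
    return merged

# Toy database of one-joint "postures" spaced 1 m apart along x.
database = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)],
            [(2.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)]]
pose_prev2 = [(0.9, 0.0, 0.0)]   # posture reconstructed two frames earlier
pose_prev1 = [(1.6, 0.0, 0.0)]   # posture reconstructed in the previous frame

neighbors = merged_neighbors(pose_prev2, pose_prev1, database, z=2)
```

Because the two top-Z sets usually overlap, the merged set contains between Z and 2Z postures. A real-time implementation would replace the brute-force sort with a spatial index or, as the GPU flow of the method suggests, a parallel search.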
With the human body posture neighbor set $Q_{-2,-1}$ known, a global linear prior model of the human body posture is built using principal component analysis, expressed as:
where $\zeta$ is the low-dimensional parameter vector of the human body posture, $P_{q,b}$ is the matrix formed by the first b principal component vectors, and the mean term is the mean vector of the human body postures in the database;
The posture prior term penalizes how poorly the reconstructed three-dimensional human body posture q fits the local three-dimensional posture probability distribution formed by the K posture neighbors $Q_K = \{q_1, \ldots, q_K\}$ retrieved online; assuming the K posture neighbors in the local space follow a multidimensional Gaussian distribution, the human body posture prior term is expressed as:
where q is the human body posture vector to be solved, and $\epsilon$ is a constant;
The probability-maximization problem in formula (19) is converted, as usual, into an energy-minimization problem, expressed as follows:
The human body posture joint angle range limiting energy term prevents the joint angles from leaving their reasonable numerical range while the three-dimensional human body posture is solved by iterative optimization, which would otherwise produce implausible three-dimensional human body posture reconstructions:
where the two indicator terms are binary indicator functions: if the i-th joint angle $q_i$ is below its lower limit, the lower-limit indicator $\chi(i)$ equals 1, otherwise it equals 0; if the i-th joint angle exceeds its upper limit, the upper-limit indicator equals 1, otherwise it equals 0;
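A joint angle range penalty of this kind can be sketched in Python. The exact form of the formula is not reproduced in this text, so the sketch assumes a common choice: the squared amount by which an angle violates its limit, switched on by the binary indicators. The limit values below are invented for illustration.

```python
def joint_limit_energy(q, lower, upper):
    """Squared limit violation, gated by binary lower/upper indicators."""
    energy = 0.0
    for angle, lo, hi in zip(q, lower, upper):
        below = 1.0 if angle < lo else 0.0   # lower-limit indicator
        above = 1.0 if angle > hi else 0.0   # upper-limit indicator
        energy += below * (lo - angle) ** 2 + above * (angle - hi) ** 2
    return energy

# Three joint angles in radians: in range, below the lower limit, above the upper limit.
q = [0.5, -2.0, 3.0]
lower = [-1.0, -1.5, -1.0]
upper = [1.0, 1.5, 2.5]

g4 = joint_limit_energy(q, lower, upper)
```

The term is exactly zero whenever every angle is inside its range, so it does not bias postures that are already plausible.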
The three-dimensional human body posture smooth change constraint energy term penalizes a lack of smoothness in the rate of change between the reconstructed three-dimensional human body posture q of the current frame and the postures reconstructed in the previous 2 frames, expressed as:
where the expression inside the double bars on the right-hand side is the difference between two rates of change: the change from the posture reconstructed in the previous frame to the three-dimensional human body posture q of the current frame, and the change from the posture reconstructed two frames earlier to the posture reconstructed in the previous frame;
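The quantity described above is a second-order finite difference of the posture sequence. A minimal Python sketch, assuming a squared Euclidean norm and a unit time step between frames (the function name is hypothetical):

```python
def smoothness_energy(q, q_prev1, q_prev2):
    """Squared norm of (q - q_prev1) - (q_prev1 - q_prev2)."""
    return sum((a - 2.0 * b + c) ** 2 for a, b, c in zip(q, q_prev1, q_prev2))

q_prev2 = [0.0, 0.0]          # posture two frames earlier
q_prev1 = [0.1, 0.2]          # posture in the previous frame

steady = smoothness_energy([0.2, 0.4], q_prev1, q_prev2)   # constant velocity
jerky = smoothness_energy([0.3, 0.4], q_prev1, q_prev2)    # first angle accelerates
```

A posture that continues the previous motion at constant velocity incurs no penalty, while any change in velocity is penalized quadratically.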
Substituting formula (15), formula (16), formula (20), formula (21) and formula (22), the energy function of formula (14) is expressed in the following form:
where $\lVert \cdot \rVert_p$ is the p-norm.
Example 5
The overall algorithm flow is shown in fig. 1, and the three-dimensional real-time human body posture reconstruction method provided by the invention comprises the following steps:
S1, 4 depth cameras are used to capture depth images; a pixel of a depth image is denoted x, and the depth value and three-dimensional point corresponding to that pixel are denoted d(x) and p, respectively;
S2, the 4 depth cameras are connected to a computer through a PCI data interface, and the computer drives the 4 depth cameras synchronously to complete data acquisition, so the acquired data are inherently synchronized in time;
S3, the 4 depth cameras are positioned at the 4 corners of a square human body motion capture scene, all facing the center of the scene, as shown in FIG. 2(a), and every 2 adjacent depth cameras are calibrated using a checkerboard and a stereo vision calibration method; first, multiple pairs of infrared images containing the checkerboard are captured synchronously by 2 adjacent depth cameras, one pair of which is shown in FIG. 2(b); then, the set of checkerboard corners is extracted from the captured infrared images using a corner detection algorithm, as shown in FIG. 2(c); finally, the intrinsic and extrinsic parameter matrices of the infrared cameras of the 2 depth cameras are estimated with a stereo vision algorithm. The coordinate system of 1 depth camera is taken as the world coordinate system, with its optical center as the world origin, and the other 3 depth cameras are aligned to the world coordinate system using the extrinsic parameter matrices obtained from calibration; the three-dimensional point clouds of the 4 depth cameras before and after alignment are shown in FIG. 3;
S4, the original three-dimensional points of the depth images captured by the 4 depth cameras are preprocessed and denoised; first, the floor plane in the scene is estimated with a three-dimensional plane fitting technique; then, background pixels are removed automatically using a three-dimensional cylindrical bounding volume whose bottom plane is parallel to the estimated floor plane, whose bottom center is the center of the capture scene, and whose radius is 1.2 m and height is 2.5 m; finally, starting from the center point of the remaining foreground three-dimensional depth point cloud and using 0.02 m as the distance threshold between three-dimensional depth points, the largest connected region in three-dimensional space is found, which completes the denoising of the three-dimensional depth point cloud, as shown in FIG. 4;
The method for estimating the fine individualized human body geometric model is shown in FIG. 5. S5, motion sequences are selected from an open-source motion capture database to establish a three-dimensional human body posture database Q, with skeleton normalization performed by a motion retargeting technique when the database is built;
Multiple pairs of infrared images containing the checkerboard are captured synchronously by 2 adjacent depth cameras, the set of checkerboard corners is extracted from the captured infrared images using a corner detection algorithm, and the intrinsic and extrinsic parameter matrices of the infrared cameras of the 2 depth cameras are estimated with a stereo vision algorithm;
The coordinate system of 1 depth camera is taken as the world coordinate system, with its optical center as the world origin, and the other 3 depth cameras are aligned to the world coordinate system using the extrinsic parameter matrices obtained from calibration;
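How a depth pixel becomes a world-space point can be sketched in Python. The sketch assumes a standard pinhole camera model; the intrinsic values (`fx`, `fy`, `cx`, `cy`) and the extrinsic rotation and translation below are invented for illustration and would in practice come from the checkerboard calibration.

```python
def back_project(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with depth d(x) into the camera frame."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)

def to_world(point, rotation, translation):
    """Apply the extrinsic transform p_world = R * p_cam + t."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )

# Hypothetical intrinsics of one depth camera.
fx = fy = 500.0
cx, cy = 320.0, 240.0

# Hypothetical extrinsics: 90 degree rotation about the y axis plus a 1 m shift along x.
R = [[0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0],
     [-1.0, 0.0, 0.0]]
t = (1.0, 0.0, 0.0)

p_cam = back_project(420.0, 240.0, 2.0, fx, fy, cx, cy)
p_world = to_world(p_cam, R, t)
```

For the camera chosen as the world reference, the rotation is the identity and the translation is zero, so its points are already expressed in world coordinates.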
The floor plane in the scene is estimated with a three-dimensional plane fitting technique, and background pixels are removed automatically using a three-dimensional cylindrical bounding volume whose bottom plane is parallel to the estimated floor plane and whose bottom center is the center of the capture scene; then, starting from the center point of the remaining foreground three-dimensional depth point cloud and using a suitable distance threshold between three-dimensional depth points, the largest connected region in three-dimensional space is found, which completes the denoising of the three-dimensional depth point cloud;
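The background removal and denoising steps can be sketched in Python. The sketch makes several simplifying assumptions: the cloud has already been transformed so that the fitted floor is the plane z = 0 with z pointing up, region growing starts from the foreground point nearest the foreground centroid, and neighbors are found by brute force. The default radius, height and threshold are the values quoted for this embodiment; the function names are hypothetical.

```python
import math
from collections import deque

def inside_cylinder(point, center, radius, height):
    """True if the point lies inside an upright cylinder standing on z = center[2]."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    dz = point[2] - center[2]
    return math.hypot(dx, dy) <= radius and 0.0 <= dz <= height

def grow_region(points, seed_index, threshold):
    """Breadth-first region growing: link points no farther apart than `threshold`."""
    visited = {seed_index}
    queue = deque([seed_index])
    while queue:
        current = queue.popleft()
        for k, candidate in enumerate(points):
            if k not in visited and math.dist(points[current], candidate) <= threshold:
                visited.add(k)
                queue.append(k)
    return sorted(visited)

def denoise(points, center, radius=1.2, height=2.5, threshold=0.02):
    """Cylinder crop followed by connected-region extraction around the centroid."""
    foreground = [p for p in points if inside_cylinder(p, center, radius, height)]
    if not foreground:
        return []
    centroid = tuple(sum(axis) / len(foreground) for axis in zip(*foreground))
    seed = min(range(len(foreground)),
               key=lambda k: math.dist(foreground[k], centroid))
    return [foreground[k] for k in grow_region(foreground, seed, threshold)]

# Toy cloud: a dense vertical strip (the body), one isolated noise point inside
# the cylinder, and one background point outside it.
body = [(0.0, 0.0, 1.0 + 0.01 * k) for k in range(6)]
noise = (0.0, 0.0, 1.5)
background = (2.0, 0.0, 1.0)
clean = denoise(body + [noise, background], center=(0.0, 0.0, 0.0))
```

The brute-force neighbor test is quadratic in the number of points; a real-time system would use a voxel grid or a k-d tree, or run the search on the GPU.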
Three-dimensional human body posture representation: a three-dimensional human body posture is defined as a vector $q \in R^{36}$ of joint degrees of freedom, comprising the root node (6 degrees of freedom), the trunk (3 degrees of freedom), the left and right shoulders (2 degrees of freedom each), the left and right upper arms (3 degrees of freedom each), the left and right forearms (1 degree of freedom each), the neck (2 degrees of freedom), the head (1 degree of freedom), the left and right thighs (3 degrees of freedom each), the left and right shanks (1 degree of freedom each), and the left and right feet (2 degrees of freedom each). Heterogeneous three-dimensional human body posture database: automatically extracting sparse three-dimensional marker points from the depth images and automatically building the posture prior online both require a pre-built three-dimensional human body posture database. Motion sequences with a total duration of approximately 2.5 hours are selected from an open-source motion capture database, and the motion types include walking, running, boxing, kicking, jumping, dancing, waving, exercising, golfing, etc. Skeleton normalization is performed using a motion retargeting technique.
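The 36-dimensional posture vector defined above can be checked with a short Python sketch that tallies the degrees of freedom per body part. The ordering of the parts inside the vector and the key names are assumptions made for illustration; the text only fixes the per-part counts.

```python
# (part name, degrees of freedom per instance, number of instances)
DOF_LAYOUT = [
    ("root", 6, 1), ("trunk", 3, 1), ("shoulder", 2, 2), ("upper_arm", 3, 2),
    ("forearm", 1, 2), ("neck", 2, 1), ("head", 1, 1), ("thigh", 3, 2),
    ("shank", 1, 2), ("foot", 2, 2),
]

total_dof = sum(dof * count for _, dof, count in DOF_LAYOUT)

# Map each part to its (start, end) slice inside the posture vector q.
offsets = {}
cursor = 0
for name, dof, count in DOF_LAYOUT:
    sides = ["left", "right"] if count == 2 else [""]
    for side in sides:
        key = f"{side}_{name}" if side else name
        offsets[key] = (cursor, cursor + dof)
        cursor += dof
```

Counting each left/right part twice gives 6 + 3 + 4 + 6 + 2 + 2 + 1 + 6 + 2 + 4 = 36, which matches the stated dimension of q.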
S6, an A-pose image of a real human body is captured with the 4 depth cameras, the human body geometric model for the A-pose image of the real human body is deformed in the posture dimension by a skeleton-driven method, and the automatic estimation of the fine individualized human body geometric model is then formulated as a nonlinear optimization problem and solved by iterative optimization;
The human body geometric model is represented by a long vector $S_i$ of the model mesh vertex set, the human body geometric model database is represented as $S = \{S_i, i = 1, \ldots, N\}$, and a global linear prior model of the human body geometric model is built using principal component analysis, expressed as follows:
where $\beta$ is the low-dimensional parameter vector of the human body geometric model, $P_{\beta,k}$ is the matrix formed by the first k principal component vectors, and the mean term is the mean vector of the human body geometric models in the database;
The coordinates of each human skeleton joint center $J_i$ are expressed as a weighted linear combination of the vertex coordinates of the human body geometric model, with the vertex weights w computed on the geometric template model, expressed as follows:
where $V_i$ is the set of model vertices neighboring embedded skeleton joint i, $w_{i,j}$ is the weight of embedded skeleton joint i with respect to the j-th vertex in its neighbor vertex set, $v_{i,j}$ is the coordinate of the j-th vertex in the neighbor vertex set of embedded skeleton joint i, and $J_i$ is the coordinate of embedded skeleton joint i;
The set of neighboring human body geometric model vertices for each joint center of the embedded human skeleton is obtained by defining a spatial bounding volume (a three-dimensional sphere) around the joint, as shown in FIG. 6, where the model vertices drawn in each color are the neighbors around the corresponding human skeleton joint;
The vertex weights w are estimated from the human body geometric template model and the human skeleton embedded in it, and the estimation is formulated as a constrained linear least-squares problem, expressed as follows:
where the sum runs over the set of template model vertices neighboring embedded skeleton joint i of the template model, $w_{i,j}$ is the weight of embedded skeleton joint i of the template model with respect to the j-th vertex in its neighbor template vertex set, and the remaining two quantities are the coordinate of that j-th template vertex and the coordinate of embedded skeleton joint i of the template model; the template vertex coordinates and template joint coordinates are known, and $w_{i,j}$ is the unknown to be solved; the vertex weights w are solved with a non-negative least-squares algorithm;
According to formula (2), once the vertex coordinates and the vertex weights $w_{i,j}$ of an individualized human body geometric model are given, the joint coordinates of the skeleton embedded in that new human body geometric model can be obtained;
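The use of formula (2) to transfer joint centers to a new body shape can be sketched in Python. The sketch assumes the weights have already been solved on the template; in the toy data they are equal and sum to 1, which is a property of this example rather than something stated in the text. The function name is hypothetical.

```python
def joint_center(neighbor_vertices, weights):
    """Weighted linear combination of the neighbor vertices of one joint."""
    return tuple(
        sum(w * v[axis] for w, v in zip(weights, neighbor_vertices))
        for axis in range(3)
    )

# Template: four vertices forming a ring around a joint at the origin.
template_ring = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
weights = [0.25, 0.25, 0.25, 0.25]

# Individualized model: the same ring, twice as wide and raised by 1 m.
new_ring = [(2.0, 0.0, 1.0), (-2.0, 0.0, 1.0), (0.0, 2.0, 1.0), (0.0, -2.0, 1.0)]

template_joint = joint_center(template_ring, weights)
new_joint = joint_center(new_ring, weights)
```

Because the joint is tied to nearby surface vertices through fixed weights, the embedded skeleton follows the mesh automatically whenever the shape parameters change.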
An individualized human body geometric model is obtained from the three-dimensional depth point cloud P, the human body geometric model database S and the human body posture database Q; its parameters are the human body posture parameter vector and the human body geometric model parameter vector, expressed as:
where $E_1$ and $E_2$ are iterative closest point energy terms, $E_3$ is the human skeleton length symmetry energy term, $E_4$ is the human body geometric model prior energy term, and $E_5$ is the human body posture prior energy term; the variables $\lambda_1, \ldots, \lambda_5$ are the weights of the respective energy terms;
where the "point-to-point" distance is the distance between a human body geometric model vertex $v_i(q, \beta)$ and the captured three-dimensional depth point nearest to it in Euclidean space; the "point-to-point" distance energy term $E_1$ is expressed as:
where the "point-to-plane" distance is the distance from the human body geometric model vertex $v_i(q, \beta)$ to the tangent plane at the point where the vertex normal intersects the captured three-dimensional depth point cloud; the "point-to-plane" distance energy term $E_2$ is expressed as:
To keep the reconstructed human body geometric model plausible, a human skeleton length symmetry energy term is introduced, expressed as:
$E_3 = \sum_{(m,n) \in \{(M,N)\}} \lVert l_m - l_n \rVert^2$ (7);
where $\{(M,N)\}$ is the set of symmetric skeleton segment pairs, and $l_m$ and $l_n$ are the lengths of a symmetric pair of left and right skeleton segments, respectively;
Assuming the human body geometric model database forms a multidimensional Gaussian distribution in the global space, the human body geometric model prior constraint term maximizes the following conditional probability, expressed as:
where $\Lambda_\beta$ is the matrix formed from the first k principal components of the covariance matrix of the human body geometric model database; formula (8) is converted into an energy-minimization form, expressed as follows:
where $\beta$ is the low-dimensional parameter vector of the human body geometric model to be solved, and $P_{\beta,k}$ and the mean term are, respectively, the matrix of the first k principal component vectors and the mean vector computed from the human body geometric model database;
Let the human body posture database be $Q = \{q_i, i = 1, \ldots, h\}$; a global linear prior model of the human body posture is built using principal component analysis, expressed as:
where $\zeta$ is the low-dimensional parameter vector of the human body posture, $P_{q,b}$ is the matrix formed by the first b principal component vectors, and the mean term is the mean vector of the human body postures in the database; the proportion retained by the principal components is set to 95%;
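One common way to turn a 95% principal component ratio into a concrete number of components b is to keep the smallest number of components whose eigenvalues account for at least 95% of the total variance. The Python sketch below assumes this interpretation; the eigenvalues are invented for illustration and the function name is hypothetical.

```python
def components_for_ratio(eigenvalues, ratio=0.95):
    """Smallest number of leading components whose variance share reaches `ratio`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running >= ratio * total:
            return count
    return len(ordered)

# Hypothetical covariance eigenvalues of a posture database.
eigenvalues = [6.0, 2.5, 1.2, 0.2, 0.1]
b = components_for_ratio(eigenvalues)
```

Here the first two components explain 85% of the variance and the first three explain 97%, so three components are kept.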
Assuming the human body posture data form a multidimensional Gaussian distribution in the global space, the human body posture prior constraint term maximizes the following conditional probability, expressed as:
where q is the human body posture vector to be solved, and $\delta_5$ is a constant;
Formula (11) is converted into an energy-minimization form, expressed as:
Combining formula (5), formula (6), formula (7), formula (9) and formula (12), formula (4) is expressed as follows:
where $\lVert \cdot \rVert_p$ is the p-norm; because the captured three-dimensional depth point cloud contains some data noise, human body geometric model vertices can form spatial matching point pairs with noise points of the captured depth point cloud, which degrades the accuracy of the estimated human body geometric model, so in practice p can be set to L2/L1; the variables $\lambda_1, \ldots, \lambda_5$ are the weights of the respective energy terms, which in practice can be set to 1, 3, 1e3, 0.03 and 10, respectively;
S7, the three-dimensional human motion posture is tracked using the fine individualized human body geometric model, the three-dimensional posture database and the multi-view depth cameras;
The optimal three-dimensional human motion posture $q^*$ of the current frame is solved from the three-dimensional depth point cloud P captured in the current frame, the fine individualized human body geometric model $S^*$, the human body posture database Q and the three-dimensional human body postures reconstructed in the previous two frames, expressed as:
where $G_1$ and $G_2$ are iterative closest point energy terms, $G_3$ is the three-dimensional human body posture prior energy term, $G_4$ is the three-dimensional human body posture joint angle range limiting energy term, and $G_5$ is the three-dimensional human body posture smooth change constraint energy term; the variables $\alpha_1, \ldots, \alpha_5$ are the weights of the respective energy terms;
The "point-to-point" distance is the distance between a vertex $v_i(q, S^*)$ of the human body geometric model $S^*$ driven by posture q and the captured three-dimensional depth point nearest to it in Euclidean space; the "point-to-point" distance energy term $G_1$ is expressed as:
The "point-to-plane" distance is the distance from the vertex $v_i(q, S^*)$ of the human body geometric model $S^*$ driven by posture q to the tangent plane at the point where the vertex normal intersects the captured three-dimensional depth point cloud; giving the "point-to-plane" distance energy term a larger weight than the "point-to-point" distance energy term ensures good convergence of the algorithm, so the ratio of the point-to-point weight to the point-to-plane weight is again set to 1:3 in practice; the "point-to-plane" distance energy term $G_2$ is expressed as:
Using the three-dimensional human body postures reconstructed in the previous 2 frames, K posture neighbors are retrieved from the heterogeneous three-dimensional human body posture database, as shown in FIG. 7 (where the dash-dot line marks 2 identical posture neighbors, of which only 1 is kept): the first Z nearest-neighbor postures of the posture reconstructed two frames earlier are retrieved from the database as the set $Q_{-2}$; the first Z nearest-neighbor postures of the posture reconstructed in the previous frame are retrieved as the set $Q_{-1}$; the two neighbor sets are merged and redundant postures removed to obtain the set of K neighbor postures $Q_{-2,-1}$; the three-dimensional human body posture distance metric $d_{query}$ used during retrieval is defined on three-dimensional joint center coordinate sets, expressed as:
$d_{query}(q, q_n) = \lVert J(T(q)) - J(q_n) \rVert^2$ (17);
where $J(\cdot)$ denotes the set of three-dimensional joint center coordinates obtained by forward kinematics for a given three-dimensional human body posture, and T is the spatial transformation matrix that aligns the query posture q to the posture $q_n$ in the database;
With the human body posture neighbor set $Q_{-2,-1}$ known, a global linear prior model of the human body posture is built using principal component analysis, expressed as:
where $\zeta$ is the low-dimensional parameter vector of the human body posture, $P_{q,b}$ is the matrix formed by the first b principal component vectors, and the mean term is the mean vector of the human body postures in the database;
The posture prior term penalizes how poorly the reconstructed three-dimensional human body posture q fits the local three-dimensional posture probability distribution formed by the K posture neighbors $Q_K = \{q_1, \ldots, q_K\}$ retrieved online; assuming the K posture neighbors in the local space follow a multidimensional Gaussian distribution, the human body posture prior term is expressed as:
where q is the human body posture vector to be solved, and $\epsilon$ is a constant;
The probability-maximization problem in formula (19) is converted, as usual, into an energy-minimization problem, expressed as follows:
The human body posture joint angle range limiting energy term prevents the joint angles from leaving their reasonable numerical range while the three-dimensional human body posture is solved by iterative optimization, which would otherwise produce implausible three-dimensional human body posture reconstructions:
where the two indicator terms are binary indicator functions: if the i-th joint angle $q_i$ is below its lower limit, the lower-limit indicator $\chi(i)$ equals 1, otherwise it equals 0; if the i-th joint angle exceeds its upper limit, the upper-limit indicator equals 1, otherwise it equals 0;
The three-dimensional human body posture smooth change constraint energy term penalizes a lack of smoothness in the rate of change between the reconstructed three-dimensional human body posture q of the current frame and the postures reconstructed in the previous 2 frames, expressed as:
where the expression inside the double bars on the right-hand side is the difference between two rates of change: the change from the posture reconstructed in the previous frame to the three-dimensional human body posture q of the current frame, and the change from the posture reconstructed two frames earlier to the posture reconstructed in the previous frame;
Substituting formula (15), formula (16), formula (20), formula (21) and formula (22), the energy function of formula (14) is expressed in the following form:
where $\lVert \cdot \rVert_p$ is the p-norm;
Because the captured three-dimensional depth point cloud contains some data noise, human body geometric model vertices can form spatial matching point pairs with noise points of the captured depth point cloud, which degrades the accuracy of the estimated human body geometric model, so in practice p is set to L2/L1; the variables $\alpha_1, \ldots, \alpha_5$ are the weights of the respective energy terms, which in practice can be set to 1, 3, 0.1, 1e2 and 0.01, respectively.
Compared with non-model-based three-dimensional human body motion capture methods, model-based methods use an individualized three-dimensional human body geometric model, which amounts to adding prior knowledge of the three-dimensional human body shape, and can therefore generally obtain a more accurate three-dimensional human body motion posture sequence. Furthermore, compared with model-based monocular depth camera methods, a model-based multi-view depth camera method uses input data from more viewing angles and can therefore greatly reduce the ambiguity in three-dimensional human body posture reconstruction caused by limb occlusion and self-occlusion. The technical problem mainly solved by the invention is therefore how to reconstruct three-dimensional human body postures online in real time based on multi-view depth cameras.
The invention provides a system that accurately captures three-dimensional human body motion posture sequences online, in real time and at low cost, based on a fine individualized human body geometric model. The key idea is as follows: first, based on a human body geometric model library, a fine individualized human body geometric model and its embedded human skeleton are constructed automatically online from the three-dimensional depth point cloud data captured by the multi-view depth cameras; then, based on the reconstructed human body geometric model, the three-dimensional human body motion posture sequence is captured accurately online, in real time and at low cost, directly from the three-dimensional depth point cloud data captured by the multi-view depth cameras. From this point cloud data the method automatically and accurately reconstructs a fine individualized human body geometric model and, based on that model, accurately captures various types of three-dimensional human body motion posture sequences of different people online in real time, at an average frame rate of about 20 frames/second on the GPU.
Referring to FIG. 8, the GPU-based three-dimensional human body posture reconstruction algorithm flow includes GPU-based Levenberg-Marquardt nonlinear optimization; processing is divided between the GPU and the CPU, and the numbers in the figure give the actual average processing time of each step. The following steps are processed by the GPU: computing the 3D depth point cloud, foreground extraction and denoising, the LBS algorithm, nearest-point search and registration, K-nearest-neighbor search, and the linearizations A = J'×J and g = J'×r. The following steps are processed by the CPU: reading the depth image, updating the 3D human body pose, eliminating mismatched point pairs, and displaying the 3D pose tracking result.
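The linearization step named in the flow (A = J'×J, g = J'×r) is the core of a Levenberg-Marquardt iteration. The Python sketch below shows it on a deliberately tiny, CPU-only toy problem with two parameters; it is not the tracking energy of the method. The residual model, the damping schedule and all names are assumptions made for illustration.

```python
import math

def residuals_and_jacobian(params, xs, ys):
    """Toy residuals r_k = a*sin(b*x_k) - y_k and Jacobian rows [dr/da, dr/db]."""
    a, b = params
    res = [a * math.sin(b * x) - y for x, y in zip(xs, ys)]
    jac = [(math.sin(b * x), a * x * math.cos(b * x)) for x in xs]
    return res, jac

def cost(res):
    """Sum of squared residuals."""
    return sum(r * r for r in res)

def lm_step(jac, res, damping):
    """Build A = J^T J and g = J^T r, then solve (A + damping*I) * delta = -g."""
    a00 = sum(row[0] * row[0] for row in jac) + damping
    a01 = sum(row[0] * row[1] for row in jac)
    a11 = sum(row[1] * row[1] for row in jac) + damping
    g0 = sum(row[0] * r for row, r in zip(jac, res))
    g1 = sum(row[1] * r for row, r in zip(jac, res))
    det = a00 * a11 - a01 * a01
    return ((-g0 * a11 + g1 * a01) / det, (-g1 * a00 + g0 * a01) / det)

def levenberg_marquardt(params, xs, ys, iterations=50, damping=1e-3):
    """Accept a step only if it lowers the cost; otherwise raise the damping."""
    res, jac = residuals_and_jacobian(params, xs, ys)
    for _ in range(iterations):
        d0, d1 = lm_step(jac, res, damping)
        trial = (params[0] + d0, params[1] + d1)
        trial_res, trial_jac = residuals_and_jacobian(trial, xs, ys)
        if cost(trial_res) < cost(res):
            params, res, jac = trial, trial_res, trial_jac
            damping = max(damping * 0.1, 1e-9)
        else:
            damping *= 10.0
    return params

xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [2.0 * math.sin(0.5 * x) for x in xs]   # data generated with a = 2.0, b = 0.5
start = (1.5, 0.6)
fitted = levenberg_marquardt(start, xs, ys)

initial_cost = cost(residuals_and_jacobian(start, xs, ys)[0])
final_cost = cost(residuals_and_jacobian(fitted, xs, ys)[0])
```

In the tracking problem the parameter vector would be the 36-dimensional posture q and the residuals would stack all weighted energy terms, so building A and g involves sums over many vertices, which is the kind of reduction that maps well onto a GPU.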
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that it will be apparent to those skilled in the art that modifications and variations can be made without departing from the technical principles of the present invention, and these modifications and variations should also be regarded as the scope of the invention.

Claims (3)

1. A three-dimensional real-time human body posture reconstruction method, comprising:
S1, 4 depth cameras are used to capture depth images; a pixel of a depth image is denoted x, and the depth value and the three-dimensional point corresponding to that pixel are denoted d(x) and p, respectively;
S2, the 4 depth cameras are connected to a computer through a PCI data interface, and the computer drives the 4 depth cameras synchronously to complete data acquisition;
S3, the 4 depth cameras are placed at the 4 corners of a square human motion capture scene, and every 2 adjacent depth cameras are calibrated using a checkerboard and a stereo vision calibration method;
S4, the original three-dimensional points of the depth images captured by the 4 depth cameras are preprocessed and denoised;
S5, motion sequences are selected from an open-source motion capture database to establish a three-dimensional human body posture database Q, with the skeletons normalized by a motion retargeting technique while the database is built;
S6, an A-pose image of a real human body is captured using the 4 depth cameras, the human body geometric model is deformed in the posture dimension by a skeleton-driven method, and the automatic estimation of the fine individualized human body geometric model is then formulated as a nonlinear optimization problem and solved by iterative optimization;
The human body geometric model is represented by a long vector $S_i$ formed from the set of model mesh vertices, and the human body geometric model database is expressed as $S = \{S_i,\ i = 1, \dots, N\}$; a global linear prior model of the human body geometric model is established using principal component analysis, and the human body geometric model is expressed as:
$$S = P_{\beta,k}\,\beta + \bar{S} \tag{1}$$
Wherein $\beta$ is the low-dimensional parameter vector of the human body geometric model, $P_{\beta,k}$ is the matrix formed by the first k principal component vectors, and $\bar{S}$ is the mean vector of the human body geometric models in the database;
The coordinates of the joint centers $J_i$ of the human skeleton are each expressed as a weighted linear combination of vertex coordinates of the human body geometric model, with the vertex weights w computed on the geometric template model, expressed as follows:
$$J_i = \sum_{j=1}^{|V_i|} w_{i,j}\, v_{i,j} \tag{2}$$
Wherein $V_i$ is the set of neighboring model vertices of embedded skeleton joint i, $w_{i,j}$ is the weight of embedded skeleton joint i relative to the j-th vertex in its neighboring vertex set, $v_{i,j}$ is the coordinate of the j-th vertex in the neighboring vertex set of embedded skeleton joint i, and $J_i$ is the coordinate of embedded skeleton joint i;
The set of neighboring human body geometric model vertices of each joint center of the embedded skeleton is obtained by defining a spatial bounding box;
The vertex weights w are estimated from the human body geometric template model and the skeleton embedded in it; solving for w is formulated as a constrained linear least squares problem, formula (3), in which the weight $w_{i,j}$ of template-model embedded skeleton joint i relative to the j-th vertex in its neighboring template-model vertex set is the variable to be solved, while the coordinate of the j-th vertex in the neighboring template-model vertex set of joint i and the coordinate of template-model embedded skeleton joint i are both known; the vertex weights w are solved using a non-negative least squares algorithm;
According to formula (2), given the vertex coordinates of the individualized human body geometric model and the vertex weights $w_{i,j}$, the joint coordinates of the embedded skeleton of the new human body geometric model can be obtained;
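To make the weight-fitting step concrete, the sketch below fits non-negative weights for a single joint so that the weighted sum of its neighboring template vertices reproduces the template joint position, as in formula (2). Projected gradient descent is swapped in here for the non-negative least squares solver named in the claim, purely to keep the example dependency-free. The function names, the step-size rule and the iteration count are illustrative assumptions.

```python
def joint_from_weights(weights, vertices):
    """Formula (2): joint position as a weighted sum of neighbor vertices."""
    return tuple(sum(w * v[k] for w, v in zip(weights, vertices)) for k in range(3))

def fit_nonnegative_weights(vertices, joint, iters=500):
    """Minimize 0.5*||sum_j w_j v_j - joint||^2 subject to w_j >= 0."""
    # Step 1/||V||_F^2 is no larger than 1/L, L being the top eigenvalue of V^T V.
    step = 1.0 / sum(c * c for v in vertices for c in v)
    w = [0.0] * len(vertices)
    for _ in range(iters):
        pred = joint_from_weights(w, vertices)
        resid = [pred[k] - joint[k] for k in range(3)]
        # Gradient with respect to w_j is the dot product v_j . resid.
        grad = [sum(v[k] * resid[k] for k in range(3)) for v in vertices]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, wj - step * gj) for wj, gj in zip(w, grad)]
    return w
```

Once the weights are fitted on the template, `joint_from_weights` can be reused unchanged on the vertices of a newly estimated individualized model, which is the use the claim makes of formula (2).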
From the three-dimensional depth point cloud P, the human body geometric model database S and the human body posture database Q, an individualized human body geometric model is obtained; its parameters are the human body posture parameter vector $q^*$ and the human body geometric model parameter vector $\beta^*$, expressed as:
$$(q^*, \beta^*) = \arg\min_{q,\,\beta}\ \lambda_1 E_1 + \lambda_2 E_2 + \lambda_3 E_3 + \lambda_4 E_4 + \lambda_5 E_5 \tag{4}$$
Wherein $E_1$ and $E_2$ are the iterative closest point energy terms, $E_3$ is the human skeleton length symmetry energy term, $E_4$ is the human body geometric model prior energy term, and $E_5$ is the human body posture prior energy term; the variables $\lambda_1, \dots, \lambda_5$ are the weights of the energy terms;
Wherein the "point-to-point" distance is the distance between a vertex $v_i(q, \beta)$ of the human body geometric model and the captured three-dimensional depth point closest to it in Euclidean distance; it defines the "point-to-point" distance energy term $E_1$, formula (5);
wherein the "point-to-plane" distance is the distance from the vertex $v_i(q, \beta)$ of the human body geometric model to the tangent plane at the point where the vertex normal intersects the captured three-dimensional depth point cloud; it defines the "point-to-plane" distance energy term $E_2$, formula (6);
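As a rough illustration of the two iterative-closest-point terms, the sketch below evaluates a point-to-point and a point-to-plane energy for pairs that have already been matched. It assumes squared distances and unit normals supplied at the matched depth points; the exact norms used in formulas (5) and (6), and all names here, are assumptions.

```python
def point_to_point_energy(vertices, matches):
    """Sum of squared Euclidean distances between vertices and matched depth points."""
    return sum(
        sum((v[k] - p[k]) ** 2 for k in range(3))
        for v, p in zip(vertices, matches)
    )

def point_to_plane_energy(vertices, matches, normals):
    """Sum of squared distances from each vertex to the tangent plane at its match."""
    total = 0.0
    for v, p, n in zip(vertices, matches, normals):
        # Signed distance along the unit normal n of the plane through p.
        d = sum(n[k] * (v[k] - p[k]) for k in range(3))
        total += d * d
    return total
```

The point-to-plane term ignores displacement within the tangent plane, so a vertex that slides along the surface is not penalized by it; combining it with the point-to-point term, as formula (4) does, constrains both directions.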
To ensure that the reconstructed human body geometric model is plausible, a human skeleton length symmetry energy term is introduced, expressed as:
$$E_3 = \sum_{(m,n) \in \{(M,N)\}} \| l_m - l_n \|^2 \tag{7}$$
Wherein $\{(M, N)\}$ is the set of symmetric skeleton segment pairs, and $l_m$, $l_n$ are the lengths of the left and right skeleton segments of a symmetric pair, respectively;
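The symmetry term can be evaluated directly from joint positions, as in the short sketch below. The joint names, the dictionary layout and the function names are hypothetical and serve only to illustrate the sum over left/right segment pairs.

```python
import math

def bone_length(joints, bone):
    """Euclidean length of a bone given as a (parent, child) pair of joint names."""
    return math.dist(joints[bone[0]], joints[bone[1]])

def symmetry_energy(joints, symmetric_pairs):
    """Squared length difference summed over left/right bone pairs."""
    return sum(
        (bone_length(joints, left) - bone_length(joints, right)) ** 2
        for left, right in symmetric_pairs
    )
```

The energy is zero exactly when every left segment has the same length as its right counterpart, which is the property the term is meant to encourage.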
Assuming that the human body geometric model database forms a multidimensional Gaussian distribution in the global space, the human body geometric model prior constraint term maximizes the conditional probability given by formula (8), in which $\Lambda_\beta$ is the matrix formed from the first k principal components of the covariance matrix of the human body geometric model database; formula (8) is converted into the energy minimization form $E_4$, formula (9), in which $\beta$ is the low-dimensional parameter vector of the human body geometric model to be solved, and $P_{\beta,k}$ and $\bar{S}$ are, respectively, the matrix formed by the first k principal component vectors and the mean vector of the human body geometric models computed above;
Let the human body posture database be $Q = \{q_i,\ i = 1, \dots, H\}$; a global linear prior model of the human body posture is established using principal component analysis, expressed as:
$$q = P_{q,b}\,\zeta + \bar{q} \tag{10}$$
Wherein $\zeta$ is the low-dimensional parameter vector of the human body posture, $P_{q,b}$ is the matrix formed by the first b principal component vectors, and $\bar{q}$ is the mean vector of the human body postures in the database;
Assuming that the human body posture data form a multidimensional Gaussian distribution in the global space, the human body posture prior constraint term maximizes the conditional probability given by formula (11), wherein q is the human body posture vector to be solved and $\delta_5$ is a constant; the energy minimization form of formula (11) is the human body posture prior energy term $E_5$, formula (12);
Combining formula (5), formula (6), formula (7), formula (9) and formula (12), formula (4) is expressed as formula (13), wherein $\|\cdot\|_p$ is the p-norm and the variables $\lambda_1, \dots, \lambda_5$ are the weights of the energy terms; the captured three-dimensional depth point cloud contains a certain amount of data noise, so vertices of the human body geometric model may form spatial matching point pairs with noise points of the captured depth point cloud, which affects the accuracy of the estimated human body geometric model;
S7, tracking three-dimensional human motion postures using the three-dimensional posture database, the fine individualized human body geometric model and the multi-view depth cameras;
The optimal three-dimensional human motion posture $q^*$ of the current frame is solved from the three-dimensional depth point cloud P captured in the current frame, the fine individualized human body geometric model $S^*$, the human body posture database Q, and the three-dimensional human body postures reconstructed in the previous two frames;
Using the three-dimensional human body postures reconstructed in the previous 2 frames, K posture neighbors are retrieved from the heterogeneous three-dimensional human body database: the first Z neighbor postures of the posture reconstructed two frames earlier are retrieved from the database to form the set $Q_{-2}$; the first Z neighbor postures of the posture reconstructed in the previous frame are retrieved from the database to form the set $Q_{-1}$; the two neighbor posture sets are merged and redundant postures are removed to obtain the K-neighbor posture set $Q_{-2,-1}$; the three-dimensional human body posture distance metric $d_{query}$ used in the retrieval is defined on the sets of three-dimensional joint center coordinates.
2. The three-dimensional real-time human body posture reconstruction method of claim 1, further comprising:
Synchronously capturing multiple pairs of infrared images containing the checkerboard with 2 adjacent depth cameras, extracting the checkerboard corner sets from the captured infrared images using a corner detection algorithm, and estimating the intrinsic and extrinsic parameter matrices of the infrared cameras of the 2 depth cameras based on a stereo vision algorithm;
Taking the coordinate system of 1 depth camera as the world coordinate system and its optical center as the world coordinate origin, and aligning the other 3 depth cameras to the world coordinate system using the extrinsic parameter matrices obtained by calibration;
Estimating the floor plane in the scene by three-dimensional plane fitting, and automatically removing background pixels using a three-dimensional cylindrical bounding box whose bottom face is parallel to the estimated floor plane and whose bottom-face center is the center of the capture scene; then, starting from the center of the distribution of the remaining foreground three-dimensional depth points, obtaining the largest connected region in three-dimensional space by setting a suitable distance threshold between three-dimensional depth points, thereby completing the denoising of the three-dimensional depth point cloud.
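The connected-region step can be sketched as a breadth-first search that links points lying closer together than a threshold and keeps the largest resulting cluster. This brute-force O(n²) version is illustrative only: it searches from every seed instead of growing from the foreground's center point as the claim describes, and the threshold value and function name are assumptions.

```python
from collections import deque

def largest_connected_cluster(points, threshold):
    """Return the largest set of points linked by hops shorter than `threshold`."""
    thr2 = threshold * threshold
    unvisited = set(range(len(points)))
    best = []
    while unvisited:
        seed = unvisited.pop()
        cluster, queue = [seed], deque([seed])
        while queue:
            i = queue.popleft()
            # Unvisited points within the distance threshold of point i.
            near = [
                j for j in unvisited
                if sum((points[i][k] - points[j][k]) ** 2 for k in range(3)) < thr2
            ]
            for j in near:
                unvisited.remove(j)
                cluster.append(j)
                queue.append(j)
        if len(cluster) > len(best):
            best = cluster
    # Report the cluster in the original point order.
    return [points[i] for i in sorted(best)]
```

A real-time system would replace the inner linear scan with a spatial index such as a voxel grid, since the depth point clouds involved contain many thousands of points per frame.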
3. A three-dimensional real-time human body posture reconstruction method according to claim 1, characterized in that in said S7,
The optimal three-dimensional human motion posture $q^*$ of the current frame is solved from the three-dimensional depth point cloud P captured in the current frame, the fine individualized human body geometric model $S^*$, the human body posture database Q, and the three-dimensional human body postures reconstructed in the previous two frames, expressed as:
$$q^* = \arg\min_{q}\ \alpha_1 G_1 + \alpha_2 G_2 + \alpha_3 G_3 + \alpha_4 G_4 + \alpha_5 G_5 \tag{14}$$
Wherein $G_1$ and $G_2$ are the iterative closest point energy terms, $G_3$ is the three-dimensional human body posture prior energy term, $G_4$ is the three-dimensional human body posture joint angle range limiting energy term, and $G_5$ is the three-dimensional human body posture smooth-change constraint energy term; the variables $\alpha_1, \dots, \alpha_5$ are the weights of the energy terms;
The "point-to-point" distance is the distance between a vertex $v_i(q, S^*)$ of the human body geometric model $S^*$ driven by posture q and the captured three-dimensional depth point closest to it in Euclidean distance; it defines the "point-to-point" distance energy term $G_1$, formula (15);
The "point-to-plane" distance is the distance from the vertex $v_i(q, S^*)$ of the human body geometric model $S^*$ driven by posture q to the tangent plane at the point where the vertex normal intersects the captured three-dimensional depth point cloud; it defines the "point-to-plane" distance energy term $G_2$, formula (16);
Using the three-dimensional human body postures reconstructed in the previous 2 frames, K posture neighbors are retrieved from the heterogeneous three-dimensional human body database: the first Z neighbor postures of the posture reconstructed two frames earlier are retrieved from the database to form the set $Q_{-2}$; the first Z neighbor postures of the posture reconstructed in the previous frame are retrieved from the database to form the set $Q_{-1}$; the two neighbor posture sets are merged and redundant postures are removed to obtain the K-neighbor posture set $Q_{-2,-1}$; the three-dimensional human body posture distance metric $d_{query}$ used in the retrieval is defined on the sets of three-dimensional joint center coordinates, expressed as:
$$d_{query}(q, q_n) = \| J(T(q)) - J(q_n) \|_2 \tag{17}$$
Wherein $J(\cdot)$ represents the set of three-dimensional joint center coordinates obtained by forward kinematics for a given three-dimensional human body posture, and T is the spatial transformation matrix that aligns the query posture q to the posture $q_n$ in the database;
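The retrieval step can be sketched as two Z-nearest-neighbor queries whose results are merged with duplicates removed. In the sketch below each posture is represented directly by its flattened joint center coordinates, and the alignment transform T of formula (17) is assumed to have been applied already. The brute-force search and all names are illustrative assumptions.

```python
import math

def pose_distance(joints_a, joints_b):
    """Euclidean distance between two flattened joint-coordinate vectors."""
    return math.dist(joints_a, joints_b)

def z_nearest(query, database, z):
    """Indices of the z database postures closest to `query`, nearest first."""
    order = sorted(
        range(len(database)),
        key=lambda i: pose_distance(query, database[i]),
    )
    return order[:z]

def merged_neighbors(prev2, prev1, database, z):
    """Union of the Z neighbors of the two previous postures, without duplicates."""
    merged = []
    for idx in z_nearest(prev2, database, z) + z_nearest(prev1, database, z):
        if idx not in merged:
            merged.append(idx)
    return merged
```

Because the two neighbor sets can overlap, the merged set holds between Z and 2Z postures, which is why the claim distinguishes the per-query count Z from the final count K.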
With the neighbor posture set $Q_{-2,-1}$ known, a global linear prior model of the human body posture is established from it using principal component analysis, expressed as:
$$q = P_{q,b}\,\zeta + \bar{q} \tag{18}$$
Wherein $\zeta$ is the low-dimensional parameter vector of the human body posture, $P_{q,b}$ is the matrix formed by the first b principal component vectors, and $\bar{q}$ is the mean vector of the human body postures in the database;
The human body posture prior term penalizes how poorly the reconstructed three-dimensional human body posture q fits the probability distribution of three-dimensional human body postures in the local space formed by the K posture neighbors $Q_K = \{q_1, \dots, q_K\}$ retrieved online; assuming that the K posture neighbors in the local space satisfy a multidimensional Gaussian distribution, the human body posture prior term is expressed as formula (19), wherein q is the human body posture vector to be solved and $\epsilon$ is a constant; the probability maximization problem in formula (19) is converted into the energy minimization form $G_3$, formula (20);
The human body posture joint angle range limiting energy term $G_4$, formula (21), prevents the joint angle values from leaving their reasonable numerical range while the three-dimensional human body posture is solved by iterative optimization, which would otherwise produce implausible three-dimensional human body posture reconstructions; it uses two binary indicator functions: $\chi(i)$ equals 1 if the i-th joint angle $q_i$ is below its lower limit and 0 otherwise; the second indicator function equals 1 if the i-th joint angle exceeds its upper limit and 0 otherwise;
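One plausible reading of the joint angle range term is a quadratic penalty that the two indicator functions switch on only when an angle leaves its allowed interval. The quadratic form and the names below are assumptions about formula (21), which this text describes only in words.

```python
def joint_limit_energy(angles, lower, upper):
    """Quadratic penalty on joint angles that fall outside [lower_i, upper_i]."""
    total = 0.0
    for q, lo, hi in zip(angles, lower, upper):
        below = 1.0 if q < lo else 0.0   # indicator: angle under its lower limit
        above = 1.0 if q > hi else 0.0   # indicator: angle over its upper limit
        total += below * (lo - q) ** 2 + above * (q - hi) ** 2
    return total
```

Inside the allowed interval both indicators are zero, so the term contributes nothing to the objective and leaves the other energy terms to determine the posture.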
The three-dimensional human body posture smooth-change constraint energy term $G_5$, formula (22), penalizes non-smooth changes in velocity between the reconstructed three-dimensional human body posture q of the current frame and the postures reconstructed in the previous 2 frames; the quantity inside the norm on the right-hand side of formula (22) is the difference between the rate of change from the previous frame's reconstructed posture to the current frame's posture q and the rate of change from the posture reconstructed two frames earlier to the previous frame's reconstructed posture;
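Read literally, the description above compares two successive frame-to-frame changes, which corresponds to a second-difference penalty on the posture vector. The sketch assumes a squared Euclidean norm and a unit time step between frames; both, along with the function name, are assumptions.

```python
def smoothness_energy(q, q_prev1, q_prev2):
    """Squared norm of (q - q_prev1) - (q_prev1 - q_prev2)."""
    return sum(
        ((c - p1) - (p1 - p2)) ** 2
        for c, p1, p2 in zip(q, q_prev1, q_prev2)
    )
```

The penalty vanishes when the posture continues at constant velocity, so steady motion is not damped; only abrupt accelerations between frames are discouraged.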
Substituting formula (15), formula (16), formula (20), formula (21) and formula (22), the energy function of formula (14) is expressed as formula (23), wherein $\|\cdot\|_p$ is the p-norm.
CN202110521606.XA 2021-05-13 Three-dimensional real-time human body posture reconstruction method Active CN113256789B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110521606.XA CN113256789B (en) 2021-05-13 Three-dimensional real-time human body posture reconstruction method

Publications (2)

Publication Number    Publication Date
CN113256789A (en)     2021-08-13
CN113256789B (en)     2024-07-05

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106600626A (en) * 2016-11-01 2017-04-26 中国科学院计算技术研究所 Three-dimensional human body movement capturing method and system
CN108765548A (en) * 2018-04-25 2018-11-06 安徽大学 Three-dimensional scene real-time reconstruction method based on depth camera

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant