CN115810219A - Three-dimensional gesture tracking method based on RGB camera - Google Patents

Three-dimensional gesture tracking method based on RGB camera

Info

Publication number
CN115810219A
CN115810219A (application CN202211650785.8A)
Authority
CN
China
Prior art keywords
hand
dimensional
optimal
model
joint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211650785.8A
Other languages
Chinese (zh)
Inventor
陈建新
欧超前
汪峰平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202211650785.8A
Publication of CN115810219A
Legal status: Pending (current)


Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a three-dimensional gesture tracking method based on an RGB camera, which comprises the following steps: taking an RGB camera as input, normalizing each picture and sending it to a three-dimensional joint point detection module, which extracts image features, predicts preliminary three-dimensional joint points and computes the hand bone lengths; iteratively fitting the hand bone lengths to the MANO hand model with a particle swarm optimization algorithm, finding the optimal hand shape and obtaining new three-dimensional joint points; sending the new three-dimensional joint points to an analytical inverse kinematics module to solve for the posture parameters required by the MANO model; combining the optimal hand shape and the optimal posture parameters θ in the MANO model to obtain the final joint points and mesh vertices; and reconstructing and rendering the obtained mesh vertices and joint points to output the final real-time three-dimensional hand tracking posture. The method offers high prediction accuracy, a high-precision three-dimensional gesture tracking effect, and a tracking process that is easy to implement.

Description

Three-dimensional gesture tracking method based on RGB camera
Technical Field
The invention belongs to the technical field of gesture recognition, and particularly relates to a three-dimensional gesture tracking method based on an RGB (red, green and blue) camera.
Background
The hand is one of the most frequently used parts of the human body in daily life and among the richest in motion variation. It can assume diverse postures and thereby convey rich information. Capturing the motion state of the hand is therefore important for applications in virtual reality, augmented reality, human-computer interaction and related fields, and many researchers have pursued this topic, making notable progress in recent years.
Owing to the wide availability of depth cameras, early researchers estimated hand pose from depth images by fitting a generative model to the depth data. Tompson et al. combined CNNs with random decision forests and inverse kinematics to estimate hand pose in real time from a single depth image. Wan et al. used unlabeled depth images for self-supervised fine-tuning, while Mueller et al. constructed photorealistic datasets to achieve better robustness. Other researchers derived point clouds and 3D voxels from depth images for further study.
Because depth sensors are expensive, power-hungry and demanding in their experimental environment, more and more work has turned to 3D hand pose estimation from monocular RGB images. Zimmermann and Brox trained a CNN-based model that estimates 3D joint coordinates directly from RGB images. Iqbal et al. used a 2.5D heat-map formulation that encodes the 2D joint positions together with depth information, greatly improving accuracy. Many researchers exploited depth-image datasets to expand the diversity seen during training; Mueller et al. proposed a large-scale rendered dataset post-processed by CycleGAN to close the domain gap. However, these works focus only on joint position estimation and do not recover joint rotations. Ge et al. used a GraphCNN to regress the hand mesh directly, but this requires a special dataset with real hand meshes; such model-free approaches cope well with challenging scenes and can leverage existing datasets from different modalities, including image and non-image data. Zhou et al., using image data with 2D or 3D annotations together with 3D animation data that has no corresponding images, proposed a 3D hand joint detection module and an inverse kinematics module that not only regress the 3D joint positions but also address joint rotation, which is promising for computer vision and graphics applications. However, the method of Zhou et al. does not consider the optimal match between the predicted three-dimensional joint points and the MANO hand model, and its inverse kinematics algorithm is somewhat complex and not easy to implement.
Among the above methods, some use depth images and some use RGB color images, but on the one hand the image features of the 2D and 3D information are not fully exploited, and on the other hand, because the network models are not advanced enough or the methods are overly complicated, the prediction accuracy of the 3D hand coordinates is limited and the three-dimensional gesture tracking effect is mediocre. Problems such as hand occlusion and poor real-time performance of gesture tracking also remain.
Disclosure of Invention
The main aim of the invention is to design a three-dimensional gesture tracking method that improves prediction accuracy and achieves a high-precision three-dimensional gesture tracking effect, while keeping the tracking process and results easy to implement.
In order to achieve the above object, the present invention provides a three-dimensional gesture tracking method based on an RGB camera, comprising the following steps:
step one, taking an RGB camera as input, normalizing the picture, sending the processed picture to a three-dimensional joint point detection module, extracting image features, predicting preliminary three-dimensional joint points with a convolutional neural network and computing the hand bone lengths;
step two, iteratively fitting the hand bone lengths to the MANO hand model with a particle swarm optimization algorithm, finding the optimal hand shape and obtaining new three-dimensional joint points;
step three, sending the new three-dimensional joint points to an analytical inverse kinematics module to solve for the posture parameters θ required by the MANO model;
step four, combining the optimal hand shape and the optimal posture parameters θ in the MANO model to obtain the final joint points and mesh vertices; and
step five, reconstructing and rendering the obtained mesh vertices and joint points to output the final real-time three-dimensional hand tracking posture.
The invention is further improved in that the three-dimensional joint point detection module uses a neural network model based on ResNet50, comprising a feature extractor, a 2D detector and a 3D detector; ResNet50 with an attention mechanism serves as the feature extractor, whose input is an image at 128 × 128 resolution and whose output is a feature volume F of size 32 × 32 × 256.
The invention is further improved in that the 2D detector is a two-layer CNN that takes the feature volume F and outputs heat maps H of the 21 joints, used for 2D posture estimation; the 3D detector first estimates a delta map D from the heat map H and the feature volume F using a two-layer CNN, then concatenates these and feeds them into another two-layer CNN to obtain the final location map L, from which the 3D hand joint positions are estimated.
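As a hedged sketch of these two detector heads, a minimal PyTorch module is given below; the two-layer structure, the 21 heat maps and the delta/location maps follow the text, while the channel widths, kernel sizes and the exact set of concatenated inputs are assumptions of this illustration:

```python
import torch
import torch.nn as nn

class DetectorHeads(nn.Module):
    """Two-layer CNN heads on top of the 32 x 32 x 256 feature volume F."""

    def __init__(self, feat_ch: int = 256, n_joints: int = 21):
        super().__init__()

        def two_layer(in_ch: int, out_ch: int) -> nn.Sequential:
            # The "two-layer CNN" building block used by both detectors.
            return nn.Sequential(
                nn.Conv2d(in_ch, 128, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(128, out_ch, 3, padding=1))

        self.heat = two_layer(feat_ch, n_joints)                  # heat maps H
        self.delta = two_layer(feat_ch + n_joints, 3 * n_joints)  # delta map D
        self.loc = two_layer(feat_ch + n_joints + 3 * n_joints,
                             3 * n_joints)                        # location map L

    def forward(self, f: torch.Tensor):
        # f: (B, 256, 32, 32) feature volume from the ResNet50 extractor
        h = self.heat(f)
        d = self.delta(torch.cat([f, h], dim=1))
        l = self.loc(torch.cat([f, h, d], dim=1))
        return h, d, l
```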
The invention is further improved in that the MANO model in step four is a 3D parameterized model that constructs a complete hand chain from 16 joint points and 5 fingertip points taken from the mesh vertices.
A further improvement of the present invention is that in step five, the reconstruction of the hand mesh vertices is performed using Open3D.
A further improvement of the invention resides in the fact that in the MANO model, T̄ is the flat-palm rest template; from the template T̄, the shape function B_s(β) and the posture function B_p(θ), a deformed hand template T is obtained, which is then skinned by combining the posture parameters θ, the skinning weights ω and the joint positions J(θ). The mathematical expressions are:

T(θ, β) = T̄ + B_s(β) + B_p(θ)

M(θ, β) = W(T(θ, β), θ, ω, J(θ)).
The invention is further improved in that the particle swarm optimization algorithm initializes a group of random particles and then finds the optimal solution through multiple iterations; in each iteration the particles update themselves by tracking two extreme values, and after finding these two optima each particle updates its velocity and position through the following formulas:

v_id^(k+1) = ω·v_id^k + c_1·r_1·(p_id^k − x_id^k) + c_2·r_2·(p_gd^k − x_id^k)

x_id^(k+1) = x_id^k + v_id^(k+1)

where i = 1, 2, ..., N, with N the particle swarm size; d is the particle dimension index; k is the iteration number; ω is the inertia weight; c_1 is the individual learning factor and c_2 the group learning factor; r_1, r_2 are random numbers in the interval [0, 1] that increase the randomness of the search; v_id^k is the d-th component of the velocity vector of particle i at iteration k; x_id^k is the d-th component of the position vector of particle i at iteration k; p_id^k is the historical best position of particle i in dimension d after k iterations, i.e. the best solution found by the i-th particle; and p_gd^k is the historical best position of the population in dimension d after k iterations, i.e. the best solution found in the whole swarm.
The invention has the following beneficial effects: the hand shape is recovered by combining the particle swarm optimization algorithm with the MANO model, improving the accuracy of gesture tracking, with good tracking under self-occlusion of the moving hand and occlusion by objects; the posture parameters are solved by analytic inverse kinematics, which optimizes the algorithm structure, keeps the complexity of the whole method under control and gives good real-time performance. High-precision real-time three-dimensional gesture tracking is thereby achieved.
Drawings
Fig. 1 is an overall framework diagram of the three-dimensional gesture tracking method based on the RGB camera according to the present invention.
FIG. 2 is a network model diagram of a three-dimensional joint detection module according to the present invention.
FIG. 3 is a graph showing the effect of the experiment according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
It should be emphasized that, in describing the present invention, formulas and constraints are identified with consistent labels; the use of different labels to identify the same formula and/or constraint is not precluded, the labels serving only to illustrate the features of the present invention more clearly.
The invention provides a gesture tracking method that is based on a convolutional neural network, adds a particle swarm optimization algorithm and incorporates analytic inverse kinematics. The method uses both 2D and 3D image data, so the image features can train the model more effectively. The raw data are preprocessed and fed to a convolutional neural network with an attention mechanism, yielding more accurate 3D hand joint points; a particle swarm optimization algorithm then iteratively fits the MANO model to find the optimal hand shape, and analytic inverse kinematics derives the hand posture parameters required by the MANO model. By combining analytic inverse kinematics, the method better exploits the motion characteristics of the hand, while PSO iteratively fits the bone lengths to the MANO hand model parameter file to find the optimal hand shape. Under controlled algorithmic complexity, a high-precision three-dimensional gesture tracking effect is finally achieved.
The invention discloses a three-dimensional gesture tracking method based on an RGB camera, which mainly comprises the following steps:
step one, acquiring a video stream, processing each picture, sending the processed picture to the three-dimensional joint point detection module, extracting image features, predicting preliminary three-dimensional joint points with a convolutional neural network and computing the hand bone lengths;
step two, iteratively fitting the hand bone lengths to the MANO hand model with a particle swarm optimization algorithm, finding the optimal hand shape and obtaining new three-dimensional joint points;
step three, sending the new three-dimensional joint points to the analytical inverse kinematics module to solve for the posture parameters θ required by the MANO model;
step four, combining the optimal hand shape and the optimal posture parameters θ in the MANO model to obtain the final joint points and mesh vertices; and
step five, reconstructing and rendering the obtained mesh vertices and joint points to output the final real-time three-dimensional hand tracking posture.
The present invention is described in detail below with reference to the attached drawings.
Step one: acquire the video stream, process each picture, send it to the three-dimensional joint point detection module, extract image features, predict preliminary three-dimensional joint points with a convolutional neural network and compute the hand bone lengths.
In step one, an RGB three-channel camera serves as input; each picture is cropped and normalized, the processed picture is sent to the three-dimensional joint point detection module, image features are extracted, and a convolutional neural network predicts the preliminary three-dimensional hand joint points and computes the bone lengths. The hardware is an ordinary RGB three-channel camera whose video stream is processed frame by frame: each frame is first uniformly cropped to 128 × 128, and each of the three channels is standardized using its per-channel mean and standard deviation.
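A minimal sketch of this per-frame preprocessing, assuming OpenCV capture; the 128 × 128 size follows the text, while the per-channel mean/std values are placeholders for statistics computed from the training data:

```python
import cv2
import numpy as np

# Placeholder per-channel statistics; in practice these are the means and
# standard deviations computed over the training images.
CHANNEL_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
CHANNEL_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess_frame(frame_bgr: np.ndarray) -> np.ndarray:
    """Center-crop a camera frame to a square, resize to 128x128 and
    standardize each RGB channel with its mean and standard deviation."""
    h, w = frame_bgr.shape[:2]
    side = min(h, w)
    y0, x0 = (h - side) // 2, (w - side) // 2
    crop = frame_bgr[y0:y0 + side, x0:x0 + side]
    crop = cv2.resize(crop, (128, 128))
    rgb = cv2.cvtColor(crop, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    return (rgb - CHANNEL_MEAN) / CHANNEL_STD

# Frame-by-frame processing of the camera video stream:
# cap = cv2.VideoCapture(0)
# ok, frame = cap.read()
# x = preprocess_frame(frame)   # (128, 128, 3), ready for the network
```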
As shown in fig. 2, the three-dimensional joint point detection module consists of a feature extractor, a 2D detector and a 3D detector. ResNet50 with an attention mechanism serves as the feature extractor; the input is an image at 128 × 128 resolution and the output is a feature volume F of size 32 × 32 × 256. The 2D detector is a two-layer CNN that takes the feature volume F and outputs heat maps H of the 21 joints, used for 2D posture estimation, i.e. the 2D coordinates of the 21 joint points of the hand:

P_2d = [[x_1, y_1], [x_2, y_2], ..., [x_21, y_21]]^T
A heat map is generated for the 2D image coordinates of each joint using a two-dimensional Gaussian distribution:

f(x, y) = exp(−((x − x_i)² + (y − y_i)²) / (2σ²))

where σ determines the radius of the heat-map response, and f(x, y) is the probability value of joint point i at image coordinate [4x, 4y]; the position of maximum response in the i-th heat map corresponds to the 2D image coordinate of the i-th joint.
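A short sketch of this heat-map construction; the quarter-resolution 32 × 32 maps follow the 4x factor in the formula, while σ = 1.5 is an assumed value:

```python
import numpy as np

def joint_heatmap(joint_xy, size=32, sigma=1.5):
    """Render the 2D Gaussian heat map of one joint.

    joint_xy is the joint position in 128x128 image coordinates; the
    heat map lives at quarter resolution, hence the division by 4.
    """
    xs = np.arange(size, dtype=np.float32)
    gx = np.exp(-((xs - joint_xy[0] / 4.0) ** 2) / (2 * sigma ** 2))
    gy = np.exp(-((xs - joint_xy[1] / 4.0) ** 2) / (2 * sigma ** 2))
    return np.outer(gy, gx)  # (32, 32), peaking at the joint location

# Stacking one map per joint gives the (21, 32, 32) target H:
# H = np.stack([joint_heatmap(p) for p in P_2d])
```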
The 3D detector first estimates a delta map D from the heat map H and the feature volume F using a two-layer CNN; these are concatenated and fed into another two-layer CNN to obtain the final location map L, and the 3D hand joint positions are estimated in the form of the location map L. The 3D hand joint positions are the 3D coordinates of the 21 joint points of the hand in a spatial coordinate system relative to the wrist joint:

P_3d = [[x_1, y_1, z_1], [x_2, y_2, z_2], ..., [x_21, y_21, z_21]]^T

The hand skeleton constrains the size of the hand, its range of motion and the distances between joint points. Any bone can be expressed as a vector b_ij between the i-th and j-th joint points of the hand; the length |b_ij| of the bone vector corresponds to the bone length, and its direction b̂_ij = b_ij / |b_ij| indicates the orientation of the bone. The whole hand has 20 bone vectors, so the bone lengths of the whole hand can be represented as a matrix B_L ∈ R^((J−1)×1), with J = 21 joints.
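The 20 bone vectors can be read off the predicted joints once each non-wrist joint is assigned a parent. The sketch below assumes the common 21-keypoint layout (wrist = 0, four joints per finger); the detection module's actual joint ordering may differ:

```python
import numpy as np

# Parent index of joints 1..20 under the usual 21-keypoint hand layout
# (0 = wrist; each finger is a chain of four joints). Assumed ordering.
PARENT = [0, 1, 2, 3,     # thumb
          0, 5, 6, 7,     # index
          0, 9, 10, 11,   # middle
          0, 13, 14, 15,  # ring
          0, 17, 18, 19]  # little

def bone_lengths(p3d: np.ndarray) -> np.ndarray:
    """p3d: (21, 3) joint coordinates relative to the wrist.
    Returns B_L, the (20, 1) matrix of bone lengths |b_ij|."""
    child = np.arange(1, 21)
    b = p3d[child] - p3d[PARENT]  # the 20 bone vectors b_ij
    return np.linalg.norm(b, axis=1, keepdims=True)
```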
Step two: iteratively fit the hand bone lengths to the MANO hand model with a particle swarm optimization algorithm, find the optimal hand shape and obtain new three-dimensional joint points.
The particle swarm optimization algorithm is a population-based stochastic optimization technique. The algorithm first initializes a group of random particles (random solutions) and then finds the optimal solution through multiple iterations. In each iteration, a particle updates itself by tracking two extreme values; after finding these two optima, the particle updates its velocity and position through the following formulas:

v_id^(k+1) = ω·v_id^k + c_1·r_1·(p_id^k − x_id^k) + c_2·r_2·(p_gd^k − x_id^k)

x_id^(k+1) = x_id^k + v_id^(k+1)

where i = 1, 2, ..., N, with N the particle swarm size; d is the particle dimension index; k is the iteration number; ω is the inertia weight; c_1 is the individual learning factor and c_2 the group learning factor; r_1, r_2 are random numbers in the interval [0, 1] that increase the randomness of the search; v_id^k is the d-th component of the velocity vector of particle i at iteration k; x_id^k is the d-th component of the position vector of particle i at iteration k; p_id^k is the historical best position of particle i in dimension d after k iterations, i.e. the best solution found by the i-th particle; and p_gd^k is the historical best position of the population in dimension d after k iterations, i.e. the best solution found in the whole swarm.
The termination condition of the particle swarm optimization algorithm is generally chosen, depending on the specific problem, as a maximum number of iterations or as meeting an accuracy requirement; in the invention, the maximum number of iterations is set to 150 after multiple comparison experiments.
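A compact sketch of this fitting loop; only the 150-iteration cap comes from the text, while the swarm size, search range, cost function and the helper mano_bone_lengths (mapping a shape vector β to the model's 20 bone lengths) are assumptions of this illustration:

```python
import numpy as np

def pso_fit_shape(target_bl, mano_bone_lengths, n_particles=64, dim=10,
                  iters=150, w=0.7, c1=1.5, c2=1.5):
    """Fit the 10-D MANO shape vector beta so that the model's bone
    lengths match the measured ones, using standard PSO updates.

    mano_bone_lengths: assumed callable, beta -> (20,) bone lengths of
    the MANO hand with that shape. target_bl: (20,) measured lengths.
    """
    x = np.random.uniform(-2, 2, (n_particles, dim))  # positions (betas)
    v = np.zeros_like(x)
    cost = lambda b: np.sum((mano_bone_lengths(b) - target_bl) ** 2)
    p_best = x.copy()
    p_cost = np.array([cost(b) for b in x])
    g_best = p_best[p_cost.argmin()].copy()
    for _ in range(iters):
        r1, r2 = np.random.rand(*x.shape), np.random.rand(*x.shape)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        x = x + v
        c = np.array([cost(b) for b in x])
        better = c < p_cost                    # update personal bests
        p_best[better], p_cost[better] = x[better], c[better]
        g_best = p_best[p_cost.argmin()].copy()  # update global best
    return g_best  # the optimal hand shape beta
```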
The MANO model is a mainstream parameterized hand posture model, which forms a complete hand chain from 16 joint points and 5 fingertip points taken from the mesh vertices; combined with posture parameters, the hand shape can be recovered from the MANO model. T̄ is the initial MANO mesh template, representing the rest-pose vertex positions of the standard MANO model surface. From the initial template T̄, the shape function B_s(β) and the posture function B_p(θ), we obtain a deformed hand template T, which is then skinned by combining the posture parameters θ, the skinning weights ω and the joint positions J(θ). The mathematical expressions are:

T(θ, β) = T̄ + B_s(β) + B_p(θ)

M(θ, β) = W(T(θ, β), θ, ω, J(θ))
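Given the optimal shape β and posture parameters θ, the posed mesh and joints follow from the formulas above. A sketch using the open-source manopth PyTorch layer, whose exact call signature is treated as an assumption here; any MANO implementation exposing M(θ, β) would serve equally:

```python
import torch
from manopth.manolayer import ManoLayer  # open-source MANO layer (assumed installed)

# use_pca=False -> theta is 16 joints x 3 axis-angle values (48-D in total)
mano = ManoLayer(mano_root='mano/models', use_pca=False, flat_hand_mean=True)

beta = torch.zeros(1, 10)   # optimal hand shape from the PSO fit
theta = torch.zeros(1, 48)  # posture parameters from the analytic IK module

# Evaluates M(theta, beta) = W(T(theta, beta), theta, omega, J(theta)):
verts, joints = mano(theta, beta)  # (1, 778, 3) mesh vertices, (1, 21, 3) joints
```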
and step three, sending the new three-dimensional joint points to a reverse analysis kinematics module to solve the posture parameter theta required by the MAMO model.
The 3D joint coordinates explain the hand posture to some extent but are not sufficient to represent a three-dimensional hand model, so joint rotations must be derived from the joint coordinates. Analytic inverse kinematics solves for the posture parameters θ by decomposing each rotation into a twist and a swing; these parameters are finally used in the MANO model to recover the hand shape and posture.
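As a hedged illustration of the swing part of this decomposition: the rotation swinging a bone from its rest direction in the template onto its predicted direction has a closed form in the two unit vectors, while the twist about the bone axis needs additional constraints and is omitted here:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def swing_rotation(d_rest: np.ndarray, d_pred: np.ndarray) -> Rotation:
    """Minimal rotation (the swing) taking the rest-pose bone direction
    d_rest onto the predicted direction d_pred; both are unit vectors."""
    axis = np.cross(d_rest, d_pred)
    s = np.linalg.norm(axis)           # sin of the swing angle
    c = float(np.dot(d_rest, d_pred))  # cos of the swing angle
    if s < 1e-8:                       # bones already parallel: no swing
        return Rotation.identity()
    angle = np.arctan2(s, c)
    return Rotation.from_rotvec(axis / s * angle)

# Example: the axis-angle vector for one joint, as MANO expects it
# theta_joint = swing_rotation(b_rest / np.linalg.norm(b_rest),
#                              b_pred / np.linalg.norm(b_pred)).as_rotvec()
```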
Step four: combine the optimal hand shape and the optimal posture parameters θ in the MANO model to obtain the final joint points and mesh vertices.
Following the generation process of the model, once the posture parameters θ and the optimal hand shape are given, the corresponding mesh shape can be generated by the MANO model.
Step five: reconstruct, render and output the obtained mesh vertices and joint points to obtain the final real-time three-dimensional hand tracking posture.
In step five, the reconstruction of the hand mesh vertices is performed using Open3D.
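A minimal Open3D sketch of this reconstruction step; the triangle list `faces` is assumed to come from the MANO model file, and the vertex count follows the standard MANO mesh:

```python
import numpy as np
import open3d as o3d

def render_hand(verts: np.ndarray, faces: np.ndarray) -> None:
    """verts: (778, 3) MANO mesh vertices; faces: (1538, 3) triangle
    indices taken from the MANO model file (assumed available)."""
    mesh = o3d.geometry.TriangleMesh(
        o3d.utility.Vector3dVector(verts),
        o3d.utility.Vector3iVector(faces))
    mesh.compute_vertex_normals()          # needed for shaded rendering
    o3d.visualization.draw_geometries([mesh])
```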
Two experiments were carried out for the present design.
The first experiment tests whether the method improves the detection accuracy of the three-dimensional joint points. The model was trained on a Tesla P100-SXM2 GPU of an Nvidia DGX, with a training set composed of three datasets: CMU HandDB, the Rendered Hand Pose Dataset and the GANerated Hands Dataset. The test set comprised four datasets: the Rendered Hand Pose Dataset, the EgoDexter Dataset, the STB Dataset and the Dexter+Object Dataset.
The evaluation metrics are the percentage of correct keypoints (PCK) and the area under the curve (AUC). Computing PCK requires manually setting a 3D joint error threshold c: when the 3D joint error is smaller than c, the joint point is considered correctly detected, and the proportion of correctly estimated joint points among all joint points is the PCK value. At the same threshold, a higher PCK means better performance. Different thresholds c yield different PCK values; plotting PCK against the threshold c (threshold on the horizontal axis, PCK on the vertical axis) gives a curve, and the area between this curve and the horizontal axis is the AUC value. The higher the AUC, the more accurate the posture estimation.
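Both metrics follow directly from per-joint Euclidean errors; a sketch, with the threshold range in millimeters being an assumed choice:

```python
import numpy as np

def pck_auc(pred, gt, thresholds=np.linspace(0, 50, 100)):
    """pred, gt: (N, 21, 3) predicted / ground-truth 3D joints (mm).
    Returns the PCK curve over the thresholds and its normalized AUC."""
    err = np.linalg.norm(pred - gt, axis=-1).ravel()        # per-joint errors
    pck = np.array([(err < c).mean() for c in thresholds])  # proportion correct
    auc = np.trapz(pck, thresholds) / (thresholds[-1] - thresholds[0])
    return pck, auc
```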
The experimental results are shown in table 1, and it can be seen that the method of the present invention can estimate the coordinates of the joint points of the hand more accurately.
Table 1. Experimental results
[Table 1 is reproduced as an image in the original publication; it reports the PCK/AUC results on the four test datasets.]
The second experiment tests the real-time hand tracking effect of the method and its recovery from hand occlusion. The experimental environment is an Intel(R) Xeon(R) CPU E5-2620; actions such as spreading the palm, holding a pen and holding a cup were performed. The results, shown in fig. 3, demonstrate that the method has good three-dimensional gesture tracking capability, with no obvious hand deformation or self-occlusion artifacts, and that it recovers the hand shape in real time under occlusion by objects.
In conclusion, the three-dimensional gesture tracking performed by the method has good real-time performance, including under self-occlusion of the hand and occlusion by objects.
The invention also provides a three-dimensional gesture tracking device based on the RGB camera, which comprises a picture acquisition and preprocessing module, a three-dimensional joint point detection module, a posture parameter calculation module, a joint point acquisition module and a rendering output module.
The picture acquisition and preprocessing module and the three-dimensional joint point detection module take an RGB camera as input, normalize the picture, send the processed picture to the three-dimensional joint point detection module, extract image features, predict the preliminary three-dimensional hand joint points with a convolutional neural network and compute the bone lengths;
the posture parameter calculation module iteratively fits the hand bone lengths to the MANO model with a particle swarm optimization algorithm, finds the optimal hand shape and obtains a new set of three-dimensional joint points, then sends the new three-dimensional joint points to the analytical inverse kinematics module to solve for the posture parameters θ required by the MANO model;
the joint point acquisition module combines the optimal hand shape and the posture parameters θ in the MANO model to obtain the final joint points and mesh vertices; and
the rendering output module reconstructs and renders the obtained mesh vertices and joint points to output the final real-time three-dimensional hand tracking posture.
In summary, the method derives the hand bone lengths from the 3D hand coordinates predicted by a convolutional neural network model, uses a particle swarm optimization algorithm to iteratively fit (up to 150 iterations) the bone lengths to the MANO hand model parameter file to match the optimal hand shape, then derives the posture parameters from the joint positions with analytic inverse kinematics, and finally combines them with the MANO model to obtain the final hand tracking posture. The algorithm has good real-time performance in real usage scenarios and good robustness to self-occlusion of the hand and occlusion by objects.
Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the present invention.

Claims (7)

1. A three-dimensional gesture tracking method based on an RGB camera is characterized by comprising the following steps:
step one, taking an RGB camera as input, normalizing the picture, sending the processed picture to a three-dimensional joint point detection module, extracting image features, predicting preliminary three-dimensional joint points with a convolutional neural network and computing the hand bone lengths;
step two, iteratively fitting the hand bone lengths to the MANO hand model with a particle swarm optimization algorithm, finding the optimal hand shape and obtaining new three-dimensional joint points;
step three, sending the new three-dimensional joint points to an analytical inverse kinematics module to solve for the posture parameters θ required by the MANO model;
step four, combining the optimal hand shape and the optimal posture parameters θ in the MANO model to obtain the final joint points and mesh vertices; and
step five, reconstructing and rendering the obtained mesh vertices and joint points to output the final real-time three-dimensional hand tracking posture.
2. The RGB camera-based three-dimensional gesture tracking method of claim 1, wherein: the three-dimensional joint point detection module uses a neural network model based on ResNet50, comprising a feature extractor, a 2D detector and a 3D detector; ResNet50 with an attention mechanism serves as the feature extractor, whose input is an image at 128 × 128 resolution and whose output is a feature volume F of size 32 × 32 × 256.
3. The RGB camera-based three-dimensional gesture tracking method of claim 2, wherein: the 2D detector is a two-layer CNN that takes the feature volume F and outputs heat maps H of the 21 joint points, used for 2D posture estimation; the 3D detector first estimates a delta map D from the heat map H and the feature volume F using a two-layer CNN, then concatenates these and feeds them into another two-layer CNN to obtain the final location map L, from which the 3D hand joint positions are estimated.
4. The RGB camera-based three-dimensional gesture tracking method of claim 3, wherein: the MANO model in step four is a 3D parameterized model that constructs a complete hand chain from 16 joint points and 5 fingertip points taken from vertices.
5. The RGB camera-based three-dimensional gesture tracking method of claim 3, wherein: in step five, the vertices of the hand mesh are reconstructed using Open3D.
6. The RGB camera-based three-dimensional gesture tracking method of claim 3, wherein: in the MANO model, T̄ is the flat-palm rest template; from the template T̄, the shape function B_s(β) and the posture function B_p(θ), a deformed hand template T is obtained, which is then skinned by combining the posture parameters θ, the skinning weights ω and the joint positions J(θ). The mathematical expressions are:

T(θ, β) = T̄ + B_s(β) + B_p(θ)

M(θ, β) = W(T(θ, β), θ, ω, J(θ)).
7. The RGB camera-based three-dimensional gesture tracking method of claim 6, wherein: the particle swarm optimization algorithm initializes a group of random particles and then finds the optimal solution through multiple iterations; in each iteration the particles update themselves by tracking two extreme values, and after finding these two optima each particle updates its velocity and position through the following formulas:

v_id^(k+1) = ω·v_id^k + c_1·r_1·(p_id^k − x_id^k) + c_2·r_2·(p_gd^k − x_id^k)

x_id^(k+1) = x_id^k + v_id^(k+1)

where i = 1, 2, ..., N, with N the particle swarm size; d is the particle dimension index; k is the iteration number; ω is the inertia weight; c_1 is the individual learning factor and c_2 the group learning factor; r_1, r_2 are random numbers in the interval [0, 1] that increase the randomness of the search; v_id^k is the d-th component of the velocity vector of particle i at iteration k; x_id^k is the d-th component of the position vector of particle i at iteration k; p_id^k is the historical best position of particle i in dimension d after k iterations, i.e. the best solution found by the i-th particle; and p_gd^k is the historical best position of the population in dimension d after k iterations, i.e. the best solution found in the whole swarm.
CN202211650785.8A 2022-12-21 2022-12-21 Three-dimensional gesture tracking method based on RGB camera Pending CN115810219A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211650785.8A CN115810219A (en) 2022-12-21 2022-12-21 Three-dimensional gesture tracking method based on RGB camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211650785.8A CN115810219A (en) 2022-12-21 2022-12-21 Three-dimensional gesture tracking method based on RGB camera

Publications (1)

Publication Number Publication Date
CN115810219A 2023-03-17

Family

ID=85486419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211650785.8A Pending CN115810219A (en) 2022-12-21 2022-12-21 Three-dimensional gesture tracking method based on RGB camera

Country Status (1)

Country Link
CN (1) CN115810219A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117894072A (en) * 2024-01-17 2024-04-16 北京邮电大学 Diffusion model-based hand detection and three-dimensional posture estimation method and system
CN117894072B (en) * 2024-01-17 2024-09-24 北京邮电大学 Diffusion model-based hand detection and three-dimensional posture estimation method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination