CN118172412A - Method and device for carrying out 3D human body posture positioning and restoring by using 2D image - Google Patents
Method and device for carrying out 3D human body posture positioning and restoring by using 2D image
- Publication number
- CN118172412A
- Authority
- CN
- China
- Prior art keywords
- human body
- sample
- generated
- loss function
- body posture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
The invention discloses a method and a device for 3D human body posture positioning and restoring using a 2D image, relating to the technical field of 3D pose estimation. The method comprises: S1, a normalizing-flow process for generating samples; S2, dimension lifting and projection loss calculation; S3, an occlusion processing network; and S4, an adjustment part. Samples are generated with a normalizing flow, which increases the sample scale so that as many postures and scenes as possible are covered and improves the generalization ability of the deep learning method during training. Missing depth information is recovered through dimension lifting and projection loss calculation. The occluded posture is used as the input of an occlusion processing network, which uses a multi-layer fully connected neural network to predict the occluded parts and outputs a 3D posture free of occlusion. A length loss function and an action loss function are then used for correction, improving the stability and accuracy of the 2D-to-3D posture restoration algorithm.
Description
Technical Field
The invention relates to the technical field of 3D pose estimation, and in particular to a method and a device for 3D human body posture positioning and restoring using a 2D image.
Background
3D pose estimation, also called 3D human pose estimation (3D HPE), aims to predict the position of each key joint of the human body in three-dimensional space. The technique is widely applied, for example in human-computer interaction, human motion analysis, and rehabilitation. It can also provide skeletal structure information for other computer vision tasks.
The human body is mainly represented in one of two ways: as a skeleton, made up of a series of key points and the lines connecting them, which shows the posture; or as a parameterized mesh model of the human body, which shows both the posture and the body shape.
However, estimating a three-dimensional pose from a two-dimensional image is an ambiguous problem: several different three-dimensional poses may have identical two-dimensional projections. In addition, monocular image methods face further practical challenges, such as self-occlusion, occlusion by objects, and the difficulty of obtaining depth information.
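The projection ambiguity described above can be illustrated with a minimal sketch. It assumes an orthographic camera that simply drops the depth coordinate; the function name `project_orthographic` and the joint coordinates are hypothetical and chosen only for illustration.

```python
def project_orthographic(point3d):
    """Project (x, y, z) to (x, y) by discarding the depth z (an assumed camera model)."""
    x, y, _depth = point3d
    return (x, y)

# Two hypothetical wrist positions that differ only in depth.
wrist_near = (0.30, 1.10, 0.20)
wrist_far = (0.30, 1.10, 0.85)

# Both 3D positions land on the same 2D point, so the image alone cannot tell them apart.
same_projection = project_orthographic(wrist_near) == project_orthographic(wrist_far)
print(same_projection)
```

Any method that lifts 2D joints to 3D therefore needs extra constraints, such as the projection, length, and action losses described later, to pick one of the many consistent 3D poses.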
At present, identifying the key joint positions of the human body is a fairly mature technology. However, restoring the three-dimensional positions of the joints from a limited number of photos when multiple people and occlusion are present, and thereby inferring posture and action intent, remains a problem that requires further study.
The prior art also faces several challenges in restoring 2D poses to 3D:
1. Absence of depth information: 2D images lack depth information. Solving this usually requires additional sensors, such as a depth camera or a laser scanner, but these devices tend to be costly and inconvenient to use. Moreover, even with depth information, extracting the exact posture from it is not easy, because the depth structure of the human body is itself complex;
2. Diversity and complexity of postures: human postures vary widely and are highly complex. This is mainly because the skeletal and muscular systems of the body are complex and because environmental factors also influence posture;
3. Limitations of the datasets: existing datasets are small in scale and lack diversity, which strongly limits the generalization ability and universality of the model;
4. Stability and accuracy of the algorithm: the accuracy and stability of existing 2D-to-3D posture restoration algorithms still need to be improved; performance often degrades on complex postures and under occlusion.
Therefore, the invention provides a method and a device for 3D human body posture positioning and restoring using a 2D image.
Disclosure of Invention
The invention aims to provide a method and a device for 3D human body posture positioning and restoring using a 2D image, so as to solve the problems described in the Background.
In order to achieve the above purpose, the present invention provides the following technical solutions:
In a first aspect, a method for performing 3D human body pose localization and restoration using a 2D image includes:
S1, normalizing-flow process for generating samples: real samples are used for training, a normalizing flow is adopted as the generative model, independent generative models are trained for the legs, the trunk, and the arms on the left and right sides, and the negative log-likelihood of the generated samples and the real samples is minimized;
S2, dimension lifting and projection loss calculation: the 2D coordinates are lifted to 3D through a projection conversion matrix, the 3D object is rotated, and the 3D object is re-projected to 2D through the conversion matrix;
S3, occlusion processing network: the occluded 3D posture is obtained from 2D through dimension lifting and is used as the input of an occlusion processing network, which outputs the predicted 3D posture without occlusion; the prediction part uses a multi-layer fully connected neural network;
S4, adjustment part: a length loss function for the generated limbs is introduced to correct the limb lengths predicted by the network, so that the generated limb lengths do not deviate from normal values; and, for two actions separated by a fixed time interval, an action loss function is defined during training to correct the generated actions so that the change in action is closer to the real value.
Further, in step S1, the normalizing flow is a technique used in a GAN for constructing more complex generative models. It converts an input random noise vector into data samples with the desired distribution through a series of invertible, easy-to-compute transformations; these transformations are usually parameterized and can be learned by an optimization algorithm. The GAN comprises a generator and a discriminator, each of which is typically a deep neural network;
The main task of the generator is to generate data from random noise, and the generated data should be as close to real data as possible. In use, the generator receives a random noise vector as input and produces new data samples through a series of transformations;
The task of the discriminator is to distinguish whether the input data comes from the real data set or was produced by the generator. In use, the discriminator receives a data sample as input and, through a series of transformations, outputs a probability value indicating the likelihood that the sample is real data.
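The two roles can be sketched as follows. This is a toy illustration only: both networks are reduced to single affine maps, and the names `generator` and `discriminator` and all weights are assumptions, not the networks used by the invention.

```python
import math
import random

def generator(noise, weight=0.5, bias=1.0):
    """Toy generator: maps a random noise vector to a data sample (assumed affine map)."""
    return [weight * z + bias for z in noise]

def discriminator(sample, weight=1.0, bias=-1.0):
    """Toy discriminator: returns a probability in (0, 1) that the sample is real."""
    score = sum(weight * x + bias for x in sample) / len(sample)
    return 1.0 / (1.0 + math.exp(-score))  # logistic squashing of the score

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]  # random noise vector
fake = generator(noise)                          # generated sample
p_real = discriminator(fake)                     # probability that the sample is real
```

In an actual GAN both functions would be deep networks trained against each other; the sketch only shows the input and output of each role.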
Further, in step S1, the normalizing flow is defined by a formula [formula image not reproduced] whose terms are: the generated coordinates; the normalizing flow with its parameters; the estimated true position; a constant adjustment coefficient; and Gaussian noise with mean 0 and variance 1.
Further, in step S1, the likelihood estimate is defined by a formula [formula image not reproduced] whose terms are: the likelihood; the generated coordinates; the real (ground-truth, GT) coordinates; the number of samples N; and the probability density function estimated by the normalizing flow, which is the distribution to be learned and must be trained on samples. One normalizing flow corresponds to one body part; there are generally 5 parts, corresponding to the trunk, the hands, and the feet. θ denotes the parameters and has no physical meaning.
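Because the formula images are not available, the following is only a generic sketch of how a negative log-likelihood is computed under a normalizing flow, not the formula of the invention. It assumes the simplest possible flow, a one-dimensional affine map of standard Gaussian noise; the names `shift` and `log_scale` are hypothetical parameters standing in for θ.

```python
import math

def affine_flow_forward(z, shift, log_scale):
    """Map base noise z to a coordinate x = shift + exp(log_scale) * z."""
    return shift + math.exp(log_scale) * z

def affine_flow_log_prob(x, shift, log_scale):
    """Log-density of x via the change-of-variables formula."""
    z = (x - shift) * math.exp(-log_scale)              # inverse transformation
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))  # standard normal log-density
    return log_base - log_scale                          # minus log|dx/dz|

def negative_log_likelihood(samples, shift, log_scale):
    """Average negative log-likelihood over N samples."""
    total = sum(affine_flow_log_prob(x, shift, log_scale) for x in samples)
    return -total / len(samples)

# Hypothetical ground-truth coordinates of one joint, clustered around 1.0.
real_coords = [0.9, 1.0, 1.1, 1.0]
nll_good = negative_log_likelihood(real_coords, shift=1.0, log_scale=math.log(0.1))
nll_bad = negative_log_likelihood(real_coords, shift=0.0, log_scale=math.log(0.1))
```

Parameters that place the flow's density on the real samples give a lower loss (`nll_good < nll_bad`), which is what minimizing the negative log-likelihood in S1 rewards. Following the text, one such flow would be trained per body part.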
Further, in step S2, the same 3D action or posture produces several different 2D projections, so a correct 3D reconstruction, when rotated and re-projected, should yield the different 2D images that correspond to the same action. A loss function can therefore be defined and minimized. The 2D loss function [formula image not reproduced] has the following terms: the original 2D coordinates; the calculated rotated 3D coordinates; P, the 3D-to-2D conversion matrix; R, the rotation matrix for the azimuth angle, with the azimuth angle in the range [-π, π]; and a symbol denoting the matrix inverse.
Further, in step S2, a loss function is also calculated for the 3D coordinates to keep the conversion consistent. The 3D loss function [formula image not reproduced] has the following terms: the reconstructed 3D coordinates; the rotated coordinates; P, the 3D-to-2D conversion matrix; R, the rotation matrix for the azimuth angle, with the azimuth angle in the range [-π, π]; and a symbol denoting the matrix inverse.
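One common way to realize such rotation consistency is a lift, rotate, project, re-lift, rotate-back cycle. The sketch below follows that pattern under stated assumptions: an orthographic projection for P, a rotation about the vertical axis for R, squared-error losses, and a deliberately poor stand-in lifter (`flat_lifter`) that always guesses zero depth. None of these choices is confirmed by the source.

```python
import math

def rotate_y(point, azimuth):
    """Rotate a 3D point about the vertical (y) axis by the azimuth angle."""
    x, y, z = point
    c, s = math.cos(azimuth), math.sin(azimuth)
    return (c * x + s * z, y, -s * x + c * z)

def project(point):
    """Assumed orthographic 3D-to-2D conversion: keep x and y."""
    return (point[0], point[1])

def flat_lifter(point2d):
    """Hypothetical stand-in for the lifting network: always guesses depth 0."""
    return (point2d[0], point2d[1], 0.0)

def cycle_losses(pose2d, azimuth, lifter=flat_lifter):
    """Return (2D loss, 3D loss) averaged over the joints of one pose."""
    loss2d = loss3d = 0.0
    for joint2d in pose2d:
        lifted = lifter(joint2d)                 # 2D -> 3D
        rotated = rotate_y(lifted, azimuth)      # apply R
        relifted = lifter(project(rotated))      # re-project, then lift again
        restored = rotate_y(relifted, -azimuth)  # apply the inverse rotation
        back2d = project(restored)
        loss2d += sum((a - b) ** 2 for a, b in zip(joint2d, back2d))
        loss3d += sum((a - b) ** 2 for a, b in zip(lifted, restored))
    return loss2d / len(pose2d), loss3d / len(pose2d)

pose2d = [(0.0, 1.7), (0.2, 1.4), (-0.2, 1.4)]  # hypothetical head and shoulders
zero_turn = cycle_losses(pose2d, 0.0)
quarter_turn = cycle_losses(pose2d, math.pi / 4)
```

With no rotation both losses vanish, while a nonzero azimuth exposes the flat lifter's wrong depths as positive 2D and 3D losses. A lifter that recovers depth correctly would keep both losses small for any azimuth in [-π, π].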
Further, in step S3, the network is a fully connected network with 4 to 5 layers, the activation function is ReLU, and the loss function is defined as follows [formula image not reproduced]: coordinates with subscript m are the values predicted by the occlusion network, and coordinates with subscript o are the GT values, which include both real samples and generated samples.
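A forward pass through such a network can be sketched as follows. The layer widths, the five-joint pose, the zeroing of occluded joints, and the mean-squared-error loss are all assumptions made for illustration; the source states only the depth (4 to 5 layers), the ReLU activation, and that predictions (subscript m) are compared with GT (subscript o).

```python
import random

def relu(values):
    """ReLU activation applied element-wise."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer: weights has one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def init_layer(rng, n_in, n_out):
    """Random weights and zero biases for a layer (untrained, for illustration)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def occlusion_net(pose, layers):
    """ReLU on every hidden layer, linear output layer."""
    hidden = pose
    for weights, biases in layers[:-1]:
        hidden = relu(dense(hidden, weights, biases))
    weights, biases = layers[-1]
    return dense(hidden, weights, biases)

def mse(pred, target):
    """Assumed loss: mean squared error between prediction and ground truth."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

rng = random.Random(0)
n_coords = 3 * 5                           # five joints with (x, y, z), hypothetical
sizes = [n_coords, 32, 32, 32, n_coords]   # four fully connected layers
layers = [init_layer(rng, a, b) for a, b in zip(sizes, sizes[1:])]

ground_truth = [0.1 * i for i in range(n_coords)]
occluded_pose = ground_truth[:-3] + [0.0, 0.0, 0.0]  # last joint occluded (zeroed)
predicted = occlusion_net(occluded_pose, layers)
loss = mse(predicted, ground_truth)
```

Training would adjust the weights to reduce `loss` over both real and generated samples; the sketch stops at the forward pass and loss evaluation.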
Further, in step S4, the length loss function is defined as follows [formula image not reproduced], where b denotes the length of a limb segment (spine, arm, leg, etc.), K is the total number of limb segments in the sample, and the remaining two terms are the model's predicted value and the GT value;
The action loss function is defined as follows [formula image not reproduced], where a and b denote two actions at adjacent time intervals, and the two coordinate terms are, respectively, the coordinates predicted by the model from real GT samples and the coordinates predicted from generated samples.
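The two corrections can be sketched as below. Since the formulas are not preserved, the sketch assumes a mean absolute difference of limb lengths over the K limbs and a mean squared difference between the joint displacements of the real-sample and generated-sample predictions; the function names and the three-joint skeleton are hypothetical.

```python
import math

def bone_lengths(pose, bones):
    """Length of each limb, given as a pair of joint indices."""
    return [math.dist(pose[i], pose[j]) for i, j in bones]

def length_loss(pred_pose, gt_pose, bones):
    """Assumed form: mean absolute difference of limb lengths over K limbs."""
    pred = bone_lengths(pred_pose, bones)
    gt = bone_lengths(gt_pose, bones)
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(bones)

def displacement(pose_a, pose_b):
    """Per-joint change between two actions a and b at adjacent time steps."""
    return [tuple(cb - ca for ca, cb in zip(ja, jb)) for ja, jb in zip(pose_a, pose_b)]

def action_loss(real_a, real_b, gen_a, gen_b):
    """Assumed form: mean squared difference between real and generated displacements."""
    real_move = displacement(real_a, real_b)
    gen_move = displacement(gen_a, gen_b)
    return sum(math.dist(r, g) ** 2 for r, g in zip(real_move, gen_move)) / len(real_move)

def shifted(pose, dx):
    """Move every joint by dx along the x axis."""
    return [(x + dx, y, z) for x, y, z in pose]

bones = [(0, 1), (1, 2)]  # hypothetical two-segment chain
gt_pose = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)]
pred_pose = [(0.0, 0.0, 0.0), (0.0, 1.5, 0.0), (0.0, 2.0, 0.0)]

limb_error = length_loss(pred_pose, gt_pose, bones)
same_motion = action_loss(gt_pose, shifted(gt_pose, 0.1), gt_pose, shifted(gt_pose, 0.1))
different_motion = action_loss(gt_pose, shifted(gt_pose, 0.1), gt_pose, shifted(gt_pose, 0.3))
```

Here the misplaced middle joint makes one limb 0.5 too long and the other 0.5 too short, so `limb_error` is 0.5. The action loss is zero when the generated motion matches the real one and grows as the motions diverge.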
In a second aspect, an apparatus for 3D human body posture positioning and restoring using a 2D image comprises a memory, a processor, and computer program instructions stored in the memory and executable on the processor, wherein the processor executes the computer program instructions to implement the method for 3D human body posture positioning and restoring using a 2D image described above.
In a third aspect, a computer readable storage medium has stored therein computer executable instructions for implementing a method for 3D human body pose location restoration using 2D images as described above when executed by a processor.
The invention provides a method and a device for carrying out 3D human body posture positioning and restoring by utilizing a 2D image, which have the following beneficial effects:
According to the invention, samples are generated with a normalizing flow, which increases the sample scale so that as many postures and scenes as possible are covered and improves the generalization ability of the deep learning method during training. Missing depth information is recovered through dimension lifting and projection loss calculation. The occluded posture is used as the input of an occlusion processing network, which uses a multi-layer fully connected neural network to predict the occluded parts and outputs a 3D posture free of occlusion. A length loss function and an action loss function are used for correction, so that the generated limb lengths do not deviate from normal values and the change in action is closer to the true value, improving the stability and accuracy of the 2D-to-3D posture restoration algorithm.
Drawings
Fig. 1 is a standardized flow operation logic diagram of a method for performing 3D human body posture positioning and restoring by using a 2D image.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples are illustrative of the invention but are not intended to limit the scope of the invention.
As shown in fig. 1, a method for performing 3D human body posture positioning and restoring by using a 2D image includes:
S1, normalizing-flow process for generating samples: real samples are used for training, a normalizing flow is adopted as the generative model, independent generative models are trained for the legs, the trunk, and the arms on the left and right sides, and the negative log-likelihood of the generated samples and the real samples is minimized;
In step S1, the normalizing flow is a technique used in a GAN for constructing more complex generative models. It converts an input random noise vector into data samples with the desired distribution through a series of invertible, easy-to-compute transformations; these transformations are usually parameterized and can be learned by an optimization algorithm. The GAN comprises a generator and a discriminator, each of which is typically a deep neural network;
the main task of the generator is to generate data from random noise, and the generated data should be as close to real data as possible. In use, the generator receives a random noise vector as input and produces new data samples through a series of transformations;
the task of the discriminator is to distinguish whether the input data comes from the real data set or was produced by the generator. In use, the discriminator receives a data sample as input and, through a series of transformations, outputs a probability value indicating the likelihood that the sample is real data;
in step S1, the normalizing flow is defined by a formula [formula image not reproduced] whose terms are: the generated coordinates; the normalizing flow with its parameters; the estimated true position; a constant adjustment coefficient; and Gaussian noise with mean 0 and variance 1;
In step S1, the likelihood estimate is defined by a formula [formula image not reproduced] whose terms are: the likelihood; the generated coordinates; the real (ground-truth, GT) coordinates; the number of samples N; and the probability density function estimated by the normalizing flow, which is the distribution to be learned and must be trained on samples. One normalizing flow corresponds to one body part; there are generally 5 parts, corresponding to the trunk, the hands, and the feet. θ denotes the parameters and has no physical meaning;
S2, dimension lifting and projection loss calculation: the 2D coordinates are lifted to 3D through a projection conversion matrix, the 3D object is rotated, and the 3D object is re-projected to 2D through the conversion matrix;
In step S2, the same 3D action or posture produces several different 2D projections, so a correct 3D reconstruction, when rotated and re-projected, should yield the different 2D images that correspond to the same action. A loss function can therefore be defined and minimized. The 2D loss function [formula image not reproduced] has the following terms: the original 2D coordinates; the calculated rotated 3D coordinates; P, the 3D-to-2D conversion matrix; R, the rotation matrix for the azimuth angle, with the azimuth angle in the range [-π, π]; and a symbol denoting the matrix inverse;
In step S2, a loss function is also calculated for the 3D coordinates to keep the conversion consistent. The 3D loss function [formula image not reproduced] has the following terms: the reconstructed 3D coordinates; the rotated coordinates; P, the 3D-to-2D conversion matrix; R, the rotation matrix for the azimuth angle, with the azimuth angle in the range [-π, π]; and a symbol denoting the matrix inverse;
S3, occlusion processing network: the occluded 3D posture is obtained from 2D through dimension lifting and is used as the input of an occlusion processing network, which outputs the predicted 3D posture without occlusion; the prediction part uses a multi-layer fully connected neural network;
In step S3, the network is a fully connected network with 4 to 5 layers, the activation function is ReLU, and the loss function is defined as follows [formula image not reproduced]: coordinates with subscript m are the values predicted by the occlusion network, and coordinates with subscript o are the GT values, which include both real samples and generated samples;
S4, adjustment part: a length loss function for the generated limbs is introduced to correct the limb lengths predicted by the network, so that the generated limb lengths do not deviate from normal values; and, for two actions separated by a fixed time interval, an action loss function is defined during training to correct the generated actions so that the change in action is closer to the real value;
in step S4, the length loss function is defined as follows [formula image not reproduced], where b denotes the length of a limb segment (spine, arm, leg, etc.), K is the total number of limb segments in the sample, and the remaining two terms are the model's predicted value and the GT value;
The action loss function is defined as follows [formula image not reproduced], where a and b denote two actions at adjacent time intervals, and the two coordinate terms are, respectively, the coordinates predicted by the model from real GT samples and the coordinates predicted from generated samples.
An apparatus for 3D human body posture positioning and restoring using a 2D image comprises a memory, a processor, and computer program instructions stored in the memory and executable on the processor, wherein the processor executes the computer program instructions to implement the method for 3D human body posture positioning and restoring using a 2D image described above.
A computer readable storage medium stores computer executable instructions which, when executed by a processor, implement the method for 3D human body posture positioning and restoring using a 2D image described above.
In summary, as shown in fig. 1, the working principle of the method and the device for performing 3D human body posture positioning and restoring by using the 2D image is as follows:
S1, normalizing-flow process for generating samples: real samples are used for training and a normalizing flow is adopted as the generative model. The normalizing flow is defined by a formula [formula image not reproduced] whose terms are: the generated coordinates; the normalizing flow with its parameters; the estimated true position; a constant adjustment coefficient; and Gaussian noise with mean 0 and variance 1. Independent generative models are trained for the legs, the trunk, and the arms on the left and right sides, and the negative log-likelihood of the generated samples and the real samples is minimized. The likelihood estimate is defined by a formula [formula image not reproduced] whose terms are: the likelihood; the generated coordinates; the real (ground-truth, GT) coordinates; the number of samples N; and the probability density function estimated by the normalizing flow, which is the distribution to be learned and must be trained on samples. One normalizing flow corresponds to one body part; there are generally 5 parts, corresponding to the trunk, the hands, and the feet. The normalizing flow is a technique used in a GAN for constructing more complex generative models: it converts an input random noise vector into data samples with the desired distribution through a series of invertible, easy-to-compute transformations, which are usually parameterized and can be learned by an optimization algorithm. The GAN comprises a generator and a discriminator, each of which is typically a deep neural network;
the main task of the generator is to generate data from random noise, and the generated data should be as close to real data as possible. In use, the generator receives a random noise vector as input and produces new data samples through a series of transformations;
the task of the discriminator is to distinguish whether the input data comes from the real data set or was produced by the generator. In use, the discriminator receives a data sample as input and, through a series of transformations, outputs a probability value indicating the likelihood that the sample is real data;
S2, dimension lifting and projection loss calculation: the 2D coordinates are lifted to 3D through a projection conversion matrix, the 3D object is rotated, and the 3D object is re-projected to 2D through the conversion matrix. The same 3D action or posture produces several different 2D projections, so a correct 3D reconstruction, when rotated and re-projected, yields the different 2D images that correspond to the same action; a loss function is defined and minimized accordingly. The 2D loss function [formula image not reproduced] has the following terms: the original 2D coordinates; the calculated rotated 3D coordinates; P, the 3D-to-2D conversion matrix; R, the rotation matrix for the azimuth angle, with the azimuth angle in the range [-π, π]; and a symbol denoting the matrix inverse. A loss function is also calculated for the 3D coordinates to keep the conversion consistent. The 3D loss function [formula image not reproduced] has the following terms: the reconstructed 3D coordinates; the rotated coordinates; P, the 3D-to-2D conversion matrix; R, the rotation matrix for the azimuth angle, with the azimuth angle in the range [-π, π]; and a symbol denoting the matrix inverse;
S3, occlusion processing network: the occluded 3D posture is obtained from 2D through dimension lifting and is used as the input of an occlusion processing network, which outputs the predicted 3D posture without occlusion; the prediction part uses a multi-layer fully connected neural network. The network is a fully connected network with 4 to 5 layers, the activation function is ReLU, and the loss function is as follows [formula image not reproduced]: coordinates with subscript m are the values predicted by the occlusion network, and coordinates with subscript o are the GT values, which include both real samples and generated samples;
S4, adjustment part: a length loss function for the generated limbs is introduced to correct the limb lengths predicted by the network, so that the generated limb lengths do not deviate from normal values. The length loss function is defined as follows [formula image not reproduced], where b denotes the length of a limb segment (spine, arm, leg, etc.), K is the total number of limb segments in the sample, and the remaining two terms are the model's predicted value and the GT value. For two actions separated by a fixed time interval, an action loss function is defined during training to correct the generated actions so that the change in action is closer to the real value. The action loss function is defined as follows [formula image not reproduced], where a and b denote two actions at adjacent time intervals, and the two coordinate terms are, respectively, the coordinates predicted by the model from real GT samples and the coordinates predicted from generated samples.
The embodiments of the invention have been presented for purposes of illustration and description, and are not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Claims (10)
1. A method for performing 3D human body posture positioning and restoring by using a 2D image, comprising the steps of:
S1, normalizing-flow process for generating samples: real samples are used for training, a normalizing flow is adopted as the generative model, independent generative models are trained for the legs, the trunk, and the arms on the left and right sides, and the negative log-likelihood of the generated samples and the real samples is minimized;
S2, dimension lifting and projection loss calculation: the 2D coordinates are lifted to 3D through a projection conversion matrix, the 3D object is rotated, and the 3D object is re-projected to 2D through the conversion matrix;
S3, occlusion processing network: the occluded 3D posture is obtained from 2D through dimension lifting and is used as the input of an occlusion processing network, which outputs the predicted 3D posture without occlusion; the prediction part uses a multi-layer fully connected neural network;
S4, adjustment part: a length loss function for the generated limbs is introduced to correct the limb lengths predicted by the network, so that the generated limb lengths do not deviate from normal values; and, for two actions separated by a fixed time interval, an action loss function is defined during training to correct the generated actions so that the change in action is closer to the real value.
2. The method for performing 3D human body posture positioning and restoring by using a 2D image according to claim 1, characterized in that in step S1, the normalizing flow is a technique used in a GAN for constructing more complex generative models; it converts an input random noise vector into data samples with the desired distribution through a series of invertible, easy-to-compute transformations, which are usually parameterized and can be learned by an optimization algorithm; the GAN comprises a generator and a discriminator, each of which is typically a deep neural network;
The main task of the generator is to generate data from random noise, and the generated data should be as close to real data as possible; in use, the generator receives a random noise vector as input and produces new data samples through a series of transformations;
The task of the discriminator is to distinguish whether the input data comes from the real data set or was produced by the generator; in use, the discriminator receives a data sample as input and, through a series of transformations, outputs a probability value indicating the likelihood that the sample is real data.
3. The method for performing 3D human body posture positioning and restoring by using a 2D image according to claim 1, wherein in step S1, the normalizing flow is defined by a formula [formula image not reproduced] whose terms are: the generated coordinates; the normalizing flow with its parameters; the estimated true position; a constant adjustment coefficient; and Gaussian noise with mean 0 and variance 1.
4. The method for performing 3D human body posture positioning and restoring by using a 2D image according to claim 1, wherein in step S1, the likelihood estimate is defined by a formula [formula image not reproduced] whose terms are: the likelihood; the generated coordinates; the real (ground-truth, GT) coordinates; the number of samples; and the probability density function estimated by the normalizing flow, which is the distribution to be learned and must be trained on samples, wherein one normalizing flow corresponds to one body part, there are generally 5 parts, and the parts correspond to the trunk, the hands, and the feet.
5. The method for performing 3D human body posture positioning and restoring by using a 2D image according to claim 1, wherein in step S2, the same 3D action or posture produces several different 2D projections, so a correct 3D reconstruction, when rotated and re-projected, should yield the different 2D images that correspond to the same action; a loss function can therefore be defined and minimized, and the 2D loss function [formula image not reproduced] has the following terms: the original 2D coordinates; the calculated rotated 3D coordinates; P, the 3D-to-2D conversion matrix; R, the rotation matrix for the azimuth angle, with the azimuth angle in the range [-π, π]; and a symbol denoting the matrix inverse.
6. The method for performing 3D human body posture positioning and restoring by using a 2D image according to claim 5, wherein in step S2, a loss function is also calculated for the 3D coordinates to keep the conversion consistent, and the 3D loss function [formula image not reproduced] has the following terms: the reconstructed 3D coordinates; the rotated coordinates; P, the 3D-to-2D conversion matrix; R, the rotation matrix for the azimuth angle, with the azimuth angle in the range [-π, π]; and a symbol denoting the matrix inverse.
7. The method for performing 3D human body posture positioning and restoring by using a 2D image according to claim 1, wherein in step S3, the network is a fully connected network with 4 to 5 layers, the activation function is ReLU, and the loss function is defined as follows [formula image not reproduced]: coordinates with subscript m are the values predicted by the occlusion network, and coordinates with subscript o are the GT values, which include both real samples and generated samples.
8. The method for performing 3D human body posture positioning and restoring by using 2D images according to claim 1, wherein in the step S4, a length loss function is defined in terms of the following quantities: b, the length of a limb, including the spine, arm, leg, etc.; K, the total number of sample limbs; the model predicted value; and the GT value;
The action loss function is defined in terms of the following quantities:
a and b, which represent 2 actions at adjacent time intervals; the coordinates predicted by the model from the actual GT samples; and the coordinates predicted from the generated samples.
9. A device for performing 3D human body posture positioning and restoring by using a 2D image, comprising: a memory, a processor, and computer program instructions stored on the memory and executable on the processor, the processor executing the computer program instructions to implement the method for performing 3D human body posture positioning and restoring by using 2D images according to any one of claims 1-8.
10. A computer readable storage medium, wherein computer executable instructions are stored in the computer readable storage medium, and the computer executable instructions, when executed by a processor, implement the method for performing 3D human body posture positioning and restoring by using 2D images according to any one of claims 1-8.
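As an illustration of the density estimation in claim 4, the sketch below evaluates a log-likelihood with the generic change-of-variables rule used by normalizing flows, keeping one flow per body part. It is not the claimed formula. The one-dimensional affine flow, the standard normal base distribution, the part names in `PARTS`, and the function names are all assumptions made for illustration.

```python
import math

# Hypothetical part names: the claim only lists the trunk, the hands and the feet.
PARTS = ["trunk", "left_hand", "right_hand", "left_foot", "right_foot"]


def affine_flow_log_prob(x, shift, log_scale):
    """Log-density of a scalar x under z = (x - shift) * exp(-log_scale), z ~ N(0, 1)."""
    z = (x - shift) * math.exp(-log_scale)
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    # Change of variables: log|dz/dx| = -log_scale.
    return log_base - log_scale


def mean_nll(samples, shift, log_scale):
    """Average negative log-likelihood over the N samples."""
    total = sum(affine_flow_log_prob(x, shift, log_scale) for x in samples)
    return -total / len(samples)


# One set of trainable flow parameters per body part (untrained identity flows here).
flows = {part: {"shift": 0.0, "log_scale": 0.0} for part in PARTS}
```

Training would adjust `shift` and `log_scale` for each part to lower `mean_nll` on that part's samples; a practical flow would be multi-dimensional and far more expressive than this affine example.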
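The rotation consistency described in claims 5 and 6 can be sketched as follows. This is one plausible reading, not the claimed formulas: the 2D loss compares the original 2D coordinates with the projection of the rotated 3D coordinates after the rotation is undone, and the 3D loss compares the reconstructed 3D coordinates with the un-rotated ones. The orthographic projection used for P, the choice of the y axis as the vertical axis, the mean squared distance, and all function names are assumptions.

```python
import math


def azimuth_rotation(theta):
    """Rotation matrix R for an azimuth theta (radians, in [-pi, pi]) about the y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def transpose(m):
    """For a rotation matrix, the transpose equals the inverse."""
    return [list(row) for row in zip(*m)]


def rotate(m, pose3d):
    """Apply a 3x3 matrix to every joint of a 3D pose."""
    return [tuple(sum(m[i][k] * joint[k] for k in range(3)) for i in range(3))
            for joint in pose3d]


def project(pose3d):
    """Assumed orthographic P: keep x and y, drop the depth coordinate."""
    return [(x, y) for x, y, _ in pose3d]


def mean_sq_dist(a, b):
    """Mean squared Euclidean distance between corresponding joints."""
    return sum(sum((p - q) ** 2 for p, q in zip(u, v)) for u, v in zip(a, b)) / len(a)


def loss_2d(original_2d, rotated_3d, theta):
    """Undo the rotation, project to 2D, and compare with the original 2D pose."""
    r_inv = transpose(azimuth_rotation(theta))
    return mean_sq_dist(original_2d, project(rotate(r_inv, rotated_3d)))


def loss_3d(reconstructed_3d, rotated_3d, theta):
    """Undo the rotation and compare with the reconstructed 3D pose."""
    r_inv = transpose(azimuth_rotation(theta))
    return mean_sq_dist(reconstructed_3d, rotate(r_inv, rotated_3d))
```

With a consistent reconstruction both losses vanish; undoing the rotation with the wrong azimuth leaves a positive residual, which is the signal a training loop would minimize.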
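The network structure named in claim 7 (a fully connected network with 4 to 5 layers and ReLU activations) can be sketched in plain Python as below. The layer widths, the joint count, the weight initialization, and the use of a mean squared error between the predicted coordinates (subscript m) and the GT coordinates (subscript o) are assumptions for illustration; the claim does not fix them here.

```python
import random


def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def init_mlp(sizes, seed=0):
    """Create one (weights, bias) pair per fully connected layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = (2.0 / n_in) ** 0.5  # He-style scale, an assumed choice
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Run the network; ReLU follows every layer except the last."""
    for index, (weights, bias) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
        if index < len(layers) - 1:
            x = relu(x)
    return x


def occlusion_loss(predicted_m, target_o):
    """Assumed mean squared error between predicted and GT coordinates."""
    return sum((m - o) ** 2 for m, o in zip(predicted_m, target_o)) / len(target_o)


NUM_JOINTS = 17  # hypothetical joint count
# Four fully connected layers mapping flattened 2D joint coordinates to the same shape.
network = init_mlp([NUM_JOINTS * 2, 64, 64, 64, NUM_JOINTS * 2])
```

Calling `forward(network, coords)` on a flat list of `NUM_JOINTS * 2` values returns a prediction of the same length, which `occlusion_loss` then scores against GT coordinates taken from either real or generated samples.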
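The two losses of claim 8 can be sketched as follows, under stated assumptions. The length loss is taken here as the mean absolute difference between predicted and GT limb lengths over the K limbs, and the action loss as the mean squared difference between the frame-to-frame joint displacement predicted from real samples and the one predicted from generated samples. Both forms, the limb index pairs, and the function names are assumptions rather than the claimed formulas. `math.dist` requires Python 3.8 or later.

```python
import math


def limb_lengths(pose3d, limbs):
    """Length of each limb, given as a pair of joint indices."""
    return [math.dist(pose3d[i], pose3d[j]) for i, j in limbs]


def length_loss(predicted_pose, gt_pose, limbs):
    """Mean absolute difference between predicted and GT limb lengths (K = len(limbs))."""
    predicted = limb_lengths(predicted_pose, limbs)
    target = limb_lengths(gt_pose, limbs)
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(limbs)


def displacement(pose_a, pose_b):
    """Per-joint motion between two poses at adjacent time steps a and b."""
    return [tuple(q - p for p, q in zip(joint_a, joint_b))
            for joint_a, joint_b in zip(pose_a, pose_b)]


def action_loss(real_a, real_b, generated_a, generated_b):
    """Mean squared difference between real and generated per-joint motion."""
    motion_real = displacement(real_a, real_b)
    motion_generated = displacement(generated_a, generated_b)
    return sum(math.dist(u, v) ** 2
               for u, v in zip(motion_real, motion_generated)) / len(motion_real)
```

For example, with `limbs = [(0, 1), (1, 2)]`, a prediction whose first limb is 0.5 too long and whose second limb is exact gives a length loss of 0.25.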
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410596275.XA CN118172412A (en) | 2024-05-14 | 2024-05-14 | Method and device for carrying out 3D human body posture positioning and restoring by using 2D image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410596275.XA CN118172412A (en) | 2024-05-14 | 2024-05-14 | Method and device for carrying out 3D human body posture positioning and restoring by using 2D image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118172412A (en) | 2024-06-11 |
Family ID=91347208
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410596275.XA Pending CN118172412A (en) | 2024-05-14 | 2024-05-14 | Method and device for carrying out 3D human body posture positioning and restoring by using 2D image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118172412A (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108038465A (en) * | 2017-12-25 | 2018-05-15 | 深圳市唯特视科技有限公司 | A kind of three-dimensional more personage's Attitude estimations based on generated data collection |
CN108133202A (en) * | 2018-01-17 | 2018-06-08 | 深圳市唯特视科技有限公司 | It is a kind of that hand gestures method of estimation is blocked based on layering mixture density network certainly |
US20210248772A1 (en) * | 2020-02-11 | 2021-08-12 | Nvidia Corporation | 3d human body pose estimation using a model trained from unlabeled multi-view data |
DE102021102748A1 (en) * | 2020-02-11 | 2021-08-12 | Nvidia Corporation | 3D HUMAN BODY POST ESTIMATE USING A MODEL TRAINED FROM UNLABELED MULTI-VIEW DATA |
CN114066932A (en) * | 2021-09-26 | 2022-02-18 | 浙江工业大学 | Real-time deep learning-based multi-person human body three-dimensional posture estimation and tracking method |
CN114611600A (en) * | 2022-03-09 | 2022-06-10 | 安徽大学 | Self-supervision technology-based three-dimensional attitude estimation method for skiers |
CN115953839A (en) * | 2022-12-26 | 2023-04-11 | 广州紫为云科技有限公司 | Real-time 2D gesture estimation method based on loop architecture and coordinate system regression |
KR20230081378A (en) * | 2021-11-30 | 2023-06-07 | 광운대학교 산학협력단 | Multi-view semi-supervised learning for 3D human pose estimation |
CN117234331A (en) * | 2023-08-30 | 2023-12-15 | 南京博雅涵天科技有限公司 | Time sequence-based hand gesture interaction system and interaction method |
CN117671800A (en) * | 2023-12-27 | 2024-03-08 | 广州市浩洋电子股份有限公司 | Human body posture estimation method and device for shielding and electronic equipment |
Non-Patent Citations (1)
Title |
---|
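Zhou Yiqiao; Xu Yulin: "Real-time human posture recognition in complex environments based on bidirectional LSTM", Chinese Journal of Scientific Instrument (仪器仪表学报), no. 03, 15 March 2020 (2020-03-15) * |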
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111598998B (en) | Three-dimensional virtual model reconstruction method, three-dimensional virtual model reconstruction device, computer equipment and storage medium | |
CN110827342A (en) | Three-dimensional human body model reconstruction method, storage device and control device | |
JP5525407B2 (en) | Behavior model learning device, three-dimensional posture estimation device, behavior model learning method, three-dimensional posture estimation method, and program | |
CN112861598B (en) | System and method for human body model estimation | |
CN111645065A (en) | Mechanical arm motion planning method based on deep reinforcement learning | |
JP2008511932A (en) | System and method for registration and modeling of deformable shapes by direct factorization | |
JP2023524252A (en) | Generative nonlinear human shape model | |
JP2014085933A (en) | Three-dimensional posture estimation apparatus, three-dimensional posture estimation method, and program | |
CN112419419A (en) | System and method for human body pose and shape estimation | |
CN113298047A (en) | 3D form and posture estimation method and device based on space-time correlation image | |
KR102461111B1 (en) | Texture mesh reconstruction system based on single image and method thereof | |
CN117218246A (en) | Training method and device for image generation model, electronic equipment and storage medium | |
CN116740290B (en) | Three-dimensional interaction double-hand reconstruction method and system based on deformable attention | |
Madadi et al. | Deep unsupervised 3D human body reconstruction from a sparse set of landmarks | |
CN113989283A (en) | 3D human body posture estimation method and device, electronic equipment and storage medium | |
AU2020436768B2 (en) | Joint rotation inferences based on inverse kinematics | |
CN112184912A (en) | Multi-metric three-dimensional face reconstruction method based on parameterized model and position map | |
CN118172412A (en) | Method and device for carrying out 3D human body posture positioning and restoring by using 2D image | |
CN116758212A (en) | 3D reconstruction method, device, equipment and medium based on self-adaptive denoising algorithm | |
CN116248920A (en) | Virtual character live broadcast processing method, device and system | |
CN115049764A (en) | Training method, device, equipment and medium for SMPL parameter prediction model | |
KR20200057572A (en) | Hand recognition augmented reality-intraction apparatus and method | |
CA3177593A1 (en) | Transformer-based shape models | |
Alcoverro et al. | Skeleton and shape adjustment and tracking in multicamera environments | |
CN115471863A (en) | Three-dimensional posture acquisition method, model training method and related equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination |