CN112926475A - Human body three-dimensional key point extraction method - Google Patents

Human body three-dimensional key point extraction method

Info

Publication number: CN112926475A
Application number: CN202110251506.XA
Authority: CN (China)
Prior art keywords: dimensional, human body, key point, dimensional key, key points
Legal status: Granted; Expired - Fee Related
Other languages: Chinese (zh)
Other versions: CN112926475B
Inventors: 刘晞, 刘勇国, 李巧勤, 杨尚明, 朱嘉静
Current and original assignee: University of Electronic Science and Technology of China
Application filed by University of Electronic Science and Technology of China
Priority and filing date: 2021-03-08

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a human body three-dimensional key point extraction method, applied to the field of human body three-dimensional key point detection and aimed at the poor estimation precision of the prior art. First, human body action data are collected from two viewing angles; a two-branch multi-stage structure then detects two-dimensional key point confidence maps of the human body from the data of the two viewing angles; a three-dimensional key point generation model is further established; finally, the two-dimensional key point confidence maps detected from the human body action behavior data under test are input into the three-dimensional key point generation model to obtain the three-dimensional key point coordinates. The method effectively improves the estimation precision of human body three-dimensional key points.

Description

Human body three-dimensional key point extraction method
Technical Field
The invention belongs to the field of image processing, and particularly relates to a three-dimensional key point detection technology.
Background
Human body motion capture technology is now widely used in health monitoring, film and television production and similar applications: rendering a virtual character according to the reproduced real motion of a human body makes its movement more realistic, and human body key point detection is the basis for such motion reproduction. Key point detection can be divided into two-dimensional and three-dimensional detection according to whether the result contains depth information. Two-dimensional key point detection has been studied extensively, but occlusion and changes in lighting easily cause false or missed detections, which degrades detection precision.
Current three-dimensional key point detection methods fall mainly into two categories. The first detects three-dimensional key points directly from images: the Chinese invention patent "A joint target classification and three-dimensional attitude estimation method based on a residual network" (CN108280481A) extracts and classifies key point features with the residual network ResNet-50 to realize three-dimensional key point detection. The second first obtains two-dimensional key point coordinates from images and then generates three-dimensional coordinates from them: the Chinese patent "A method for estimating the three-dimensional posture of a human body based on structural information" (CN110427877A) first feeds monocular RGB images into a two-dimensional posture detector to obtain two-dimensional key point coordinates, then constructs a graph convolutional network based on the structural information of the two-dimensional key points and outputs the three-dimensional key point coordinates.
The existing three-dimensional key point detection methods have the following defects:
1) Methods that regress three-dimensional key point coordinates directly from images usually depend on additional parameters, such as the camera projection matrix, which are typically not annotated in video data.
2) Direct three-dimensional key point labeling is difficult; existing training data come almost entirely from motion capture systems, whose scenes and subjects are limited, so the generalization ability of the trained models is restricted.
3) Three-dimensional key point detection on video is generally performed frame by frame, processing each frame as a static image and ignoring the temporal information between consecutive frames and the motion change between neighbouring frames.
Disclosure of Invention
To solve the above technical problems, the invention provides a human body three-dimensional key point extraction method: features are first extracted from the original images of two viewing angles, two-dimensional key point coordinates are generated preliminarily through two-dimensional confidence map detection, a three-dimensional key point generation model is established, and the three-dimensional key point coordinates are predicted cooperatively from the two viewing angles, improving the detection precision of human body three-dimensional key points.
The technical scheme adopted by the invention is as follows: a human body three-dimensional key point extraction method comprises the following steps:
S1, collecting human body action data from two viewing angles;
S2, detecting two-dimensional key point confidence maps of the human body on the data of the two viewing angles with a two-branch multi-stage structure;
S3, establishing a three-dimensional key point generation model;
S4, processing the human body action behavior data to be detected, acquired as in step S1, with step S2 to obtain the corresponding two-dimensional key point confidence maps, and inputting them into the three-dimensional key point generation model established in step S3 to obtain the three-dimensional key point coordinates.
Step S1 specifically includes: two cameras are adopted and marked as a camera A and a camera B, human action behavior data acquisition is carried out simultaneously, and synchronous frame sampling is carried out on acquired video data.
The two-branch multi-stage structure of step S2 is specifically as follows: the upper branch learns the key point positions in camera A and the lower branch learns the key point positions in camera B; both branches comprise several stages, where stage 1 uses three layers of 3 × 3 convolution and two layers of 1 × 1 convolution, and every remaining stage uses five layers of 7 × 7 convolution and two layers of 1 × 1 convolution.
The method also comprises extracting original image features with one layer of three-dimensional CNN; the input of the first stage is the original image features extracted by this three-dimensional CNN layer, and the input of each subsequent stage is those features together with the confidence map prediction result of the previous stage.
This layer of three-dimensional CNN extracts image features of the current frame and of the frames immediately before and after it.
The three-dimensional CNN convolution kernel size of this layer is 3 × 3 × 3.
The three-dimensional key point generation model of step S3 specifically comprises three convolutional layers and one fully connected layer, with a sigmoid function as the output unit and the ReLU function as the activation function of the convolutional layers.
The method further comprises weakly supervised training of the three-dimensional key point generation model: the optimization objective is to minimize the loss function; back-propagation weight training is performed by gradient descent, and the parameters of the three-dimensional key point generation model are updated iteratively.
The loss function expression is:

Loss = L_D + L_TD + f

where L_D denotes the distance error loss function, L_TD denotes the inter-frame error loss function, and f denotes the two-dimensional confidence loss function of the whole two-branch multi-stage structure.
The invention has the following beneficial effects. Video data of two viewing angles are acquired with ordinary cameras; features are extracted by a CNN, and a two-branch multi-stage structure detects the two-dimensional key point confidence maps of the human body from the data of the two viewing angles; a three-dimensional CNN model generates three-dimensional key point coordinates from the two-dimensional key point confidence maps detected on the human body action behavior data under test, and the model is weakly supervised through the combination of the two viewing angles. The method of the invention has the following advantages:
1) The three-dimensional convolutional neural network extracts the spatial features and trajectory information of the key points together with the temporal features of the whole activity, and uses inter-frame correlation to reduce the error of the two-dimensional key point confidence maps.
2) The method generates three-dimensional coordinates from the detected two-dimensional confidence maps, reducing the error introduced by converting confidence maps into two-dimensional key point coordinates. By combining the two views' consistency in time and human posture to build the loss function used for training, it overcomes the lack of three-dimensional key point annotations, repairs missed key point detections in a single view, and markedly improves the estimation precision of human body three-dimensional key points.
Drawings
FIG. 1 is a flow chart of three-dimensional keypoint coordinate estimation;
FIG. 2 is a two-branch multi-stage structure provided by the present invention;
FIG. 3 is a schematic diagram of three-dimensional coordinate generation.
Detailed Description
In order to facilitate the understanding of the technical contents of the present invention by those skilled in the art, the present invention will be further explained with reference to the accompanying drawings.
As shown in FIG. 1, the method of the present invention mainly includes the steps of video data acquisition, two-dimensional key point detection, and three-dimensional key point coordinate generation; the specific steps are as follows:
1. data acquisition
The invention uses two ordinary cameras, with the viewing directions of camera A and camera B at an angle of 90 degrees, to collect human body action data simultaneously. Synchronized frame sampling is performed on the collected video data at a sampling frequency of 30 Hz, i.e. 30 frames per second, each frame having a size of w × h pixels.
The azimuth angles of the camera a and the camera B are not limited to 90 degrees, and may be other angles.
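By way of a non-limiting illustration only, the synchronized dual-view acquisition described above can be sketched in Python with OpenCV; the device indices, the grab/retrieve pairing and the error handling are assumptions made for the example, not part of the invention:

    # Hypothetical sketch of synchronized dual-view capture; camera indices
    # 0 and 1 and the 30 Hz setting are assumptions for illustration.
    import cv2

    def capture_synchronized(num_frames, fps=30):
        """Collect roughly synchronized frame pairs from cameras A and B."""
        cam_a, cam_b = cv2.VideoCapture(0), cv2.VideoCapture(1)
        cam_a.set(cv2.CAP_PROP_FPS, fps)
        cam_b.set(cv2.CAP_PROP_FPS, fps)
        pairs = []
        for _ in range(num_frames):
            # grab() both devices first so the two exposures are close in time,
            # then decode; this approximates synchronous frame sampling.
            if not (cam_a.grab() and cam_b.grab()):
                break
            ok_a, frame_a = cam_a.retrieve()
            ok_b, frame_b = cam_b.retrieve()
            if ok_a and ok_b:
                pairs.append((frame_a, frame_b))  # each frame: h x w x 3 pixels
        cam_a.release()
        cam_b.release()
        return pairs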
2. Video-based three-dimensional keypoint detection
The invention performs three-dimensional key point detection by combining a three-dimensional CNN (Convolutional Neural Network) with a two-branch CPM (Convolutional Pose Machine) network. The network is designed as a multi-stage dual-view branch structure, as shown in FIG. 2: each branch refines its confidence maps over multiple stages, and every stage is supervised during training, which avoids the difficulty of optimizing an overly deep network. Stage 1 (block1 in FIG. 2) uses three layers of 3 × 3 convolution and two layers of 1 × 1 convolution, while each remaining stage (block2 in FIG. 2) uses five layers of 7 × 7 convolution and two layers of 1 × 1 convolution. The upper branch learns the key point positions in view A, expressed as confidence maps, i.e. the probability that a key point lies at a given image location; the lower branch learns the key point positions in view B. Finally, the two branches cooperate to predict the three-dimensional key points. The letter C inside the boxes in FIG. 2 denotes convolution.
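For concreteness, a minimal PyTorch sketch of one branch of this structure follows. Only the kernel counts and sizes come from the description (block1: three 3 × 3 plus two 1 × 1 convolutions; block2: five 7 × 7 plus two 1 × 1 convolutions); the channel widths, padding, number of stages, and the rendering of each stage as per-frame 2D convolutions are assumptions:

    import torch
    import torch.nn as nn

    def stage_block(in_ch, mid_ch=128, kernel=7, n_conv=5, out_ch=17):
        """block2 of FIG. 2: five 7x7 convolutions then two 1x1 convolutions.
        For block1 (stage 1) use kernel=3, n_conv=3. out_ch = J keypoint maps."""
        layers, ch = [], in_ch
        for _ in range(n_conv):
            layers += [nn.Conv2d(ch, mid_ch, kernel, padding=kernel // 2), nn.ReLU()]
            ch = mid_ch
        layers += [nn.Conv2d(ch, mid_ch, 1), nn.ReLU(), nn.Conv2d(mid_ch, out_ch, 1)]
        return nn.Sequential(*layers)

    class Branch(nn.Module):
        """One view's branch: stage 1 sees the image features F; each later
        stage sees F concatenated with the previous stage's confidence maps."""
        def __init__(self, feat_ch, n_stages=4, n_joints=17):
            super().__init__()
            self.stage1 = stage_block(feat_ch, kernel=3, n_conv=3, out_ch=n_joints)
            self.stages = nn.ModuleList(
                stage_block(feat_ch + n_joints, out_ch=n_joints)
                for _ in range(n_stages - 1))

        def forward(self, feats):
            maps = [self.stage1(feats)]
            for stage in self.stages:
                maps.append(stage(torch.cat([feats, maps[-1]], dim=1)))
            return maps  # one confidence-map tensor per stage, supervised per stage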
The specific process is as follows:
1) two-dimensional confidence map detection
Images of size w × h × T collected from view A and view B are input into the two-branch multi-stage network, where T denotes the number of image frames. First, the image features F_A and F_B of each frame and its neighbouring frames are extracted by one layer of three-dimensional CNN whose convolution kernels are all of size 3 × 3 × 3, so as to capture the temporal features across three consecutive frames; the data size after convolution is (w−2) × (h−2) × T. F_A and F_B correspond to the two branches and serve as the input of the first stage. From the image features F_A and F_B, the first stage of the network generates two sets of detection confidence maps:

S_A^1 = (B_A^{1,1}, B_A^{1,2}, …, B_A^{1,J})
S_B^1 = (B_B^{1,1}, B_B^{1,2}, …, B_B^{1,J})

where B_A^{1,j} ∈ R^{w×h} (R denoting the real numbers), i.e. B_A^{1,j} is a w × h matrix representing the confidence map of the j-th key point at the first stage in view A, j ∈ {1 … J}, with J the number of key points; B_B^{1,j} likewise represents the confidence map of the j-th key point at the first stage in view B.

Each subsequent stage works like the first, but takes as input the confidence map prediction of the previous stage together with the original image features F_A and F_B, producing a more accurate prediction. Let ρ_A^n and ρ_B^n denote the network structure of the n-th stage (which, as those skilled in the art will note, is equivalent to a process function), and let S_A^n and S_B^n denote the sets obtained at the n-th stage:

S_A^n = ρ_A^n(S_A^{n−1}, F_A)
S_B^n = ρ_B^n(S_B^{n−1}, F_B)

where B_A^{n,j} denotes the confidence map of the j-th key point at the n-th stage in view A, j ∈ {1 … J}, J the number of key points, n ∈ {1 … N}, N the number of model stages; B_B^{n,j} denotes the confidence map of the j-th key point at the n-th stage in view B.
The confidence map loss function of stage n is calculated as follows:

L_A^n = Σ_{j=1}^{J} Σ_P W(P) · ‖B_A^{n,j}(P) − B_A^{*,j}(P)‖_2^2
L_B^n = Σ_{j=1}^{J} Σ_P W(P) · ‖B_B^{n,j}(P) − B_B^{*,j}(P)‖_2^2

where P ranges over the pixel locations in the image; B_A^{n,j} is the predicted confidence map of the j-th key point at the n-th stage in view A, and B_A^{*,j} is the ground-truth confidence map of the j-th key point in view A; B_B^{n,j} is the predicted confidence map of the j-th key point at the n-th stage in view B, and B_B^{*,j} is the ground-truth confidence map of the j-th key point in view B; W is a binary mask matrix used to reduce the error caused by missing label values: when the label at position P is missing, W(P) = 0, otherwise W(P) = 1.
The overall two-dimensional confidence loss function is expressed as:

f = Σ_{n=1}^{N} (L_A^n + L_B^n)

where N denotes the total number of stages in the network structure. The key point confidence maps are then parsed by a greedy algorithm, and the two-dimensional human body key point coordinates are output.
Because of occlusion, lighting and similar problems in the images, the two-dimensional key point coordinates detected above may contain missed detections. A threshold th = 0.4 is set: if the maximum probability of a point in the confidence map generated at the last stage does not exceed th, the point is judged to be a missed detection. Missed key points are then repaired across the two views: if a key point is judged missing in view A but successfully detected in view B, its coordinates in view A are set consistent with those in view B; likewise, if it is judged missing in view B but successfully detected in view A, its coordinates in view B are set consistent with those in view A.
The threshold th = 0.4 is chosen as a compromise. Missed detections are filtered against th: too large a threshold flags too many points, increasing the complexity of the filtering and hindering repair, while too small a threshold lets some missed detections slip through and increases the error. The value therefore balances detection precision against algorithm efficiency.
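The cross-view repair rule can be sketched as follows; the array shapes and the per-key-point peak confidences are assumptions about how the confidence maps were parsed:

    import numpy as np

    TH = 0.4  # confidence threshold from the description

    def repair_missed(coords_a, conf_a, coords_b, conf_b):
        """If key point j is missed in one view (peak confidence <= TH) but
        detected in the other, copy the detected view's coordinates over.
        coords_*: (J, 2) arrays; conf_*: (J,) peak confidence per key point."""
        miss_a, miss_b = conf_a <= TH, conf_b <= TH
        fix_a = miss_a & ~miss_b          # missed in A, found in B
        fix_b = miss_b & ~miss_a          # missed in B, found in A
        coords_a[fix_a] = coords_b[fix_a]
        coords_b[fix_b] = coords_a[fix_b]
        return coords_a, coords_b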
2) Three-dimensional coordinate generation
The two-dimensional confidence map sets S_A^N and S_B^N obtained at the last stage are input into a three-dimensional CNN, which extracts information from the preceding and following key-point frames and outputs the three-dimensional coordinates of the corresponding key points; the network structure is shown in FIG. 3. The input size is w × h × J, where J denotes the number of key points. The CNN comprises three convolutional layers, each with kernel size 3 × 3 × 3. The ReLU function is used as the activation function of the convolutional layers to introduce a nonlinear mapping into the model; the ReLU activation function is:
ReLU(x)=max(0,x)
where x represents a function argument.
The convolutional layers extract the spatial features among the key points through convolution operations, and the convolution result is fed into a fully connected Dense(J) layer, where J is the number of regression targets (i.e. the number of human body key points); in the invention J = 17. Finally, a sigmoid function is taken as the output unit:
sigmoid(x) = 1 / (1 + e^(−x))
and obtaining the corresponding three-dimensional coordinate point after the output result is subjected to inverse normalization.
3) Coordinate point processing
The world coordinate system is anchored at camera A, and the output of the view-B branch is converted into world coordinates. The initial results C_A^t and C_B^t denote the camera-coordinate-system coordinates in view A and view B respectively. The transformation from the camera coordinate system to the world coordinate system is:

R = R_z(r_z) · R_y(r_y) · R_x(r_x)

P_B^t = R · C_B^t + b

where R is an orthogonal rotation matrix, (r_x, r_y, r_z) is the angular deviation of camera B relative to camera A, b = (b_x, b_y, b_z) is the position of camera B in the world coordinate system, and P_B^t denotes the view-B output coordinates in the world coordinate system.

After converting the view-B output, the final two-branch results P_A^t and P_B^t denote the human body key point coordinates output by view A and view B in the world coordinate system, where t ∈ {1 … T} is the corresponding image frame index; the outputs of the model at the same moment should be consistent.
4) Weakly supervised model training
The three-dimensional key point model loss function comprises a distance error loss and an inter-frame loss. The distance error loss function is defined as:

L_D = Σ_{t=1}^{T} ‖P_A^t − P_B^t‖_2

where T denotes the number of frames input to the network at once and ‖·‖_2 denotes the Euclidean distance.
The inter-frame error loss function is:

L_TD = Σ_{t=2}^{T} (‖P_A^t − P_A^{t−1}‖_2 + ‖P_B^t − P_B^{t−1}‖_2)
the final joint loss function is defined as the sum of two losses of the two-dimensional key point branch and the three-dimensional key point branch in the last stage:
Loss=LD+LTD+f
the goal of the model training is to minimize the loss function, perform back propagation weight training by using a gradient descent method, and iteratively update model parameters, which are known to those skilled in the art and specifically refer to weight parameters and bias parameters in the neural network calculation.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention and are to be construed as being without limitation to such specifically recited embodiments and examples. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (9)

1. A human body three-dimensional key point extraction method is characterized by comprising the following steps:
S1, collecting human body action data from two viewing angles;
S2, detecting two-dimensional key point confidence maps of the human body on the data of the two viewing angles with a two-branch multi-stage structure;
S3, establishing a three-dimensional key point generation model;
S4, processing the human body action behavior data to be detected, acquired as in step S1, with step S2 to obtain the corresponding two-dimensional key point confidence maps, and inputting them into the three-dimensional key point generation model established in step S3 to obtain the three-dimensional key point coordinates.
2. The method for extracting three-dimensional key points of a human body according to claim 1, wherein the step S1 specifically comprises: two cameras are adopted and marked as a camera A and a camera B, human action behavior data acquisition is carried out simultaneously, and synchronous frame sampling is carried out on acquired video data.
3. The method for extracting human body three-dimensional key points according to claim 2, wherein the two-branch multi-stage structure of step S2 is specifically as follows: the upper branch learns the key point positions in camera A and the lower branch learns the key point positions in camera B; both branches comprise several stages, where stage 1 uses three layers of 3 × 3 convolution and two layers of 1 × 1 convolution, and every remaining stage uses five layers of 7 × 7 convolution and two layers of 1 × 1 convolution.
4. The method for extracting human body three-dimensional key points according to claim 3, further comprising extracting original image features with one layer of three-dimensional CNN, wherein the input of the first stage is the original image features extracted by this three-dimensional CNN layer, and the input of each subsequent stage is those features together with the confidence map prediction result of the previous stage.
5. The method as claimed in claim 4, wherein the layer of three-dimensional CNN is used for extracting image features of the current frame and frames before and after the current frame.
6. The method as claimed in claim 5, wherein the size of the three-dimensional CNN convolution kernel is 3 × 3 × 3.
7. The method for extracting human body three-dimensional key points according to any one of claims 1 to 6, wherein the three-dimensional key point generation model of step S3 specifically comprises three convolutional layers and one fully connected layer, with a sigmoid function as the output unit and the ReLU function as the activation function of the convolutional layers.
8. The method for extracting human body three-dimensional key points according to claim 7, further comprising weakly supervised training of the three-dimensional key point generation model: the optimization objective is to minimize the loss function; back-propagation weight training is performed by gradient descent, and the parameters of the three-dimensional key point generation model are updated iteratively.
9. The method for extracting human body three-dimensional key points according to claim 8, wherein the loss function expression is:

Loss = L_D + L_TD + f

where L_D denotes the distance error loss function, L_TD denotes the inter-frame error loss function, and f denotes the two-dimensional confidence loss function of the whole two-branch multi-stage structure.
CN202110251506.XA, filed 2021-03-08 (priority 2021-03-08): Human body three-dimensional key point extraction method. Expired - Fee Related. Granted as CN112926475B.

Priority Applications (1)

Application number: CN202110251506.XA; priority date: 2021-03-08; filing date: 2021-03-08; title: Human body three-dimensional key point extraction method (granted as CN112926475B)

Publications (2)

CN112926475A, published 2021-06-08
CN112926475B, published 2022-10-21

Family

ID=76171889

Family Applications (1)

Application number: CN202110251506.XA; title: Human body three-dimensional key point extraction method; priority date: 2021-03-08; filing date: 2021-03-08; status: Expired - Fee Related (granted as CN112926475B)

Country Status (1)

CN: CN112926475B




Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
CF01: Termination of patent right due to non-payment of annual fee (granted publication date: 2022-10-21)