CN110826500A - Method for estimating 3D human body posture based on an adversarial network in kinematic chain space - Google Patents


Info

Publication number
CN110826500A
Authority
CN
China
Prior art keywords
human body
human
coordinates
network
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911085729.2A
Other languages
Chinese (zh)
Other versions
CN110826500B (en)
Inventor
薛裕明
谢军伟
李�根
罗鸣
童同
高钦泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Timor View Mdt Infotech Ltd
Original Assignee
Fujian Timor View Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Timor View Mdt Infotech Ltd filed Critical Fujian Timor View Mdt Infotech Ltd
Priority to CN201911085729.2A priority Critical patent/CN110826500B/en
Publication of CN110826500A publication Critical patent/CN110826500A/en
Application granted granted Critical
Publication of CN110826500B publication Critical patent/CN110826500B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a method for estimating 3D human body posture based on an adversarial network in kinematic chain space. A convolutional neural network is used to estimate the three-dimensional coordinates of key human body nodes from an image captured by a monocular device. Specifically, with a monocular RGB image as input, kinematic-chain-space and adversarial-network techniques are adopted, which mitigates overfitting and improves the accuracy and precision of 3D human pose estimation.

Description

Method for estimating 3D human body posture based on an adversarial network in kinematic chain space
Technical Field
The invention relates to image content understanding, and in particular to a method for estimating 3D human body posture based on an adversarial network in kinematic chain space.
Background
Current artificial intelligence technology has brought major breakthroughs in fields such as image content understanding, video enhancement, and speech recognition. Within image content understanding in particular, 3D human pose recognition has high application value in rehabilitation medicine, video surveillance, advanced human-computer interaction, and other fields.
3D human pose estimation refers to techniques that predict the three-dimensional coordinates of a human pose from monocular or multi-view images. It can be roughly divided into the following three classes of methods:
The first class uses mathematical computation or machine learning to build a spatial coordinate system from information such as the relative positions and shooting angles of multi-view cameras, predicts a corresponding depth map, and can estimate a 2D image from any angle. Its drawbacks are that it requires images captured by multi-view cameras and that the placement of the capture devices cannot be changed.
The second class uses only a single capture device to compute 2D human pose coordinates directly from a single image, and then estimates the corresponding 3D pose through simple matrix multiplication or a lightweight learned network. However, lacking the original image as input, spatial information may be lost, degrading the accuracy of the 3D coordinates; moreover, since this approach relies only on 2D pose input, its errors may be amplified during 3D estimation.
The third class uses deep learning to learn an end-to-end mapping from monocular RGB images to three-dimensional coordinates. Compared with the former two, it brings clear improvements in efficiency and performance.
Although 3D human pose estimation has made progress, additional capture-device information is still required, and the deep neural networks involved are prone to overfitting.
Therefore, the invention takes only a monocular RGB image as input and adopts kinematic-chain-space and adversarial-network techniques, which both mitigates overfitting and improves the precision and accuracy of 3D human pose estimation.
Disclosure of Invention
The invention aims to provide a method for estimating 3D human body posture based on an adversarial network in kinematic chain space, which uses a convolutional neural network to estimate the three-dimensional coordinates of key human body nodes from an image captured by a monocular device, improving the accuracy and precision of 3D human pose estimation.
In order to achieve the above purpose, the technical scheme of the invention is as follows: a method for estimating 3D human body posture based on an adversarial network in kinematic chain space, comprising the following steps:
Step S1, a human body color image I is captured with a monocular device; the image is then normalized and labeled using 2D and 3D human body data sets, yielding 2D human skeleton coordinates P and 3D human skeleton coordinates M ∈ R^{3×n}; the original image and the human skeleton coordinates are mirrored and cropped to augment the image data;
Step S2, the 3D human skeleton coordinate generation network: weakly supervised generative adversarial learning is adopted to alleviate data overfitting, with the feature extraction stage using the following formula:
F=R(BN(W1*Ig+B1)) (1)
where R denotes the nonlinear activation function LeakyReLU; W1 and B1 denote the weights and biases of the convolutional layers in the feature extraction stage; BN denotes the batch normalization function; Ig denotes the input image; and F denotes the output of the feature extraction stage. The 3D human skeleton coordinates are then obtained by passing the features through a convolution block, a reshaping module, and two fully connected layers;
Step S3, a convolutional neural network estimates the camera parameter matrix K ∈ R^{2×3} to assist the back-projection layer;
Step S4, based on the 3D human skeleton coordinates labeled in step S1 and those generated by the network in step S2, a Wasserstein GAN discriminator operating in kinematic chain space computes the link angles and lengths of the human skeleton; meanwhile, the input image and the 3D skeleton coordinates are fused and fed into a convolutional neural network, improving the structural accuracy of the generated 3D human skeleton coordinates;
Step S5, through the back-projection layer and based on the camera parameter K ∈ R^{2×3} estimated in step S3, the 3D human skeleton coordinates are converted into 2D human skeleton coordinates;
P'=KM (2)
wherein P' is the predicted 2D human skeletal coordinates;
Step S6, a loss function for the 3D human pose key nodes is predicted, where M ∈ R^{3×n} denotes the 3D human skeleton coordinates, i.e. the positions of the 3D pose key nodes; each coordinate m_i = (x, y, z), i = 1, …, n, denotes one key node of the human body; a reshape operation on the last output layer yields the 3D human body coordinates;
Step S7, progressive training strategy: the training process is divided into several preset sub-training periods trained in sequence with a step-wise growth strategy; at the start of training the original image is scaled down to a small picture and a large learning rate is used, and after each sub-training period the color image size is gradually increased while the learning rate is gradually decreased; if the 3D human skeleton coordinates generated after a sub-training period deviate greatly from the corresponding calibration data, back-propagation continues, a gradient descent optimization algorithm updates the convolution weight and bias parameters, and step S2 is executed again; when the 3D human skeleton coordinates generated after a sub-training period meet the expected accuracy, or all preset sub-training periods are finished, the final result is obtained.
In an embodiment of the present invention, the loss function of the 3D human pose key nodes equals:

W(Pr, Pg) + λ·Lcam

W(Pr, Pg) = sup_{||f||_L ≤ 1} E_{x~Pr}[f(x)] − E_{x~Pg}[f(x)]

where W(Pr, Pg) denotes the WGAN loss, whose input comprises two parts: Pg denotes a batch of generated data (images and the correspondingly generated 3D human skeleton coordinates), and Pr denotes a batch of real data (images and the corresponding labeled real 3D human skeleton coordinates); E_{x~Pr}[f(x)] denotes the loss value for samples discriminated as real 3D human skeletons, and E_{x~Pg}[f(x)] the loss value for skeletons discriminated as generated; ||f||_L ≤ 1 means the Lipschitz constant of f does not exceed 1, i.e. the supremum of E_{x~Pr}[f(x)] − E_{x~Pg}[f(x)] is taken over all f whose Lipschitz constant does not exceed 1; Lcam denotes the loss function of the camera estimation network; λ is taken between 0 and 1; trace computes the trace of the corresponding matrix; ||·||_F is the Frobenius norm; K ∈ R^{2×3}; and I2 is the 2×2 identity matrix.
Compared with the prior art, the invention has the following beneficial effects:
The innovations of the method for estimating 3D human body posture based on an adversarial network in kinematic chain space are mainly embodied in two aspects. First, a deep neural network model generates the 3D human skeleton in a weakly supervised fashion; the generation is accurate and effective and can satisfy most human action analysis needs. Second, 3D coordinates are for the first time fused with the image, and a KCS network layer is introduced into the discrimination network, upgrading the discriminator and greatly assisting the generation of the 3D structure. By using a generative adversarial network assisted by the KCS network layer and a camera back-projection network, the generated 3D human poses are accurate and reliable.
Drawings
FIG. 1 is a network structure diagram of the 3D human skeleton coordinate generation part of the method for estimating 3D human body posture based on an adversarial network in kinematic chain space according to the present invention;
FIG. 2 is the camera estimation network structure of the method for estimating 3D human body posture based on an adversarial network in kinematic chain space;
FIG. 3 is the discriminator part of the method of the present invention for estimating 3D human body posture based on an adversarial network in kinematic chain space;
FIG. 4 is the basic flow chart of the method for estimating 3D human body posture based on an adversarial network in kinematic chain space;
FIG. 5 is a diagram illustrating the effect of the method for estimating 3D human body posture based on an adversarial network in kinematic chain space.
Detailed Description
The technical scheme of the invention is explained in detail below with reference to FIGS. 1-5.
As shown in FIG. 4, the method for estimating 3D human body posture based on an adversarial network in kinematic chain space aims to estimate the three-dimensional coordinates of key human body nodes from an image captured by a monocular device using a convolutional neural network. The specific steps are as follows:
step 1:
To train the model, a set of color human body images is selected as input I; the images are then normalized and labeled with 2D and 3D human body data sets, yielding for each body the 2D coordinates P and 3D coordinates M. The color images and their annotations are mirror-flipped, and brightness, hue, and saturation are randomly perturbed, producing a large amount of augmented image data that is stored as matched data pairs to serve as the deep learning training set. At the same time, the 2D coordinates P = (p1, p2, …, pn) and the 3D coordinates M = (m1, m2, …, mn), M ∈ R^{3×n}, on the training set are normalized; this further improves the model's convergence rate and precision and prevents gradient explosion.
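As an illustration of the data preparation in step 1, the normalization and mirror augmentation can be sketched as follows (a minimal example outside the patent text; the function names and the (n, 2) keypoint layout are illustrative assumptions):

```python
import numpy as np

def normalize_image(img):
    # Scale pixel values from [0, 255] to [0, 1]; per step 1, normalization
    # speeds convergence and helps prevent gradient explosion.
    return img.astype(np.float32) / 255.0

def mirror_augment(img, kpts_2d):
    # Horizontally flip the image and mirror the 2D keypoints to match.
    # kpts_2d: array of shape (n, 2) holding (x, y) pixel coordinates.
    h, w = img.shape[:2]
    flipped = img[:, ::-1].copy()
    mirrored = kpts_2d.astype(np.float32).copy()
    mirrored[:, 0] = (w - 1) - mirrored[:, 0]  # reflect x about the image centre
    return flipped, mirrored
```

In practice the same reflection would also be applied to the x-coordinates of the 3D labels, and left/right joint indices would be swapped.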
Step 2:
Generator part 1: the 3D human skeleton coordinate generation network. Compared with traditional methods, weakly supervised generative adversarial learning is adopted to alleviate data overfitting. The specific steps are as follows:
The feature extraction stage consists of a convolutional layer, a batch normalization layer, and a LeakyReLU activation function, computed as:
F=R(BN(W1*Ig+B1)) (1)
where R denotes the nonlinear activation function LeakyReLU; W1 and B1 denote the weights and biases of the convolutional layers in the feature extraction stage; BN denotes the batch normalization function; Ig denotes the input image; and F denotes the output of the feature extraction stage. The features are then passed through a convolution block, a reshaping module (flatten), and two fully connected layers to obtain the corresponding 3D human skeleton coordinates;
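The feature extraction of Eq. (1) — convolution, batch normalization, then LeakyReLU — can be sketched numerically as follows (a simplified illustration, not the patent's implementation: the naive valid-mode convolution and parameter-free batch normalization are assumptions made for clarity):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # The nonlinear activation R in Eq. (1).
    return np.where(x > 0, x, alpha * x)

def batch_norm(x, eps=1e-5):
    # Simplified BN: normalize each channel over the batch and spatial axes
    # (no learned scale/shift parameters).
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def conv2d(x, w, b):
    # Naive valid cross-correlation: x (N, Cin, H, W), w (Cout, Cin, kh, kw).
    n, cin, hh, ww = x.shape
    cout, _, kh, kw = w.shape
    oh, ow = hh - kh + 1, ww - kw + 1
    out = np.zeros((n, cout, oh, ow), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            patch = x[:, :, i:i + kh, j:j + kw]  # (N, Cin, kh, kw)
            out[:, :, i, j] = np.tensordot(patch, w, axes=([1, 2, 3], [1, 2, 3])) + b
    return out

def feature_extract(img, w1, b1):
    # F = R(BN(W1 * Ig + B1)), Eq. (1) of the patent.
    return leaky_relu(batch_norm(conv2d(img, w1, b1)))
```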
Step 3:
Generator part 2: to improve the accuracy of the estimated human pose, the invention uses a convolutional neural network to estimate the camera parameter matrix K ∈ R^{2×3}. Its purpose is to assist the back-projection layer: the 3D human skeleton coordinates are back-projected onto the corresponding 2D skeleton coordinates and compared with the 2D coordinates of the original input image, and the back-projection loss is computed, which prevents overfitting. As a matrix transformation, K must satisfy the following property:
KK^T = s^2 · I2   (2)
where s is the scaling factor of the projection and I2 is the 2×2 identity matrix. Since s is an unknown quantity, the invention assigns to it the largest singular value of the K matrix. The computation is as follows:
s = σmax(K)   (3)

The loss function of the camera estimation network is:

Lcam = || (2 / trace(KK^T)) · KK^T − I2 ||_F   (4)

where trace computes the trace of the corresponding matrix, ||·||_F is the Frobenius norm, and K ∈ R^{2×3}.
By training the network shown in FIG. 2, the back-projection matrix K is obtained as output, and the 3D human skeleton coordinates are converted to 2D skeleton coordinates:
P'=KM (5)
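The back projection P' = KM and the camera consistency check can be sketched as follows (an illustrative example; the trace-normalized form of the camera loss is an assumption reconstructed from the surrounding text, since the original formula image is missing):

```python
import numpy as np

def camera_loss(K):
    # Penalize deviation of K from a scaled orthographic projection:
    # per Eq. (2), KK^T should equal s^2 * I2. The trace-normalized form
    # || 2/trace(KK^T) * KK^T - I2 ||_F used here is an assumption
    # consistent with the trace and Frobenius-norm terms in the text.
    KKt = K @ K.T
    return np.linalg.norm(2.0 / np.trace(KKt) * KKt - np.eye(2), ord="fro")

def back_project(K, M):
    # P' = K M (Eq. (5)): map the 3D skeleton M (3 x n) to 2D coordinates (2 x n).
    return K @ M
```

A perfectly scaled orthographic K (e.g. 2·[I2 | 0]) yields zero camera loss, while a skewed K is penalized.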
Step 4:
Discriminator part: as shown in FIG. 3, to judge the accuracy of the generated human structure, the invention uses a Wasserstein GAN [1] discriminator with a kinematic chain space [2] (KCS) layer, so that link angles and bone lengths can be computed more reasonably. Meanwhile, the input image and the 3D human skeleton are fused and fed into the convolutional neural network, adding a feature that indicates whether the 3D skeleton fits the original image.
The KCS layer is a network layer, introduced by the invention, that improves the representation of the human pose. The KCS matrix is an important representation of human pose, containing joint links and bone lengths. A bone bk can be represented as the link between the r-th and t-th joints:
bk = pr − pt = Mc   (6)
c = (0, …, 0, 1, 0, …, 0, −1, 0, …, 0)^T   (7)
where the entry at position r is 1 and the entry at position t is −1. The overall human skeleton is then defined as:
B = (b1, b2, …, bn)   (8)
The matrix C is obtained by concatenating several c vectors, so B can be written as:
B = MC   (9)
The KCS matrix is computed as:
Ψ = B^T B   (10)
By adding the Ψ matrix as a network layer, the squared length of each bone appears on the diagonal, while the off-diagonal entries encode the angle between any two bones. Compared with the Euclidean-distance matrices used in other methods, this matrix-operation form effectively improves computation speed; this part mainly extracts bone features and judges fabricated skeletons as quickly as possible.
To add the feature indicating whether the 3D skeleton fits the original image, the invention adds a second input: the original image combined with the 3D skeleton, whose features are extracted by a convolutional neural network. Specifically, the newly added 3D part is initialized as a floating-point tensor of width × height × depth, where width and height equal those of the original image and depth is the maximum depth value of the 3D human body; all values are initialized to 0.5, and each point occupied by the input 3D human body is set to 1.0, as shown in FIG. 3.
The invention concatenates the two sets of extracted features and appends two fully connected layers, each containing 90 neurons. The network finally judges where the 3D skeleton coordinates came from (the real data or the generator).
Step 5:
Loss function: the loss function for predicting the 3D human pose key nodes is W(Pr, Pg) + λ·Lcam, where M ∈ R^{3×n} denotes the 3D pose key node positions, each coordinate m_i = (x, y, z) denotes one key node of the human body, and a reshape operation on the last output layer yields the 3D human body coordinates. The discriminator part adopts the Wasserstein loss [1]:

W(Pr, Pg) = sup_{||f||_L ≤ 1} E_{x~Pr}[f(x)] − E_{x~Pg}[f(x)]

where W(Pr, Pg) denotes the WGAN loss, whose input comprises two parts: Pg denotes a batch of generated data and Pr a batch of real data; E_{x~Pr}[f(x)] denotes the loss value for samples discriminated as real 3D human skeletons, and E_{x~Pg}[f(x)] the loss value for skeletons discriminated as generated; ||f||_L ≤ 1 means the Lipschitz constant of f does not exceed 1, i.e. the supremum of E_{x~Pr}[f(x)] − E_{x~Pg}[f(x)] is taken over all f whose Lipschitz constant does not exceed 1.

The loss function of the camera estimation network is:

Lcam = || (2 / trace(KK^T)) · KK^T − I2 ||_F

where trace computes the trace of the corresponding matrix, ||·||_F is the Frobenius norm, K ∈ R^{2×3}, and I2 is the 2×2 identity matrix.
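The combined objective W(Pr, Pg) + λ·Lcam can be sketched as follows (an illustrative computation on precomputed critic outputs; the function names and λ = 0.5 are assumptions, and the Lipschitz constraint on the critic is not shown):

```python
import numpy as np

def wgan_critic_objective(f_real, f_fake):
    # Empirical Wasserstein critic objective: E_{x~Pr}[f(x)] - E_{x~Pg}[f(x)].
    # f_real / f_fake are the critic's scalar outputs on a real batch and a
    # generated batch; the critic maximizes this quantity.
    return np.mean(f_real) - np.mean(f_fake)

def total_loss(f_real, f_fake, l_cam, lam=0.5):
    # W(Pr, Pg) + lambda * L_cam, with lambda taken in (0, 1] as in the
    # patent; lam=0.5 is an arbitrary illustrative choice.
    return wgan_critic_objective(f_real, f_fake) + lam * l_cam
```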
Step 6:
Progressive training strategy. The training process is divided into several preset sub-training periods trained in sequence with a step-wise growth strategy: at the start of training the original image is scaled down to a small picture and a large learning rate is used, and after each sub-training period the color image size is gradually increased while the learning rate is gradually decreased.
If the 3D human skeleton coordinates generated after a sub-training period deviate greatly from the corresponding calibration data, back-propagation continues, the convolution weight and bias parameters are updated with a gradient descent optimization algorithm, and step 2 is executed again; when the 3D human skeleton coordinates generated after a sub-training period meet the expected accuracy, or all preset sub-training periods are finished, the final result is obtained. The rationale is that training starts from the original picture scaled to a small size, aided by a large learning rate; after each training period the input picture is enlarged, the learning rate is reduced, and training resumes. In this way, accuracy at higher resolution is built on top of the low-resolution result, increasing the robustness of the network.
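The step-wise schedule described above can be sketched as follows (an illustrative plan generator; the growth and decay factors are assumptions, since the patent does not fix their values):

```python
def progressive_schedule(base_size, base_lr, periods, growth=2, decay=0.5):
    # Step-wise training plan (step 6): start with small images and a large
    # learning rate, then enlarge the input and shrink the learning rate
    # after each sub-training period. Returns (period, image_size, lr) tuples.
    plan = []
    size, lr = base_size, base_lr
    for p in range(periods):
        plan.append((p, size, lr))
        size *= growth
        lr *= decay
    return plan
```

A training loop would iterate over the plan, resizing the inputs and resetting the optimizer's learning rate at each period boundary.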
FIG. 5 is a diagram illustrating the effect of the method for estimating 3D human body posture based on an adversarial network in kinematic chain space.
Reference documents:
[1] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein generative adversarial networks. In D. Precup and Y. W. Teh, editors, Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 214-223, International Convention Centre, Sydney, Australia, 06-11 Aug 2017. PMLR.
[2] B. Wandt, H. Ackermann, and B. Rosenhahn. A kinematic chain space for monocular motion capture. In ECCV Workshops, Sept. 2018.
the above are preferred embodiments of the present invention, and all changes made according to the technical scheme of the present invention that produce functional effects do not exceed the scope of the technical scheme of the present invention belong to the protection scope of the present invention.

Claims (2)

1. A method for estimating 3D human body posture based on an adversarial network in kinematic chain space, characterized by comprising the following steps:
Step S1, a human body color image I is captured with a monocular device; the image is then normalized and labeled using 2D and 3D human body data sets, yielding 2D human skeleton coordinates P and 3D human skeleton coordinates M ∈ R^{3×n}; the original image and the human skeleton coordinates are mirrored and cropped to augment the image data;
Step S2, the 3D human skeleton coordinate generation network: weakly supervised generative adversarial learning is adopted to alleviate data overfitting, with the feature extraction stage using the following formula:
F=R(BN(W1*Ig+B1)) (1)
where R denotes the nonlinear activation function LeakyReLU; W1 and B1 denote the weights and biases of the convolutional layers in the feature extraction stage; BN denotes the batch normalization function; Ig denotes the input image; and F denotes the output of the feature extraction stage. The 3D human skeleton coordinates are then obtained by passing the features through a convolution block, a reshaping module, and two fully connected layers;
Step S3, a convolutional neural network estimates the camera parameter matrix K ∈ R^{2×3} to assist the back-projection layer;
Step S4, based on the 3D human skeleton coordinates labeled in step S1 and those generated by the network in step S2, a Wasserstein GAN discriminator operating in kinematic chain space computes the link angles and lengths of the human skeleton; meanwhile, the input image and the 3D skeleton coordinates are fused and fed into a convolutional neural network, improving the accuracy of the generated 3D human skeleton coordinates;
Step S5, through the back-projection layer and based on the camera parameter K ∈ R^{2×3} estimated in step S3, the 3D human skeleton coordinates are converted into 2D human skeleton coordinates;
P'=KM (2)
wherein P' is the predicted 2D human skeletal coordinates;
Step S6, a loss function for the 3D human pose key nodes is predicted, where M ∈ R^{3×n} denotes the 3D human skeleton coordinates, i.e. the positions of the 3D pose key nodes; each coordinate m_i = (x, y, z), i = 1, …, n, denotes one key node of the human body; a reshape operation on the last output layer yields the 3D human body coordinates;
Step S7, progressive training strategy: the training process is divided into several preset sub-training periods trained in sequence with a step-wise growth strategy; at the start of training the original image is scaled down to a small picture and a large learning rate is used, and after each sub-training period the color image size is gradually increased while the learning rate is gradually decreased; if the 3D human skeleton coordinates generated after a sub-training period deviate greatly from the corresponding calibration data, back-propagation continues, a gradient descent optimization algorithm updates the convolution weight and bias parameters, and step S2 is executed again; when the 3D human skeleton coordinates generated after a sub-training period meet the expected accuracy, or all preset sub-training periods are finished, the final result is obtained.
2. The method for estimating 3D human body posture based on an adversarial network in kinematic chain space of claim 1, wherein the loss function of the 3D human pose key nodes equals:
W(Pr, Pg) + λ·Lcam

W(Pr, Pg) = sup_{||f||_L ≤ 1} E_{x~Pr}[f(x)] − E_{x~Pg}[f(x)]

Lcam = || (2 / trace(KK^T)) · KK^T − I2 ||_F

where W(Pr, Pg) denotes the WGAN loss, whose input comprises two parts: Pg denotes a batch of generated data and Pr a batch of real data; E_{x~Pr}[f(x)] denotes the loss value for samples discriminated as real 3D human skeletons, and E_{x~Pg}[f(x)] the loss value for skeletons discriminated as generated; ||f||_L ≤ 1 means the Lipschitz constant of f does not exceed 1, i.e. the supremum of E_{x~Pr}[f(x)] − E_{x~Pg}[f(x)] is taken over all f whose Lipschitz constant does not exceed 1; Lcam denotes the loss function of the camera estimation network; λ is taken between 0 and 1; trace computes the trace of the corresponding matrix; ||·||_F is the Frobenius norm; K ∈ R^{2×3}; and I2 is the 2×2 identity matrix.
CN201911085729.2A 2019-11-08 2019-11-08 Method for estimating 3D human body posture based on an adversarial network in kinematic chain space Active CN110826500B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911085729.2A CN110826500B (en) 2019-11-08 2019-11-08 Method for estimating 3D human body posture based on an adversarial network in kinematic chain space


Publications (2)

Publication Number Publication Date
CN110826500A true CN110826500A (en) 2020-02-21
CN110826500B CN110826500B (en) 2023-04-14

Family

ID=69553460

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911085729.2A Active CN110826500B (en) 2019-11-08 2019-11-08 Method for estimating 3D human body posture based on an adversarial network in kinematic chain space

Country Status (1)

Country Link
CN (1) CN110826500B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100331855A1 (en) * 2005-05-16 2010-12-30 Intuitive Surgical, Inc. Efficient Vision and Kinematic Data Fusion For Robotic Surgical Instruments and Other Applications
US20110205337A1 (en) * 2010-02-25 2011-08-25 Hariraam Varun Ganapathi Motion Capture with Low Input Data Constraints
CN108549876A (en) * 2018-04-20 2018-09-18 Chongqing University of Posts and Telecommunications Sitting posture detection method based on object detection and human pose estimation
CN109949368A (en) * 2019-03-14 2019-06-28 Zhengzhou University Human body three-dimensional pose estimation method based on image retrieval
CN110135375A (en) * 2019-05-20 2019-08-16 Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences Multi-person pose estimation method based on global information fusion

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598954A (en) * 2020-04-21 2020-08-28 Harbin Tuobo Technology Co., Ltd. Rapid high-precision camera parameter calculation method
CN111462274A (en) * 2020-05-18 2020-07-28 Nanjing University Human body image synthesis method and system based on SMPL model
CN111914618A (en) * 2020-06-10 2020-11-10 South China University of Technology Three-dimensional human body pose estimation method based on adversarial relative depth constraint network
CN111914618B (en) * 2020-06-10 2024-05-24 South China University of Technology Three-dimensional human body pose estimation method based on adversarial relative depth constraint network

Also Published As

Publication number Publication date
CN110826500B (en) 2023-04-14

Similar Documents

Publication Publication Date Title
CN110826500B (en) Method for estimating 3D human body posture based on antagonistic network of motion link space
CN106778604B (en) Pedestrian re-identification method based on matching convolutional neural network
CN109658445A Network training method, incremental mapping method, localization method, device and equipment
CN107767419A Human body skeleton key point detection method and device
CN109978021B (en) Double-flow video generation method based on different feature spaces of text
CN110276768B (en) Image segmentation method, image segmentation device, image segmentation apparatus, and medium
CN102256065A (en) Automatic video condensing method based on video monitoring network
CN111695523B (en) Double-flow convolutional neural network action recognition method based on skeleton space-time and dynamic information
CN112288627A (en) Recognition-oriented low-resolution face image super-resolution method
CN110351548B (en) Stereo image quality evaluation method guided by deep learning and disparity map weighting
WO2022052782A1 (en) Image processing method and related device
CN109977827A Multi-person 3D pose estimation method using multi-view matching
CN114743273A (en) Human skeleton behavior identification method and system based on multi-scale residual error map convolutional network
CN117252928B (en) Visual image positioning system for modular intelligent assembly of electronic products
CN111507276B (en) Construction site safety helmet detection method based on hidden layer enhanced features
CN116524601B Adaptive multi-stage human behavior recognition model for assisted monitoring by an eldercare robot
CN112288812A (en) Mobile robot real-time positioning method based on visual features
CN115063717B (en) Video target detection and tracking method based on real scene modeling of key area
CN117152829A Industrial box-packing action recognition method using a multi-view adaptive skeleton network
Kang et al. An improved 3D human pose estimation model based on temporal convolution with gaussian error linear units
CN116092189A (en) Bimodal human behavior recognition method based on RGB data and bone data
Gao et al. Study of improved Yolov5 algorithms for gesture recognition
CN115641644A (en) Twin MViT-based multi-view gait recognition method
CN115496859A (en) Three-dimensional scene motion trend estimation method based on scattered point cloud cross attention learning
Benhamida et al. Theater Aid System for the Visually Impaired Through Transfer Learning of Spatio-Temporal Graph Convolution Networks

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant