CN108830150B - Three-dimensional human body pose estimation method and device - Google Patents

Three-dimensional human body pose estimation method and device Download PDF

Info

Publication number
CN108830150B
CN108830150B (application CN201810426144.1A)
Authority
CN
China
Prior art keywords
image
human body
key point
depth image
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810426144.1A
Other languages
Chinese (zh)
Other versions
CN108830150A (en)
Inventor
吕蕾
张凯
张桂娟
刘弘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Micro-Chain Daoi Technology Co.,Ltd.
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University filed Critical Shandong Normal University
Priority to CN201810426144.1A priority Critical patent/CN108830150B/en
Publication of CN108830150A publication Critical patent/CN108830150A/en
Application granted granted Critical
Publication of CN108830150B publication Critical patent/CN108830150B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a three-dimensional human body pose estimation method and device. The method includes: S1: acquiring depth images and RGB color images of a human body from different angles with a monocular camera; S2: constructing a human skeleton keypoint detection neural network based on the RGB color images to obtain keypoint-annotated images; S3: constructing a hand joint 2D-3D mapping network; S4: calibrating the depth image and the keypoint-annotated image of the same angle of the human body, then applying a three-dimensional point-cloud coloring conversion to the corresponding depth image to obtain a colored depth image; S5: based on the keypoint-annotated image and the colored depth image, predicting the corresponding positions of the annotated skeleton keypoints in the depth image with a preset learning network; S6: fusing the outputs of steps S3 and S5 to achieve a refined three-dimensional human body pose estimate.

Description

Three-dimensional human body pose estimation method and device
Technical field
The invention belongs to the application fields of computer vision, image processing, computer graphics, and deep learning, and in particular relates to a three-dimensional human body pose estimation method and device.
Background technique
Human body pose estimation refers to matching abstracted hierarchical features against a human body model to obtain the pose of the target at different moments. It is a key problem in human motion capture. The pose of a human body has two components: first, the position and orientation of the whole body in world coordinates; second, the joint angles of the body parts and the skin deformation influenced by those angles. The main application fields of human motion pose estimation fall into three broad directions: surveillance, control, and analysis:
(1) In surveillance, traditional applications include automatically detecting and locating pedestrians in airports or subways, counting people, and analyzing crowd flow and congestion. With growing safety awareness, new applications have emerged in recent years, such as analyzing the behavior and movement of individuals or crowds, for example detecting irregular behavior in queues or identifying activities while shopping.
(2) In control, motion estimation results or pose parameters are used to control a target. This is most common in human-computer interaction, and it is increasingly used in the entertainment industry, such as film and game animation. The captured shape, appearance, and motion of a person can be used to make 3D films or to reconstruct a three-dimensional model of a person in a game.
(3) In analysis, applications include automatic diagnosis of surgical patients and the analysis and improvement of athletes' movements. In visual media there are applications such as content-based video retrieval and video compression, and the automotive industry has related applications such as automatic airbag control, drowsiness detection, and pedestrian detection.
Relatively mature commercial motion capture systems are based on electromechanical devices, electromagnetic sensing, or special optical markers. Magnetic or optical markers are attached to a person's limbs, and their three-dimensional trajectories describe the target's motion. These systems are automatic, but their drawbacks are that the equipment is cumbersome and expensive, so they have not been widely adopted.
Therefore, vision-based human motion capture has become a research hotspot. Using the basic principles of computer vision, three-dimensional human motion sequences are extracted directly from video. This approach requires no sensors attached to the joints, leaves human motion unconstrained, and is low-cost and efficient. Most currently popular methods use matching against a human body model: the goal is to find a set of pose parameters in the state space such that the human pose corresponding to these parameters best matches the low-level features extracted from the observed image.
In vision-based motion tracking, the commonly used research approach is:
At the start of tracking, the position of the human body in the first frame of the image sequence is determined; in subsequent frames the human target is determined by relying on the continuity of human motion and on kinematic constraints. There are two methods for determining the body position in the first frame:
First, manually specifying the initial pose of the target, or setting the human model to the approximate pose of the first frame, which works against automating human body tracking.
Second, removing the background around the human body and then determining each body part with a position detection method. This partially achieves automation, but it requires person-background segmentation.
In the subsequent human tracking and 3D pose estimation, there are model-based and model-free methods. Among them:
(1) Conventional model-based methods build a 3D human model in advance and match it to the first frame of the motion sequence. In subsequent tracking they use constraints such as kinematic parameter limits and optimization methods such as gradient descent or stochastic sampling to estimate the model parameters of each frame, thereby obtaining the model motion sequence. The drawback of this approach is that tracking of subsequent frames accumulates error, so long tracking runs easily go wrong.
(2) Model-free methods do not build a human model; instead, based on information such as the geometry, texture, and color presented by the moving body, they estimate the human motion pose using learning-based or exemplar-based methods. The drawback is that the pose is difficult to estimate without prior knowledge, and only specific behavior sets with a limited state description can be tracked.
Both model-based and model-free tracking can be realized with a monocular camera or with multiple cameras. Because ordinary images carry no depth information, reconstruction suffers from the ambiguity of the three-to-two-dimensional mapping, and pose estimation for complex motion is extremely difficult. For this reason, over the past decade or more most human motion tracking techniques have been realized under multi-camera conditions in order to obtain depth information. However, multiple cameras require calibration and are inconvenient to set up in ordinary households, which hinders the spread of motion capture technology into everyday homes.
In summary, given the usage-condition limitations of multi-camera setups in the prior art and the need to interpret depth images quickly and conveniently, an effective solution is required.
Summary of the invention
To remedy the deficiencies of the prior art, the first object of the present invention is to provide a three-dimensional human body pose estimation method that can accurately recognize the three-dimensional human pose in a depth image.
The technical solution of the three-dimensional human body pose estimation method of the invention is as follows:
A three-dimensional human body pose estimation method, comprising:
S1: acquiring depth images and RGB color images of the human body from different angles with a monocular camera;
S2: constructing a human skeleton keypoint detection neural network based on the RGB color images to obtain keypoint-annotated images;
S3: constructing a hand joint 2D-3D mapping network based on the corresponding RGB color image and keypoint-annotated image;
S4: calibrating the depth image and keypoint-annotated image of the same angle of the human body, then applying a three-dimensional point-cloud coloring conversion to the corresponding depth image to obtain a colored depth image;
S5: based on the keypoint-annotated image and the colored depth image, predicting the corresponding positions of the annotated skeleton keypoints in the depth image with a preset learning network;
S6: fusing the outputs of steps S3 and S5 to achieve a refined three-dimensional human body pose estimate.
In step S1, the monocular camera can be implemented with a Kinect camera.
The Kinect is more intelligent than an ordinary camera: it emits infrared light to perform stereoscopic localization of the entire room, and its camera recognizes human movement through the infrared returns; in addition, combined with high-end software on the Xbox 360, it can track 48 positions on the human body in real time.
It should be noted that, besides the Kinect, the monocular camera can also be implemented with other existing monocular cameras.
Further, constructing the human skeleton keypoint detection neural network based on the RGB color images in step S2 specifically includes:
annotating the human skeleton keypoints in the RGB color images to construct a data set;
dividing the constructed data set into a training set and a test set, and feeding the training set into a preset human skeleton keypoint detection neural network for training;
testing the trained human skeleton keypoint detection neural network with the test set until it reaches a preset requirement.
In step S2, the data set for training the human skeleton keypoint detection neural network is formed by annotating skeleton keypoints on the acquired RGB color images; in this way a skeleton keypoint detection neural network meeting the preset requirement can be obtained quickly and accurately. The preset requirement is that the precision of the skeleton keypoints output by the network lies within a preset accuracy range.
The human skeleton keypoint detection neural network can be composed of a VGG-19 network followed by T stages (T a positive integer greater than or equal to 1), each stage consisting of two fully convolutional network branches.
VGG (Visual Geometry Group) is the Visual Geometry Group of the Department of Science and Engineering at Oxford University, which has released a series of convolutional network models beginning with VGG.
It should be noted that the human skeleton keypoint detection neural network may also be another existing neural network model.
Further, in step S3, the constructed hand joint 2D-3D mapping network outputs a hand segmentation image; the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max-pooling layer + bilinear upsampling.
The loss function of the above hand joint 2D-3D mapping network uses softmax with a cross-entropy loss.
In the present invention, converting the 2D hand detection problem into a segmentation problem eliminates the effect of the size differences of different hands on network accuracy.
It should be noted that, besides the foregoing structure, the hand joint 2D-3D mapping network can also be realized with other existing neural network structures.
Further, in step S4, the step of obtaining the colored depth image specifically includes:
calibrating the depth image and keypoint-annotated image of the same angle of the human body using the checkerboard method;
matching the keypoint-annotated image and depth image of the same angle;
resizing the matched depth image and voxelizing the point cloud.
The present invention calibrates the depth image and the keypoint-annotated image of the same angle with the checkerboard method, so the coordinate information of the keypoints in the image can be obtained accurately.
Further, in step S5, the preset learning network is a U-shaped reinforcement learning network.
A U-shaped reinforcement learning network learns a mapping from environment states to actions so that the actions selected by the agent obtain the maximum reward from the environment; in other words, the external environment's evaluation of the learning system (or the performance of the whole system) is, in some sense, optimal.
The structure of the U-shaped reinforcement learning network is as follows: the input undergoes a preset number of convolution operations and a preset number of pooling operations (max-pool downsampling), each convolution followed by a ReLU activation layer; this is repeated several times, and the number of convolution filters is multiplied accordingly after each downsampling.
The result obtained after downsampling then undergoes a preset number of deconvolution (upsampling) operations with a preset stride, each convolution followed by a ReLU activation layer; this is repeated several times, with the number of filters reduced by the corresponding factor at each upsampling; the result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again.
Finally the corresponding result is output.
It should be noted that the preset learning network may also be a Q-type reinforcement learning network.
The second object of the invention is to provide a three-dimensional human body pose estimation device that can accurately recognize the three-dimensional human pose in a depth image.
The technical solution of the three-dimensional human body pose estimation device of the invention is as follows:
A three-dimensional human body pose estimation device, comprising:
an image acquisition unit, which acquires depth images and RGB color images of the human body from different angles with a monocular camera;
a keypoint annotation unit, which constructs a human skeleton keypoint detection neural network based on the RGB color images to obtain keypoint-annotated images;
a hand recognition unit, which constructs a hand joint 2D-3D mapping network based on the corresponding RGB color image and keypoint-annotated image;
a depth image coloring unit, which calibrates the depth image and keypoint-annotated image of the same angle of the human body and then applies a three-dimensional point-cloud coloring conversion to the corresponding depth image to obtain a colored depth image;
a depth image keypoint prediction unit, which, based on the keypoint-annotated image and the colored depth image, predicts the corresponding positions of the annotated skeleton keypoints in the depth image with a preset learning network;
a three-dimensional pose estimation unit, which fuses the outputs of the hand recognition unit and the depth image keypoint prediction unit to achieve a refined three-dimensional human body pose estimate.
The monocular camera can be implemented with a Kinect camera.
The Kinect is more intelligent than an ordinary camera: it emits infrared light to perform stereoscopic localization of the entire room, and its camera recognizes human movement through the infrared returns; in addition, combined with high-end software on the Xbox 360, it can track 48 positions on the human body in real time.
It should be noted that, besides the Kinect, the monocular camera can also be implemented with other existing monocular cameras.
Further, the keypoint annotation unit comprises:
a data set construction subunit, which annotates the human skeleton keypoints in the RGB color images and constructs a data set;
a neural network training subunit, which divides the constructed data set into a training set and a test set and feeds the training set into a preset human skeleton keypoint detection neural network for training;
a neural network testing subunit, which tests the trained human skeleton keypoint detection neural network with the test set until it reaches the preset requirement.
In the keypoint annotation unit, the data set for training the human skeleton keypoint detection neural network is formed by annotating skeleton keypoints on the acquired RGB color images; in this way a skeleton keypoint detection neural network meeting the preset requirement can be obtained quickly and accurately. The preset requirement is that the precision of the skeleton keypoints output by the network lies within a preset accuracy range.
The human skeleton keypoint detection neural network can be composed of a VGG-19 network followed by T stages (T a positive integer greater than or equal to 1), each stage consisting of two fully convolutional network branches.
VGG (Visual Geometry Group) is the Visual Geometry Group of the Department of Science and Engineering at Oxford University, which has released a series of convolutional network models beginning with VGG.
It should be noted that the human skeleton keypoint detection neural network may also be another existing neural network model.
Further, in the hand recognition unit, the constructed hand joint 2D-3D mapping network outputs a hand segmentation image; the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max-pooling layer + bilinear upsampling.
The loss function of the above hand joint 2D-3D mapping network uses softmax with a cross-entropy loss.
In the present invention, converting the 2D hand detection problem into a segmentation problem eliminates the effect of the size differences of different hands on network accuracy.
It should be noted that, besides the foregoing structure, the hand joint 2D-3D mapping network can also be realized with other existing neural network structures.
Further, the depth image coloring unit comprises:
a calibration subunit, which calibrates the depth image and keypoint-annotated image of the same angle of the human body using the checkerboard method;
a matching subunit, which matches the keypoint-annotated image and depth image of the same angle;
a point-cloud voxelization subunit, which resizes the matched depth image and voxelizes the point cloud.
The present invention calibrates the depth image and the keypoint-annotated image of the same angle with the checkerboard method, so the coordinate information of the keypoints in the image can be obtained accurately.
Further, in the depth image keypoint prediction unit, the preset learning network is a U-shaped reinforcement learning network.
A U-shaped reinforcement learning network learns a mapping from environment states to actions so that the actions selected by the agent obtain the maximum reward from the environment; in other words, the external environment's evaluation of the learning system (or the performance of the whole system) is, in some sense, optimal.
The structure of the U-shaped reinforcement learning network is as follows: the input undergoes a preset number of convolution operations and a preset number of pooling operations (max-pool downsampling), each convolution followed by a ReLU activation layer; this is repeated several times, and the number of convolution filters is multiplied accordingly after each downsampling.
The result obtained after downsampling then undergoes a preset number of deconvolution (upsampling) operations with a preset stride, each convolution followed by a ReLU activation layer; this is repeated several times, with the number of filters reduced by the corresponding factor at each upsampling; the result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again.
Finally the corresponding result is output.
It should be noted that the preset learning network may also be a Q-type reinforcement learning network.
Compared with the prior art, the beneficial effects of the present invention are:
(1) The invention acquires depth images and RGB color images of the human body from different angles with a monocular camera, which removes the usage-condition limits of multi-camera setups in the field of human pose estimation; the method is easier to realize and can accurately recognize the three-dimensional human pose in a depth image.
(2) After neural network training, the invention can recognize three-dimensional human poses in real time.
(3) The trained neural network models can be stored in miniature terminal devices and conveniently integrated into smart home and intelligent interaction equipment.
Detailed description of the invention
The accompanying drawings, which constitute a part of this application, are used to provide further understanding of the application; the illustrative embodiments of the application and their descriptions explain the application and do not constitute an undue limitation on it.
Fig. 1 is a flow diagram of the three-dimensional human body pose estimation method of the invention;
Fig. 2 is a schematic diagram of one embodiment of the three-dimensional human body pose estimation method of the invention;
Fig. 3 is a schematic diagram of one embodiment of the human skeleton keypoint detection neural network of the invention;
Fig. 4 is a schematic diagram of one embodiment of the hand joint 2D-3D mapping neural network of the invention;
Fig. 5 is a schematic diagram of one embodiment of the U-shaped reinforcement learning neural network of the invention;
Fig. 6 is a structural schematic diagram of the three-dimensional human body pose estimation device of the invention.
Specific embodiment
It should be noted that the following detailed description is illustrative and intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the same meanings as commonly understood by those of ordinary skill in the technical field to which the application belongs.
It should be noted that the terms used herein are merely for describing specific embodiments and are not intended to limit the illustrative embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular forms are also intended to include the plural forms; in addition, it should be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in Fig. 1, the three-dimensional human body pose estimation method of the invention includes steps S1 to S6.
Specifically, the technical solution of the invention is illustrated below with one embodiment of the three-dimensional human body pose estimation method, as shown in Fig. 2:
The three-dimensional human body pose estimation method of the invention comprises:
S1: acquiring depth images and RGB color images of the human body from different angles with a monocular camera.
In step S1, the monocular camera can be implemented with a Kinect camera.
The Kinect is more intelligent than an ordinary camera: it emits infrared light to perform stereoscopic localization of the entire room, and its camera recognizes human movement through the infrared returns; in addition, combined with high-end software on the Xbox 360, it can track 48 positions on the human body in real time.
It should be noted that, besides the Kinect, the monocular camera can also be implemented with other existing monocular cameras.
S2: constructing the human skeleton keypoint detection neural network based on the RGB color images to obtain keypoint-annotated images.
Constructing the human skeleton keypoint detection neural network based on the RGB color images in step S2 specifically includes:
Step S21: annotating the human skeleton keypoints in the RGB color images and constructing a data set.
Specifically, the data set is constructed as follows:
Step S211: 12 Kinect depth cameras are used, placed at three different positions in a room with 4 depth cameras per position, forming four different viewing angles at each position; images of several men and women in different body postures are captured, and the collected photos are organized into an image library.
Step S212: a gesture data set is established with the multiple depth cameras. This data set collects images of 39 different gesture actions performed by 20 people; it is divided into a training set and a test set, and the illumination intensity and background of the images are randomly re-rendered to expand data diversity.
Step S213: bone keypoints are annotated on the image libraries obtained in steps S211 and S212, the keypoint coordinate information (x, y, d) is used as the image label, and a shell script dumps the images and labels into lmdb or hdf5 files, where x and y are the horizontal and vertical coordinates of the keypoint in the depth image and d is the depth coordinate.
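As a concrete illustration of step S213, the following is a minimal Python sketch (using h5py rather than a shell script) of dumping annotated frames and their (x, y, d) labels into an hdf5 file; the file name, dataset keys, image size, and the choice of 18 body keypoints are illustrative assumptions, not values fixed by the patent.

```python
import h5py
import numpy as np

def write_hdf5(images, labels, path="pose_train.h5"):
    """images: (N, H, W, 3) uint8 frames; labels: (N, K, 3) float32 rows (x, y, d)."""
    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=np.asarray(images, dtype=np.uint8),
                         compression="gzip")
        f.create_dataset("label", data=np.asarray(labels, dtype=np.float32))

# Example: 100 dummy 368x368 frames with 18 keypoints each (counts are assumptions)
images = np.zeros((100, 368, 368, 3), dtype=np.uint8)
labels = np.zeros((100, 18, 3), dtype=np.float32)   # (x, y, depth d) per keypoint
write_hdf5(images, labels)
```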
Step S22: the constructed data set is divided into a training set and a test set, and the training set is fed into the preset human skeleton keypoint detection neural network for training.
Step S23: the trained human skeleton keypoint detection neural network is tested with the test set until it reaches the preset requirement.
In step S2, the data set for training the human skeleton keypoint detection neural network is formed by annotating skeleton keypoints on the acquired RGB color images; in this way a skeleton keypoint detection neural network meeting the preset requirement can be obtained quickly and accurately. The preset requirement is that the precision of the skeleton keypoints output by the network lies within a preset accuracy range.
As shown in Fig. 3, the human skeleton keypoint detection neural network can be composed of a VGG-19 network followed by T stages (T a positive integer greater than or equal to 1), each stage consisting of two fully convolutional network branches.
VGG (Visual Geometry Group) is the Visual Geometry Group of the Department of Science and Engineering at Oxford University, which has released a series of convolutional network models beginning with VGG.
Specifically, in this example the processing of the human skeleton keypoint detection neural network is as follows:
S221: the w×h 2D RGB image obtained from the Kinect is taken as input and passed through the first 10 layers of VGG-19 to obtain the feature map F, which serves as the input to each branch of the first stage of the model.
S222: in the first stage, the two branches respectively generate a set of detection confidence maps $S^1 = \rho^1(F)$ and a set of local association fields $L^1 = \phi^1(F)$, where $\rho^1$ and $\phi^1$ are the inference functions of the two branch convolutional neural networks at stage 1.
S223: the specific design of fully convolutional branch 1 is as follows:
(a) because the invention performs 3D pose recognition for several people simultaneously, an independent confidence map $S^*_{j,k}$ is first generated for each person in the RGB image;
(b) $x_{j,k} \in \mathbb{R}^2$ denotes the ground-truth position of body part $j$ of person $k$ in the image, where $j$ and $k$ are positive integers;
(c) a Gaussian distribution keeps the detected body-part keypoint highlighted:
$$S^*_{j,k}(p) = \exp\!\left(-\frac{\lVert p - x_{j,k}\rVert_2^2}{\sigma^2}\right) \qquad (1)$$
(d) in each confidence map the keypoint with the maximum Gaussian value is taken:
$$S^*_{j}(p) = \max_{k} S^*_{j,k}(p) \qquad (2)$$
where $p$ is a pixel coordinate.
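The following numpy sketch illustrates equations (1) and (2): each annotated keypoint of one body part produces a Gaussian peak, and the per-person maps are merged with a pixel-wise maximum; the map size and σ are illustrative assumptions.

```python
import numpy as np

def confidence_map(keypoints_xy, h, w, sigma=7.0):
    """keypoints_xy: (n_people, 2) array of (x, y) positions of one body part j."""
    ys, xs = np.mgrid[0:h, 0:w]
    conf = np.zeros((h, w), dtype=np.float32)
    for x, y in keypoints_xy:
        g = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / sigma ** 2)  # eq. (1)
        conf = np.maximum(conf, g)                                  # eq. (2)
    return conf

heat = confidence_map(np.array([[100.0, 120.0], [200.0, 60.0]]), 368, 368)
```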
S224: fully convolutional branch 2 detects the position and direction information of the keypoint connections; its specific design is as follows:
(a) construct the supervising ground-truth local association field $L^*_{c,k}$, where $c$ indexes the $c$-th two-keypoint connecting segment on the $k$-th human body. The construction proceeds as follows:
(b) let $x_{j_1,k}$ and $x_{j_2,k}$ be the two keypoints of the $c$-th connection on the $k$-th body in the image;
(c) the local association vector of the body limb on the $c$-th connection is found with:
$$L^*_{c,k}(p) = \begin{cases} v, & \text{if } p \text{ lies on limb } c \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$
where $v = (x_{j_2,k} - x_{j_1,k}) / \lVert x_{j_2,k} - x_{j_1,k}\rVert_2$ is the unit vector along the limb, and equation (4) gives the criterion for $p$ lying on limb $c$ (the projection of $p - x_{j_1,k}$ onto $v$ falls within the limb length, and its perpendicular distance from the segment falls within the limb width);
(d) linear interpolation between the two keypoints of connection $c$ approximately yields the pixel coordinates of the points of person $k$ lying on connection $c$:
$$p(u) = (1-u)\,x_{j_1} + u\,x_{j_2}, \qquad 0 \le u \le 1 \qquad (5)$$
(e) the association field of all people with overlapping limbs on connection $c$ in the image is found by averaging:
$$L^*_{c}(p) = \frac{1}{n_c(p)} \sum_{k} L^*_{c,k}(p) \qquad (6)$$
where $n_c(p)$ is the number of non-zero vectors at point $p$;
(f) the predicted local association field is sampled along segment $c$, and $L_c$ measures the confidence that the two candidate keypoints belong to the same person $k$:
$$E = \int_{u=0}^{u=1} L_c\big(p(u)\big) \cdot \frac{x_{j_2} - x_{j_1}}{\lVert x_{j_2} - x_{j_1}\rVert_2}\, du \qquad (7)$$
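The following numpy sketch illustrates the ground-truth field of equations (3) and (4) for a single limb (overlapping people would then be averaged per equation (6)); the limb-width threshold is an illustrative assumption.

```python
import numpy as np

def limb_paf(x_j1, x_j2, h, w, limb_width=4.0):
    """Ground-truth association field of one limb: unit vector v on the limb,
    zero elsewhere (eqs. 3-4). x_j1, x_j2: (2,) arrays of (x, y)."""
    v = x_j2 - x_j1
    length = np.linalg.norm(v)
    if length < 1e-8:
        return np.zeros((h, w, 2), dtype=np.float32)
    v = v / length
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.stack([xs - x_j1[0], ys - x_j1[1]], axis=-1)
    along = d @ v                                         # projection onto the limb axis
    across = np.abs(d[..., 0] * v[1] - d[..., 1] * v[0])  # perpendicular distance
    on_limb = (along >= 0) & (along <= length) & (across <= limb_width)
    paf = np.zeros((h, w, 2), dtype=np.float32)
    paf[on_limb] = v
    return paf

field = limb_paf(np.array([50.0, 80.0]), np.array([90.0, 160.0]), 368, 368)
```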
S225: each of the two branches uses three 3×3 and two 2×2 convolutional layers per stage;
S226: the output of the first-stage fully convolutional networks is merged with the original feature map F as the input of the second stage, and this iterates up to stage T;
S227: the two-branch model continually refines the target of each branch over the T stages. To effectively avoid vanishing gradients, an $L_2$ loss is added at each stage as intermediate supervision. The loss function of one branch is defined as:
$$f_S^t = \sum_{j} \sum_{p} W(p)\,\lVert S_j^t(p) - S_j^*(p)\rVert_2^2 \qquad (8)$$
where $S^*$ is the ground-truth confidence map annotated when building the database, $S_j^t(p)$ is the predicted confidence value, $t \in [1, 2, \ldots, T]$ is the stage of the branch model, $p$ ranges over the keypoint position coordinates in the map, $j$ indexes the $j$-th keypoint, and $W(p)$ is a binary mask with $W(p) = 0$ where the keypoint annotation is missing and 1 otherwise, which avoids penalizing true positions during network training; equation (9) is the analogous loss $f_L^t$ for the association-field branch.
S228: after stage T, the body-part confidence maps and joint relation fields obtained by the two branches are parsed with a greedy algorithm to obtain the 2D keypoint image of each person. Equation (10) is the model formula of the entire keypoint detection network, accumulating both branches over all stages:
$$f = \sum_{t=1}^{T}\left(f_S^t + f_L^t\right) \qquad (10)$$
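The following PyTorch sketch illustrates the per-stage supervision of equations (8)-(10): a masked L2 loss per branch, summed over both branches and all T stages; tensor shapes and channel counts are illustrative assumptions.

```python
import torch

def stage_loss(pred, target, mask):
    """pred, target: (B, J, H, W); mask W(p): (B, 1, H, W), 0 where unlabeled (eq. 8)."""
    return ((pred - target) ** 2 * mask).sum()

def total_loss(stage_S, stage_L, gt_S, gt_L, mask):
    """eq. (10): accumulate both branch losses over every stage t = 1..T."""
    return sum(stage_loss(S, gt_S, mask) + stage_loss(L, gt_L, mask)
               for S, L in zip(stage_S, stage_L))

# Example with T=2 stages, 14 body parts, 10 limb fields of two channels each
S_preds = [torch.rand(1, 14, 46, 46) for _ in range(2)]
L_preds = [torch.rand(1, 20, 46, 46) for _ in range(2)]
loss = total_loss(S_preds, L_preds, torch.rand(1, 14, 46, 46),
                  torch.rand(1, 20, 46, 46), torch.ones(1, 1, 46, 46))
```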
It should be noted that the human skeleton keypoint detection neural network may also be another existing neural network model.
S3: constructing the hand joint 2D-3D mapping network based on the corresponding RGB color image and keypoint-annotated image.
In step S3, the constructed hand joint 2D-3D mapping network outputs a hand segmentation image; the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max-pooling layer + bilinear upsampling.
The loss function of the above hand joint 2D-3D mapping network uses softmax with a cross-entropy loss.
In the present invention, converting the 2D hand detection problem into a segmentation problem eliminates the effect of the size differences of different hands on network accuracy.
The detailed process of constructing the hand joint 2D-3D mapping network, as shown in Fig. 4, is as follows:
S31: the original 2D RGB image is resized to 256×256×3 as the input of the hand image segmentation network; the network uses the (convolutional layer + ReLU activation layer) + max-pooling layer + bilinear upsampling structure, the loss function uses softmax with cross-entropy, and the network outputs a 256×256×3 hand segmentation image.
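A minimal PyTorch sketch of a segmentation network with the structure named in S31 — (convolution + ReLU) blocks, max pooling, and bilinear upsampling back to the 256×256 input, trained with softmax cross-entropy — follows; the channel counts, depth, and two-class output are illustrative assumptions, not the patented configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HandSegNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                       # 256 -> 128
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                       # 128 -> 64
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.score = nn.Conv2d(128, n_classes, 1)

    def forward(self, x):
        h = self.score(self.features(x))
        # bilinear upsampling restores the 256x256 input resolution
        return F.interpolate(h, size=x.shape[2:], mode="bilinear",
                             align_corners=False)

net = HandSegNet()
logits = net(torch.zeros(1, 3, 256, 256))          # (1, 2, 256, 256)
# softmax cross-entropy against a per-pixel class map
loss = F.cross_entropy(logits, torch.zeros(1, 256, 256, dtype=torch.long))
```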
S32: a neural network with the same structure as in S31 takes the output of S31 as input; it generates bounding boxes for the 21 joints of the hand, Gaussian noise with mean 0 and variance 10 is added at the bounding-box centers, and the network generates 21 separate 32×32×1 joint heat maps.
S33: the 3D estimates of the 21 2D joint heat maps are then obtained; the specific method is as follows:
S34: first define a three-dimensional hand joint coordinate set $w_i = (x_i, y_i, z_i)$, $i \in [1, J]$, $J = 21$.
S35: using the hand 3D database obtained in step S212, a fully convolutional neural network is trained with an $L_2$ loss; the network uses a (convolutional layer + ReLU activation layer) + fully connected layer structure.
S36: using the prior knowledge obtained from the fully convolutional network trained in S35, a regularized coordinate set is established for each keypoint of the 2D hand image, with the formula:
$$S = \lVert w_{k+1} - w_k \rVert \qquad (12)$$
where $k \in [1, 20]$.
S37: a relative coordinate system is established to eliminate the relative distortion of joint positions caused by factors such as hand-size differences. In this example the first joint of the index finger is taken as the root node, i.e. $r = 1$; the relative position of each remaining node with respect to the first index-finger joint is then found with equation (13):
$$w_i^{rel} = \frac{w_i - w_r}{S} \qquad (13)$$
where $w_r$ is the first joint node of the index finger.
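The following numpy sketch illustrates the normalization of equations (12) and (13): joints are expressed relative to the root joint $w_r$ and divided by the reference bone length S; the joint indices chosen for the root and the reference bone are illustrative assumptions.

```python
import numpy as np

def normalize_hand(w, root=1, bone=(1, 2)):
    """w: (21, 3) 3D joint coordinates. Returns scale-free coordinates relative
    to the root joint (eq. 13), normalized by the reference bone length (eq. 12)."""
    S = np.linalg.norm(w[bone[1]] - w[bone[0]])   # eq. (12)
    return (w - w[root]) / S                      # eq. (13)

w_rel = normalize_hand(np.random.rand(21, 3))
```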
It should be noted that, besides the foregoing structure, the hand joint 2D-3D mapping network can also be realized with other existing neural network structures.
S4: calibrating the depth image and keypoint-annotated image of the same angle of the human body, then applying a three-dimensional point-cloud coloring conversion to the corresponding depth image to obtain a colored depth image.
In step S4, the step of obtaining the colored depth image specifically includes:
calibrating the depth image and keypoint-annotated image of the same angle of the human body using the checkerboard method;
matching the keypoint-annotated image and depth image of the same angle;
resizing the matched depth image and voxelizing the point cloud.
The present invention calibrates the depth image and the keypoint-annotated image of the same angle with the checkerboard method, so the coordinate information of the keypoints in the image can be obtained accurately.
S41: the RGB camera of the Kinect is calibrated with the checkerboard method, and the RGB intrinsics are computed with the Matlab Camera Calibration Toolbox.
S42: the depth camera of the Kinect is calibrated with the checkerboard method, and the depth-camera intrinsics are computed with the Matlab Camera Calibration Toolbox.
S43: the 2D RGB camera and the 3D depth camera are registered; the specific steps are as follows:
S44: the depth-image space coordinate system is established with equation (14):
$$P_{ir} = H_{ir}^{-1}\, p_{ir} \qquad (14)$$
where $P_{ir}$ is the space coordinate of a point in the depth-camera frame, $p_{ir}$ is the projection coordinate of that point in the image plane (x and y in pixels, z the depth value in millimeters), and $H_{ir}$ is the intrinsic matrix of the depth camera.
S45: the space coordinates for the RGB camera are established with equations (15) and (16):
$$P_{rgb} = R\, P_{ir} + T \qquad (15)$$
$$p_{rgb} = H_{rgb}\, P_{rgb} \qquad (16)$$
where $P_{rgb}$ is the space coordinate of the same point in the RGB-camera frame, $p_{rgb}$ is the projection coordinate of that point in the RGB image plane, $H_{rgb}$ is the intrinsic matrix of the RGB camera, $R$ is the rotation matrix, and $T$ is the translation vector.
S46: using the camera extrinsic matrices, points in the global (world) coordinate system are transformed into each camera frame, as in equation (17):
$$P_{ir} = R_{ir} P_w + T_{ir}, \qquad P_{rgb} = R_{rgb} P_w + T_{rgb} \qquad (17)$$
from which $R = R_{rgb} R_{ir}^{-1}$ and $T = T_{rgb} - R\, T_{ir}$, where the rotation matrix $R_{ir}$ ($R_{rgb}$) and translation vector $T_{ir}$ ($T_{rgb}$) form the extrinsic matrix of the depth (RGB) camera.
S47: the registered image is resized into a 64×64×64 voxelized point-cloud matrix.
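The following numpy sketch chains equations (14)-(16) to map one depth pixel into RGB pixel coordinates; the matrices below are placeholders that would come from the checkerboard calibration of S41-S42, and the final comment shows how R and T could be derived from the extrinsics as in equation (17).

```python
import numpy as np

def register_depth_pixel(x, y, z, H_ir, H_rgb, R, T):
    """Map a depth pixel (x, y in pixels, depth z in mm) to RGB pixel coordinates."""
    p_ir = np.array([x * z, y * z, z], dtype=np.float64)
    P_ir = np.linalg.inv(H_ir) @ p_ir     # eq. (14): 3D point in the depth frame
    P_rgb = R @ P_ir + T                  # eq. (15): into the RGB camera frame
    p_rgb = H_rgb @ P_rgb                 # eq. (16): project with RGB intrinsics
    return p_rgb[:2] / p_rgb[2]

# Placeholder calibration values; real ones come from the checkerboard steps.
H = np.array([[580.0, 0.0, 320.0], [0.0, 580.0, 240.0], [0.0, 0.0, 1.0]])
R, T = np.eye(3), np.array([25.0, 0.0, 0.0])
# eq. (17) would give R = R_rgb @ inv(R_ir) and T = T_rgb - R @ T_ir
print(register_depth_pixel(100, 120, 1500.0, H, H, R, T))
```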
S5: based on the keypoint-annotated image and the colored depth image, predicting the corresponding positions of the annotated skeleton keypoints in the depth image with the preset learning network.
In step S5, the preset learning network is a U-shaped reinforcement learning network.
A U-shaped reinforcement learning network learns a mapping from environment states to actions so that the actions selected by the agent obtain the maximum reward from the environment; in other words, the external environment's evaluation of the learning system (or the performance of the whole system) is, in some sense, optimal.
The structure of the U-shaped reinforcement learning network is as follows: the input undergoes a preset number of convolution operations and a preset number of pooling operations (max-pool downsampling), each convolution followed by a ReLU activation layer; this is repeated several times, and the number of convolution filters is multiplied accordingly after each downsampling.
The result obtained after downsampling then undergoes a preset number of deconvolution (upsampling) operations with a preset stride, each convolution followed by a ReLU activation layer; this is repeated several times, with the number of filters reduced by the corresponding factor at each upsampling; the result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again.
Finally the corresponding result is output.
It should be noted that the preset learning network may also be a Q-type reinforcement learning network.
Specifically, the U-shaped reinforcement learning network structure, as shown in Fig. 5, is:
S52: two 3×3×3 convolution operations and one 2×2×2 pooling operation (max-pool downsampling) are applied to the input formed from the outputs of steps S2 and S4, each convolution followed by a ReLU activation layer; this is repeated 4 times, and the number of convolution filters doubles after each downsampling.
S53: two 3×3 convolution operations and one deconvolution (upsampling) operation of stride 2×2 are applied to the result obtained after downsampling, each convolution followed by a ReLU activation layer; this is repeated 4 times, with the number of filters halved at each upsampling; the result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again, at which point the number of convolution filters is halved.
S54: the keypoint confidence maps in the point cloud are output.
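A minimal PyTorch sketch of the U-shaped structure of S52-S54 follows, using 3×3×3 convolutions, 2×2×2 max pooling, doubled filters on the way down, and transposed-convolution upsampling with skip concatenation; only two resolution levels are shown (the text repeats four), and the channel counts, input channels, and keypoint count are illustrative assumptions. The reinforcement-learning training procedure is omitted.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    """Two 3x3x3 convolutions, each followed by a ReLU activation."""
    return nn.Sequential(
        nn.Conv3d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv3d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class UNet3D(nn.Module):
    def __init__(self, cin=4, base=8, n_keypoints=15):
        super().__init__()
        self.enc1, self.enc2 = block(cin, base), block(base, base * 2)
        self.pool = nn.MaxPool3d(2)                      # 2x2x2 max-pool downsampling
        self.mid = block(base * 2, base * 4)
        self.up2 = nn.ConvTranspose3d(base * 4, base * 2, 2, stride=2)
        self.dec2 = block(base * 4, base * 2)            # after skip concatenation
        self.up1 = nn.ConvTranspose3d(base * 2, base, 2, stride=2)
        self.dec1 = block(base * 2, base)
        self.out = nn.Conv3d(base, n_keypoints, 1)

    def forward(self, x):
        e1 = self.enc1(x)                                # 64^3
        e2 = self.enc2(self.pool(e1))                    # 32^3, filters doubled
        m = self.mid(self.pool(e2))                      # 16^3
        d2 = self.dec2(torch.cat([self.up2(m), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.out(d1)                              # keypoint confidence volumes

vol = torch.zeros(1, 4, 64, 64, 64)   # assumed channels: colored occupancy volume
heatmaps = UNet3D()(vol)
```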
S6: fusing the outputs of steps S3 and S5 to achieve a refined three-dimensional human body pose estimate.
The three-dimensional human body pose estimation method of the invention acquires depth images and RGB color images of the human body from different angles with a monocular camera, which removes the usage-condition limits of multi-camera setups in the field of human pose estimation; the method is easier to realize and can accurately recognize the three-dimensional human pose in a depth image.
As shown in Fig. 6, the technical solution of the three-dimensional human body pose estimation device of the invention is as follows:
A three-dimensional human body pose estimation device, comprising:
(1) an image acquisition unit, which acquires depth images and RGB color images of the human body from different angles with a monocular camera.
The monocular camera can be implemented with a Kinect camera.
The Kinect is more intelligent than an ordinary camera: it emits infrared light to perform stereoscopic localization of the entire room, and its camera recognizes human movement through the infrared returns; in addition, combined with high-end software on the Xbox 360, it can track 48 positions on the human body in real time.
It should be noted that, besides the Kinect, the monocular camera can also be implemented with other existing monocular cameras.
(2) a keypoint annotation unit, which constructs a human skeleton keypoint detection neural network based on the RGB color images to obtain keypoint-annotated images.
The keypoint annotation unit comprises:
a data set construction subunit, which annotates the human skeleton keypoints in the RGB color images and constructs a data set;
a neural network training subunit, which divides the constructed data set into a training set and a test set and feeds the training set into a preset human skeleton keypoint detection neural network for training;
a neural network testing subunit, which tests the trained human skeleton keypoint detection neural network with the test set until it reaches the preset requirement.
In the keypoint annotation unit, the data set for training the human skeleton keypoint detection neural network is formed by annotating skeleton keypoints on the acquired RGB color images; in this way a skeleton keypoint detection neural network meeting the preset requirement can be obtained quickly and accurately. The preset requirement is that the precision of the skeleton keypoints output by the network lies within a preset accuracy range.
The human skeleton keypoint detection neural network can be composed of a VGG-19 network followed by T stages (T a positive integer greater than or equal to 1), each stage consisting of two fully convolutional network branches.
VGG (Visual Geometry Group) is the Visual Geometry Group of the Department of Science and Engineering at Oxford University, which has released a series of convolutional network models beginning with VGG.
It should be noted that the human skeleton keypoint detection neural network may also be another existing neural network model.
(3) a hand recognition unit, which constructs the hand joint 2D-3D mapping network based on the corresponding RGB color image and keypoint-annotated image.
In the hand recognition unit, the constructed hand joint 2D-3D mapping network outputs a hand segmentation image; the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max-pooling layer + bilinear upsampling.
The loss function of the above hand joint 2D-3D mapping network uses softmax with a cross-entropy loss.
In the present invention, converting the 2D hand detection problem into a segmentation problem eliminates the effect of the size differences of different hands on network accuracy.
It should be noted that, besides the foregoing structure, the hand joint 2D-3D mapping network can also be realized with other existing neural network structures.
(4) a depth image coloring unit, which calibrates the depth image and keypoint-annotated image of the same angle of the human body and then applies a three-dimensional point-cloud coloring conversion to the corresponding depth image to obtain a colored depth image.
The depth image coloring unit comprises:
a calibration subunit, which calibrates the depth image and keypoint-annotated image of the same angle of the human body using the checkerboard method;
a matching subunit, which matches the keypoint-annotated image and depth image of the same angle;
a point-cloud voxelization subunit, which resizes the matched depth image and voxelizes the point cloud.
The present invention calibrates the depth image and the keypoint-annotated image of the same angle with the checkerboard method, so the coordinate information of the keypoints in the image can be obtained accurately.
(5) a depth image keypoint prediction unit, which, based on the keypoint-annotated image and the colored depth image, predicts the corresponding positions of the annotated skeleton keypoints in the depth image with a preset learning network.
In the depth image keypoint prediction unit, the preset learning network is a U-shaped reinforcement learning network.
A U-shaped reinforcement learning network learns a mapping from environment states to actions so that the actions selected by the agent obtain the maximum reward from the environment; in other words, the external environment's evaluation of the learning system (or the performance of the whole system) is, in some sense, optimal.
The structure of the U-shaped reinforcement learning network is as follows: the input undergoes a preset number of convolution operations and a preset number of pooling operations (max-pool downsampling), each convolution followed by a ReLU activation layer; this is repeated several times, and the number of convolution filters is multiplied accordingly after each downsampling.
The result obtained after downsampling then undergoes a preset number of deconvolution (upsampling) operations with a preset stride, each convolution followed by a ReLU activation layer; this is repeated several times, with the number of filters reduced by the corresponding factor at each upsampling; the result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again.
Finally the corresponding result is output.
It should be noted that the preset learning network may also be a Q-type reinforcement learning network.
(6) a three-dimensional pose estimation unit, which fuses the outputs of the hand recognition unit and the depth image keypoint prediction unit to achieve a refined three-dimensional human body pose estimate.
The three-dimensional human body pose estimation device of the invention acquires depth images and RGB color images of the human body from different angles with a monocular camera, which removes the usage-condition limits of multi-camera setups in the field of human pose estimation; the device is easier to realize and can accurately recognize the three-dimensional human pose in a depth image.
Those skilled in the art should understand that embodiments of the present invention can be provided as a method, a system, a device, or a computer program product. Therefore, the invention can take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the invention can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations thereof, can be realized by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor create means for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of guiding a computer or another programmable data processing device to work in a specific way, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that realize the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Those of ordinary skill in the art will appreciate that all or part of the processes in the above embodiment methods can be completed by instructing the relevant hardware through a computer program; the program can be stored in a computer-readable storage medium, and when executed it may include the processes of the embodiments of the above methods. The storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
Although the specific embodiments of the present invention are described above with reference to the accompanying drawings, they do not limit the protection scope of the present invention. Those skilled in the art should understand that, based on the technical solutions of the present invention, various modifications or variations that can be made without creative labor still fall within the protection scope of the present invention.

Claims (10)

1. A three-dimensional human body pose estimation method, characterized by comprising:
S1: acquiring depth images and RGB color images of the human body from different angles with a monocular camera;
S2: constructing a human skeleton keypoint detection neural network based on the RGB color images to obtain keypoint-annotated images;
S3: constructing a hand joint 2D-3D mapping network based on the corresponding RGB color image and keypoint-annotated image;
S4: calibrating the depth image and keypoint-annotated image of the same angle of the human body, then applying a three-dimensional point-cloud coloring conversion to the corresponding depth image to obtain a colored depth image;
S5: based on the keypoint-annotated image and the colored depth image, predicting the corresponding positions of the annotated skeleton keypoints in the depth image with a preset learning network;
S6: fusing the outputs of steps S3 and S5 to achieve a refined three-dimensional human body pose estimate.
2. The three-dimensional human body pose estimation method according to claim 1, characterized in that constructing the human skeleton keypoint detection neural network based on the RGB color images in step S2 specifically includes:
annotating the human skeleton keypoints in the RGB color images to construct a data set;
dividing the constructed data set into a training set and a test set, and feeding the training set into a preset human skeleton keypoint detection neural network for training;
testing the trained human skeleton keypoint detection neural network with the test set until it reaches a preset requirement.
3. The three-dimensional human body pose estimation method according to claim 1, characterized in that, in step S3, the constructed hand joint 2D-3D mapping network outputs a hand segmentation image, and the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max-pooling layer + bilinear upsampling.
4. The three-dimensional human body pose estimation method according to claim 1, characterized in that, in step S4, the step of obtaining the colored depth image specifically includes:
calibrating the depth image and keypoint-annotated image of the same angle of the human body using the checkerboard method;
matching the keypoint-annotated image and depth image of the same angle;
resizing the matched depth image and voxelizing the point cloud.
5. The three-dimensional human body pose estimation method according to claim 1, characterized in that, in step S5, the preset learning network is a U-shaped reinforcement learning network.
6. A three-dimensional human body pose estimation device, characterized by comprising:
an image acquisition unit, which acquires depth images and RGB color images of the human body from different angles with a monocular camera;
a keypoint annotation unit, which constructs a human skeleton keypoint detection neural network based on the RGB color images to obtain keypoint-annotated images;
a hand recognition unit, which constructs a hand joint 2D-3D mapping network based on the corresponding RGB color image and keypoint-annotated image;
a depth image coloring unit, which calibrates the depth image and keypoint-annotated image of the same angle of the human body and then applies a three-dimensional point-cloud coloring conversion to the corresponding depth image to obtain a colored depth image;
a depth image keypoint prediction unit, which, based on the keypoint-annotated image and the colored depth image, predicts the corresponding positions of the annotated skeleton keypoints in the depth image with a preset learning network;
a three-dimensional pose estimation unit, which fuses the outputs of the hand recognition unit and the depth image keypoint prediction unit to achieve a refined three-dimensional human body pose estimate.
7. The three-dimensional human body pose estimation device according to claim 6, characterized in that the keypoint annotation unit comprises:
a data set construction subunit, which annotates the human skeleton keypoints in the RGB color images and constructs a data set;
a neural network training subunit, which divides the constructed data set into a training set and a test set and feeds the training set into a preset human skeleton keypoint detection neural network for training;
a neural network testing subunit, which tests the trained human skeleton keypoint detection neural network with the test set until it reaches a preset requirement.
8. The three-dimensional human body pose estimation device according to claim 6, characterized in that, in the hand recognition unit, the constructed hand joint 2D-3D mapping network outputs a hand segmentation image, and the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max-pooling layer + bilinear upsampling.
9. The three-dimensional human body pose estimation device according to claim 6, characterized in that the depth image coloring unit comprises:
a calibration subunit, which calibrates the depth image and keypoint-annotated image of the same angle of the human body using the checkerboard method;
a matching subunit, which matches the keypoint-annotated image and depth image of the same angle;
a point-cloud voxelization subunit, which resizes the matched depth image and voxelizes the point cloud.
10. The three-dimensional human body pose estimation device according to claim 6, wherein in the depth image key point prediction unit, the preset learning network is a U-shaped reinforcement learning network.
CN201810426144.1A 2018-05-07 2018-05-07 Three-dimensional human body pose estimation method and device Active CN108830150B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810426144.1A CN108830150B (en) Three-dimensional human body pose estimation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810426144.1A CN108830150B (en) Three-dimensional human body pose estimation method and device

Publications (2)

Publication Number Publication Date
CN108830150A CN108830150A (en) 2018-11-16
CN108830150B true CN108830150B (en) 2019-05-28

Family

ID=64147503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810426144.1A Active CN108830150B (en) Three-dimensional human body pose estimation method and device

Country Status (1)

Country Link
CN (1) CN108830150B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11809616B1 (en) 2022-06-23 2023-11-07 Qing Zhang Twin pose detection method and system based on interactive indirect inference

Families Citing this family (59)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111222379A (en) * 2018-11-27 2020-06-02 株式会社日立制作所 Hand detection method and device
CN109684943B (en) * 2018-12-07 2021-03-16 北京首钢自动化信息技术有限公司 Athlete auxiliary training data acquisition method and device and electronic equipment
CN109815813B (en) * 2018-12-21 2021-03-05 深圳云天励飞技术有限公司 Image processing method and related product
CN109871123B (en) * 2019-01-21 2022-08-16 广东精标科技股份有限公司 Teaching method based on gesture or eye control
CN109886986B (en) * 2019-01-23 2020-09-08 北京航空航天大学 Dermatoscope image segmentation method based on multi-branch convolutional neural network
CN109920208A (en) * 2019-01-31 2019-06-21 深圳绿米联创科技有限公司 Tumble prediction technique, device, electronic equipment and system
CN109934111B (en) * 2019-02-12 2020-11-24 清华大学深圳研究生院 Fitness posture estimation method and system based on key points
CN109949368B (en) * 2019-03-14 2020-11-06 郑州大学 Human body three-dimensional attitude estimation method based on image retrieval
CN110032992B (en) * 2019-04-25 2023-05-23 沈阳图为科技有限公司 Examination cheating detection method based on gestures
CN110175528B (en) * 2019-04-29 2021-10-26 北京百度网讯科技有限公司 Human body tracking method and device, computer equipment and readable medium
CN111914595B (en) * 2019-05-09 2022-11-15 中国科学院软件研究所 Human hand three-dimensional attitude estimation method and device based on color image
CN110119148B (en) * 2019-05-14 2022-04-29 深圳大学 Six-degree-of-freedom attitude estimation method and device and computer readable storage medium
CN110188633B (en) * 2019-05-14 2023-04-07 广州虎牙信息科技有限公司 Human body posture index prediction method and device, electronic equipment and storage medium
CN110135375B (en) * 2019-05-20 2021-06-01 中国科学院宁波材料技术与工程研究所 Multi-person attitude estimation method based on global information integration
CN110176016B (en) * 2019-05-28 2021-04-30 招远市国有资产经营有限公司 Virtual fitting method based on human body contour segmentation and skeleton recognition
CN110197156B (en) * 2019-05-30 2021-08-17 清华大学 Single-image human hand action and shape reconstruction method and device based on deep learning
CN112102223B (en) * 2019-06-18 2024-05-14 通用电气精准医疗有限责任公司 Method and system for automatically setting scan range
CN110298916B (en) * 2019-06-21 2022-07-01 湖南大学 Three-dimensional human body reconstruction method based on synthetic depth data
CN110472476B (en) * 2019-06-24 2024-06-28 平安科技(深圳)有限公司 Gesture matching degree acquisition method, device, computer and storage medium
CN110472481B (en) * 2019-07-01 2024-01-05 华南师范大学 Sleeping gesture detection method, device and equipment
CN110495889B (en) * 2019-07-04 2022-05-27 平安科技(深圳)有限公司 Posture evaluation method, electronic device, computer device, and storage medium
CN110428493B (en) * 2019-07-12 2021-11-02 清华大学 Single-image human body three-dimensional reconstruction method and system based on grid deformation
CN110348524B (en) * 2019-07-15 2022-03-04 深圳市商汤科技有限公司 Human body key point detection method and device, electronic equipment and storage medium
CN110427917B (en) * 2019-08-14 2022-03-22 北京百度网讯科技有限公司 Method and device for detecting key points
CN110555412B (en) * 2019-09-05 2023-05-16 深圳龙岗智能视听研究院 End-to-end human body gesture recognition method based on combination of RGB and point cloud
CN110728739B (en) * 2019-09-30 2023-04-14 杭州师范大学 Virtual human control and interaction method based on video stream
CN111079523B (en) * 2019-11-05 2024-05-14 北京迈格威科技有限公司 Object detection method, device, computer equipment and storage medium
CN111027407B (en) * 2019-11-19 2023-04-07 东南大学 Color image hand posture estimation method for shielding situation
CN111062326B (en) * 2019-12-02 2023-07-25 北京理工大学 Self-supervision human body 3D gesture estimation network training method based on geometric driving
CN111028283B (en) * 2019-12-11 2024-01-12 北京迈格威科技有限公司 Image detection method, device, equipment and readable storage medium
CN113012091A (en) * 2019-12-20 2021-06-22 中国科学院沈阳计算技术研究所有限公司 Impeller quality detection method and device based on multi-dimensional monocular depth estimation
CN111179419B (en) * 2019-12-31 2023-09-05 北京奇艺世纪科技有限公司 Three-dimensional key point prediction and deep learning model training method, device and equipment
CN111160375B (en) * 2019-12-31 2024-01-23 北京奇艺世纪科技有限公司 Three-dimensional key point prediction and deep learning model training method, device and equipment
CN111429499B (en) * 2020-02-24 2023-03-10 中山大学 High-precision three-dimensional reconstruction method for hand skeleton based on single depth camera
CN113382154A (en) * 2020-02-25 2021-09-10 荣耀终端有限公司 Human body image beautifying method based on depth and electronic equipment
CN111046858B (en) * 2020-03-18 2020-09-08 成都大熊猫繁育研究基地 Image-based animal species fine classification method, system and medium
CN113449565A (en) * 2020-03-27 2021-09-28 海信集团有限公司 Three-dimensional attitude estimation method, intelligent device and storage medium
CN111582204A (en) * 2020-05-13 2020-08-25 北京市商汤科技开发有限公司 Attitude detection method and apparatus, computer device and storage medium
CN111753669A (en) * 2020-05-29 2020-10-09 广州幻境科技有限公司 Hand data identification method, system and storage medium based on graph convolution network
CN111753747B (en) * 2020-06-28 2023-11-24 高新兴科技集团股份有限公司 Violent motion detection method based on monocular camera and three-dimensional attitude estimation
CN111753801A (en) * 2020-07-02 2020-10-09 上海万面智能科技有限公司 Human body posture tracking and animation generation method and device
CN111968235B (en) * 2020-07-08 2024-04-12 杭州易现先进科技有限公司 Object attitude estimation method, device and system and computer equipment
CN112076073A (en) * 2020-07-27 2020-12-15 深圳瀚维智能医疗科技有限公司 Automatic massage area dividing method and device, massage robot and storage medium
CN112069933A (en) * 2020-08-21 2020-12-11 董秀园 Skeletal muscle stress estimation method based on posture recognition and human body biomechanics
CN111881887A (en) * 2020-08-21 2020-11-03 董秀园 Multi-camera-based motion attitude monitoring and guiding method and device
CN112107318B (en) * 2020-09-24 2024-02-27 自达康(北京)科技有限公司 Physical activity ability evaluation system
CN112287866B (en) * 2020-11-10 2024-05-31 上海依图网络科技有限公司 Human body action recognition method and device based on human body key points
CN112116653B (en) * 2020-11-23 2021-03-30 华南理工大学 Object posture estimation method for multiple RGB pictures
CN112836594B (en) * 2021-01-15 2023-08-08 西北大学 Three-dimensional hand gesture estimation method based on neural network
CN112766153B (en) * 2021-01-19 2022-03-11 合肥工业大学 Three-dimensional human body posture estimation method and system based on deep learning
CN112836824B (en) * 2021-03-04 2023-04-18 上海交通大学 Monocular three-dimensional human body pose unsupervised learning method, system and medium
CN113112583B (en) * 2021-03-22 2023-06-20 成都理工大学 3D human body reconstruction method based on infrared thermal imaging
CN113158910A (en) * 2021-04-25 2021-07-23 北京华捷艾米科技有限公司 Human skeleton recognition method and device, computer equipment and storage medium
CN113362452B (en) * 2021-06-07 2022-11-15 中南大学 Hand posture three-dimensional reconstruction method and device and storage medium
CN113609993A (en) * 2021-08-06 2021-11-05 烟台艾睿光电科技有限公司 Attitude estimation method, device and equipment and computer readable storage medium
CN113762177A (en) * 2021-09-13 2021-12-07 成都市谛视科技有限公司 Real-time human body 3D posture estimation method and device, computer equipment and storage medium
CN113689503B (en) * 2021-10-25 2022-02-25 北京市商汤科技开发有限公司 Target object posture detection method, device, equipment and storage medium
CN114821819B (en) * 2022-06-30 2022-09-23 南通同兴健身器材有限公司 Real-time monitoring method for body-building action and artificial intelligence recognition system
CN116797625B (en) * 2023-07-20 2024-04-19 无锡埃姆维工业控制设备有限公司 Monocular three-dimensional workpiece pose estimation method


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103597515A (en) * 2011-06-06 2014-02-19 微软公司 System for recognizing an open or closed hand
CN102855470A (en) * 2012-07-31 2013-01-02 中国科学院自动化研究所 Estimation method of human posture based on depth image
CN102982557A (en) * 2012-11-06 2013-03-20 桂林电子科技大学 Method for processing space hand signal gesture command based on depth camera
CN104715493A (en) * 2015-03-23 2015-06-17 北京工业大学 Moving body posture estimating method
CN105069423A (en) * 2015-07-29 2015-11-18 北京格灵深瞳信息技术有限公司 Human body posture detection method and device
CN106570903A (en) * 2016-10-13 2017-04-19 华南理工大学 Visual identification and positioning method based on RGB-D camera
CN107066935A (en) * 2017-01-25 2017-08-18 网易(杭州)网络有限公司 Hand gestures method of estimation and device based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Human Action Recognition Based on Kinect Skeleton Information; Liu Fei; China Master's Theses Full-text Database, Information Science and Technology; 2014-06-15 (No. 06); I138-955


Also Published As

Publication number Publication date
CN108830150A (en) 2018-11-16

Similar Documents

Publication Publication Date Title
CN108830150B (en) Three-dimensional human body pose estimation method and device
CN111126272B (en) Posture acquisition method, and training method and device of key point coordinate positioning model
CN105787439B (en) Depth image human joint localization method based on convolutional neural networks
CN104715493B (en) Moving human body pose estimation method
CN105069746B (en) Video real-time face replacement method and its system based on local affine invariant and color transfer technology
CN105631861B (en) Method for recovering three-dimensional human pose from unmarked monocular images combined with height maps
CN104978580B (en) Insulator recognition method for UAV inspection of power transmission lines
CN102855470B (en) Estimation method of human posture based on depth image
CN100543775C (en) 3D human motion tracking method based on multi-view cameras
CN104794737B (en) Depth-information-assisted particle filter tracking method
CN107767419A (en) Human skeleton key point detection method and device
CN104036488B (en) Binocular vision-based human body posture and action research method
CN107423730A (en) Active human gait behavior detection and recognition system and method based on semantic folding
Nguyen et al. Static hand gesture recognition using artificial neural network
CN106997605A (en) Method for obtaining a three-dimensional foot shape by collecting foot video and sensor data with a smartphone
CN101520902A (en) System and method for low cost motion capture and demonstration
CN106023211A (en) Robot image positioning method and system based on deep learning
CN109087245A (en) Unmanned aerial vehicle remote sensing image mosaic system based on neighbouring relations model
CN104537705A (en) Augmented reality based mobile platform three-dimensional biomolecule display system and method
CN111160294A (en) Gait recognition method based on graph convolution network
CN109000655A (en) Bionic robot indoor positioning and navigation method
CN102289822A (en) Method for tracking moving target collaboratively by multiple cameras
CN114036969A (en) 3D human body action recognition algorithm under multi-view condition
CN117557755B (en) Virtual scene secondary normal school biochemical body and clothing visualization method and system
Liu et al. Key algorithm for human motion recognition in virtual reality video sequences based on hidden markov model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210415

Address after: 102300 No.1 Qiaoyuan Road, Mentougou District, Beijing

Patentee after: Beijing Micro-Chain Daoi Technology Co.,Ltd.

Address before: 250014 No. 88, Wenhua East Road, Lixia District, Shandong, Ji'nan

Patentee before: SHANDONG NORMAL University
