CN108830150B - Three-dimensional human body pose estimation method and device - Google Patents
Three-dimensional human body pose estimation method and device
- Publication number
- CN108830150B CN108830150B CN201810426144.1A CN201810426144A CN108830150B CN 108830150 B CN108830150 B CN 108830150B CN 201810426144 A CN201810426144 A CN 201810426144A CN 108830150 B CN108830150 B CN 108830150B
- Authority
- CN
- China
- Prior art keywords
- image
- human body
- key point
- depth image
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a three-dimensional human body pose estimation method and device. The method includes: S1: acquiring depth images and RGB color images of the human body from different angles with a monocular camera; S2: constructing a human skeleton key point detection neural network based on the RGB color images to obtain key point annotation images; S3: constructing a hand joint 2D-3D mapping network; S4: calibrating the depth image and the key point annotation image taken at the same angle of the human body, then performing three-dimensional point cloud coloring conversion on the corresponding depth image to obtain a colored depth image; S5: predicting, based on the key point annotation images and the colored depth images, the positions in the depth image of the annotated human skeleton key points using a preset learning network; S6: fusing the outputs of steps S3 and S5 to achieve refined three-dimensional human body pose estimation.
Description
Technical field
The invention belongs to the application fields of computer vision, image processing, computer graphics, and deep learning, and in particular relates to a three-dimensional human body pose estimation method and device.
Background art
Human body pose estimation refers to matching abstract hierarchical features against a human body model in order to obtain the pose of the target at different moments. It is a key problem in human motion capture. The pose representation of the human body comprises two aspects: first, the position and orientation of the whole body in world coordinates; second, the angles of the body joints and the skin deformation influenced by those joint angles. The main application fields of human motion pose estimation fall into three general directions: surveillance, control, and analysis:
(1) In surveillance, traditional applications include automatically detecting and locating pedestrians in airports or subways, counting people, and analyzing crowd flow and congestion. With increasing safety awareness, novel applications have appeared in recent years, such as analyzing the behavior and motion of individuals or crowds, for example detecting irregular behavior while queuing or recognizing activities while shopping.
(2) In control, motion estimation results or pose parameters are used to control a target. These applications are most common in human-computer interaction and are increasingly used in the entertainment industry, for example in film and game animation. The captured shape, appearance, and motion of a person can be used to make 3D films or to reconstruct the person's three-dimensional model in a game.
(3) Analysis applications include the automatic diagnosis of surgical patients and the analysis and improvement of athletes' movements. In visual media there are applications such as content-based video retrieval and video compression. The automotive industry also has related applications, such as automatic airbag control, drowsiness detection, and pedestrian detection.
The relatively mature human motion capture systems currently on the market are based on electromechanical, electromagnetic, or special optical markers. Magnetic or optical markers are attached to a person's limbs, and their three-dimensional trajectories are used to describe the target's motion. These systems are automatic, but their drawback is that the equipment is cumbersome and expensive and therefore cannot be widely adopted.
Human motion capture based on computer vision has therefore become a research hotspot. Using the basic principles of computer vision, it extracts 3D human motion sequences directly from video. This approach needs no sensors attached to the human joints, guarantees unconstrained motion, and is low-cost and efficient. Currently popular methods mostly adopt matching techniques based on a human body model. The goal of such methods is to find a set of pose parameters in the state space such that the human pose corresponding to these parameters best matches the low-level features extracted from the observed image.
In computer-vision-based motion tracking, the commonly used research approach is as follows: at the beginning of tracking, the position of the human body is determined in the first frame of the image sequence; in subsequent frames, the human target is determined by relying on the continuity of human motion and on kinematic constraints. There are two methods for determining the body position in the first frame:
First, manually specify the initial pose of the target, or set the human model to the approximate pose of the first frame; this is unfavorable for automating human tracking.
Second, remove the background around the human body and then determine each body part with part detection methods; this method can partially achieve automation but requires figure-ground segmentation.
In the subsequent human tracking and 3D pose estimation, there are model-based and model-free methods. Among them:
(1) Traditional model-based methods build a 3D human model in advance and match the model to the first frame of the motion sequence; in subsequent tracking, constraints such as kinematic parameter limits are used together with optimization methods such as gradient descent or stochastic sampling to further estimate the model parameters of each frame, thereby obtaining the model motion sequence. The drawback of this approach is that the tracking of subsequent frames accumulates error, so long-term tracking easily goes wrong.
(2) Model-free methods need no human model; instead, they estimate the human motion pose with learning-based or example-based methods according to the geometry, texture, color, and other information presented by the human motion. The drawback of this approach is that the human motion pose is hard to describe with a limited set of states, depends on prior knowledge, and only specific sets of behaviors can be tracked.
Both model-based and model-free tracking can be realized with either a monocular camera or multiple cameras. Because reconstruction from ordinary images without depth information suffers from the ambiguity of the three-dimensional-to-two-dimensional mapping, and pose estimation for complex motion is extremely difficult, most human motion tracking technologies of the past decade or more have been realized under multi-camera conditions in order to obtain depth information. However, multi-camera setups require calibration and are inconvenient to arrange in ordinary households, which hinders the popularization of motion capture technology into everyday homes.
In conclusion for the limitation of multi-lens camera use condition in the prior art and in order to quickly and easily identify
Depth image out needs a kind of effective solution scheme.
Summary of the invention
In order to overcome the deficiencies of the prior art, the first object of the present invention is to provide a three-dimensional human body pose estimation method that can accurately recognize the three-dimensional human pose in a depth image.
The technical solution of the three-dimensional human body pose estimation method of the invention is as follows:
A three-dimensional human body pose estimation method, comprising:
S1: acquiring depth images and RGB color images of the human body from different angles with a monocular camera;
S2: constructing a human skeleton key point detection neural network based on the RGB color images to obtain key point annotation images;
S3: constructing a hand joint 2D-3D mapping network based on the corresponding RGB color images and key point annotation images;
S4: calibrating the depth image and the key point annotation image taken at the same angle of the human body, then performing three-dimensional point cloud coloring conversion on the corresponding depth image to obtain a colored depth image;
S5: predicting, based on the key point annotation images and the colored depth images, the positions in the depth image of the annotated human skeleton key points using a preset learning network;
S6: fusing the outputs of steps S3 and S5 to achieve refined three-dimensional human body pose estimation.
In step S1, the monocular camera can be realized with a Kinect camera.
The Kinect is more intelligent than an ordinary camera. First, it emits infrared light to perform stereoscopic localization of the entire room, and its camera can then recognize human motion through the infrared light. In addition, combined with some high-end software on the Xbox 360, it can track 48 positions of the human body in real time.
It should be noted that besides the Kinect, the monocular camera can also be realized with other existing monocular cameras.
Further, the construction of the human skeleton key point detection neural network based on the RGB color images in step S2 specifically includes:
annotating the human skeleton key points in the RGB color images and constructing a data set;
dividing the constructed data set into a training set and a test set, and inputting the training set into a preset human skeleton key point detection neural network for training;
testing the trained human skeleton key point detection neural network with the test set until the preset requirement is reached.
In step S2, the data set for training the human skeleton key point detection neural network is formed by annotating the skeleton key points in the acquired RGB color images; in this way, a human skeleton key point detection neural network meeting the preset requirement can be obtained quickly and accurately. The preset requirement is that the precision of the skeleton key points output by the network lies within a preset accuracy range.
The human skeleton key point detection neural network can consist of a VGG-19 network followed by T stages (T is a positive integer greater than or equal to 1), where each stage is composed of 2 fully convolutional network branches.
VGG (Visual Geometry Group) is a research group of the Department of Engineering Science of the University of Oxford that has released a series of convolutional network models beginning with "VGG".
It should be noted that the human skeleton key point detection neural network may also be another existing neural network model.
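As an illustration only (not part of the patent), a minimal PyTorch sketch of this topology, assuming torchvision's VGG-19 for the first 10 convolutional layers and hypothetical joint, limb, and channel counts, might look as follows:

```python
import torch
import torch.nn as nn
import torchvision

class TwoBranchStage(nn.Module):
    """One stage: two fully convolutional branches (confidence maps, limb fields)."""
    def __init__(self, in_ch, n_joints, n_limbs):
        super().__init__()
        def branch(out_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 128, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(128, out_ch, 1))
        self.conf = branch(n_joints)       # branch 1: key point confidence maps
        self.field = branch(2 * n_limbs)   # branch 2: limb direction fields

    def forward(self, x):
        return self.conf(x), self.field(x)

class KeypointNet(nn.Module):
    def __init__(self, n_joints=18, n_limbs=19, T=3):
        super().__init__()
        vgg = torchvision.models.vgg19(weights="DEFAULT").features
        self.backbone = vgg[:23]           # roughly the first 10 conv layers
        in_ch = 512
        self.stages = nn.ModuleList()
        for _ in range(T):
            self.stages.append(TwoBranchStage(in_ch, n_joints, n_limbs))
            # later stages see the image features plus both previous outputs
            in_ch = 512 + n_joints + 2 * n_limbs

    def forward(self, img):
        F0 = self.backbone(img)
        x, outputs = F0, []
        for stage in self.stages:
            S, L = stage(x)
            outputs.append((S, L))
            x = torch.cat([F0, S, L], dim=1)
        return outputs
```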
Further, in step S3, the constructed hand joint 2D-3D mapping network outputs hand segmentation images; the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max pooling layer + bilinear upsampling.
The loss function of the above hand joint 2D-3D mapping network uses softmax with a cross-entropy loss.
In the present invention, converting the 2D hand detection problem into a segmentation problem eliminates the influence of the size differences between different hands on network accuracy.
It should be noted that, besides the foregoing structure, the hand joint 2D-3D mapping network can also be realized with other existing neural network structures.
Further, in step S4, the step of obtaining the colored depth image specifically includes:
calibrating the depth image and the key point annotation image taken at the same angle of the human body using the chessboard method;
matching the key point annotation image and the depth image of the same angle;
adjusting the size of the matched depth image and performing volume rendering of the point cloud.
The present invention calibrates the depth image and the key point annotation image of the same angle with the chessboard method, so the coordinate information of the key points in the image can be obtained accurately.
Further, in step S5, the preset learning network is a U-shaped reinforcement learning network.
A U-shaped reinforcement learning network learns a mapping from environment states to actions, so that the actions selected by the agent obtain the maximum reward from the environment; that is, the external environment's evaluation of the learning system (or the running performance of the whole system) is, in some sense, optimal.
The structure of the U-shaped reinforcement learning network is as follows: a preset number of convolution operations and a preset number of pooling operations (max-pool downsampling) are applied to the input, with each convolution followed by one ReLU activation layer; this is repeated several times, and the number of convolution filters is multiplied by a corresponding factor after each downsampling.
The result obtained after downsampling then undergoes a preset number of convolution operations with a preset stride and a preset number of deconvolution operations (upsampling), with each convolution followed by a ReLU activation layer; this is repeated several times, and the number of filters is reduced by a corresponding factor at each upsampling. The obtained result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again.
Finally, the corresponding result is output.
It should be noted that the preset learning network may also be a Q-type reinforcement learning network.
The second object of the invention is to provide a three-dimensional human body pose estimation device that can accurately recognize the three-dimensional human pose in a depth image.
The technical solution of the three-dimensional human body pose estimation device of the invention is as follows:
A three-dimensional human body pose estimation device, comprising:
an image acquisition unit, which acquires depth images and RGB color images of the human body from different angles with a monocular camera;
a key point annotation unit, which constructs a human skeleton key point detection neural network based on the RGB color images to obtain key point annotation images;
a hand recognition unit, which constructs a hand joint 2D-3D mapping network based on the corresponding RGB color images and key point annotation images;
a depth image coloring unit, which calibrates the depth image and the key point annotation image taken at the same angle of the human body and then performs three-dimensional point cloud coloring conversion on the corresponding depth image to obtain a colored depth image;
a depth image key point prediction unit, which predicts, based on the key point annotation images and the colored depth images, the positions in the depth image of the annotated human skeleton key points using a preset learning network;
a three-dimensional human body pose estimation unit, which fuses the outputs of the hand recognition unit and the depth image key point prediction unit to achieve refined three-dimensional human body pose estimation.
The monocular camera can be realized with a Kinect camera.
The Kinect is more intelligent than an ordinary camera. First, it emits infrared light to perform stereoscopic localization of the entire room, and its camera can then recognize human motion through the infrared light. In addition, combined with some high-end software on the Xbox 360, it can track 48 positions of the human body in real time.
It should be noted that besides the Kinect, the monocular camera can also be realized with other existing monocular cameras.
Further, the key point annotation unit comprises:
a data set construction subunit, which annotates the human skeleton key points in the RGB color images and constructs a data set;
a neural network training subunit, which divides the constructed data set into a training set and a test set and inputs the training set into a preset human skeleton key point detection neural network for training;
a neural network testing subunit, which tests the trained human skeleton key point detection neural network with the test set until the preset requirement is reached.
In the key point annotation unit, the data set for training the human skeleton key point detection neural network is formed by annotating the skeleton key points in the acquired RGB color images; in this way, a human skeleton key point detection neural network meeting the preset requirement can be obtained quickly and accurately. The preset requirement is that the precision of the skeleton key points output by the network lies within a preset accuracy range.
The human skeleton key point detection neural network can consist of a VGG-19 network followed by T stages (T is a positive integer greater than or equal to 1), where each stage is composed of 2 fully convolutional network branches.
VGG (Visual Geometry Group) is a research group of the Department of Engineering Science of the University of Oxford that has released a series of convolutional network models beginning with "VGG".
It should be noted that the human skeleton key point detection neural network may also be another existing neural network model.
Further, in the hand recognition unit, the constructed hand joint 2D-3D mapping network outputs hand segmentation images; the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max pooling layer + bilinear upsampling.
The loss function of the above hand joint 2D-3D mapping network uses softmax with a cross-entropy loss.
In the present invention, converting the 2D hand detection problem into a segmentation problem eliminates the influence of the size differences between different hands on network accuracy.
It should be noted that, besides the foregoing structure, the hand joint 2D-3D mapping network can also be realized with other existing neural network structures.
Further, the depth image coloring unit comprises:
a calibration subunit, which calibrates the depth image and the key point annotation image taken at the same angle of the human body using the chessboard method;
a matching subunit, which matches the key point annotation image and the depth image of the same angle;
a volume rendering point cloud subunit, which adjusts the size of the matched depth image and performs volume rendering of the point cloud.
The present invention calibrates the depth image and the key point annotation image of the same angle with the chessboard method, so the coordinate information of the key points in the image can be obtained accurately.
Further, in the depth image key point prediction unit, the preset learning network is a U-shaped reinforcement learning network.
A U-shaped reinforcement learning network learns a mapping from environment states to actions, so that the actions selected by the agent obtain the maximum reward from the environment; that is, the external environment's evaluation of the learning system (or the running performance of the whole system) is, in some sense, optimal.
The structure of the U-shaped reinforcement learning network is as follows: a preset number of convolution operations and a preset number of pooling operations (max-pool downsampling) are applied to the input, with each convolution followed by one ReLU activation layer; this is repeated several times, and the number of convolution filters is multiplied by a corresponding factor after each downsampling.
The result obtained after downsampling then undergoes a preset number of convolution operations with a preset stride and a preset number of deconvolution operations (upsampling), with each convolution followed by a ReLU activation layer; this is repeated several times, and the number of filters is reduced by a corresponding factor at each upsampling. The obtained result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again.
Finally, the corresponding result is output.
It should be noted that the preset learning network may also be a Q-type reinforcement learning network.
Compared with the prior art, the beneficial effects of the present invention are:
(1) The present invention acquires depth images and RGB color images of the human body from different angles with a monocular camera, removing the restrictive multi-camera conditions in the field of human pose estimation; the method is easier to realize and can accurately recognize the three-dimensional human pose in the depth image.
(2) After the neural networks are trained, the present invention can recognize the three-dimensional human pose in real time.
(3) The trained neural network models can be stored in miniature terminal devices and conveniently integrated into smart home and intelligent interactive equipment.
Brief description of the drawings
The accompanying drawings, which constitute a part of this application, are provided for further understanding of the application; the illustrative embodiments of the application and their descriptions are used to explain the application and do not constitute an undue limitation of the application.
Fig. 1 is a flowchart of the three-dimensional human body pose estimation method of the invention;
Fig. 2 is a schematic diagram of one embodiment of the three-dimensional human body pose estimation method of the invention;
Fig. 3 is a schematic diagram of one embodiment of the human skeleton key point detection neural network of the invention;
Fig. 4 is a schematic diagram of one embodiment of the hand joint 2D-3D mapping neural network of the invention;
Fig. 5 is a schematic diagram of one embodiment of the U-shaped reinforcement learning neural network of the invention;
Fig. 6 is a structural schematic diagram of the three-dimensional human body pose estimation device of the invention.
Specific embodiment
It is noted that the following detailed description is illustrative and intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the same meanings as commonly understood by a person of ordinary skill in the technical field to which the application belongs.
It should be noted that the terms used herein are merely for describing specific embodiments and are not intended to limit the illustrative embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular is also intended to include the plural; additionally, it should be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.
As shown in Fig. 1, the three-dimensional human body pose estimation method of the invention includes steps S1 to S6.
Specifically, the technical solution of the invention is illustrated below with reference to one embodiment of the three-dimensional human body pose estimation method, as shown in Fig. 2:
The three-dimensional human body pose estimation method of the invention comprises:
S1: acquiring depth images and RGB color images of the human body from different angles with a monocular camera.
In step S1, the monocular camera can be realized with a Kinect camera.
The Kinect is more intelligent than an ordinary camera. First, it emits infrared light to perform stereoscopic localization of the entire room, and its camera can then recognize human motion through the infrared light. In addition, combined with some high-end software on the Xbox 360, it can track 48 positions of the human body in real time.
It should be noted that besides the Kinect, the monocular camera can also be realized with other existing monocular cameras.
S2: constructing a human skeleton key point detection neural network based on the RGB color images to obtain key point annotation images.
The construction of the human skeleton key point detection neural network based on the RGB color images in step S2 specifically includes:
Step S21: annotating the human skeleton key points in the RGB color images and constructing a data set.
Specifically, the steps of constructing the data set are:
Step S211: 12 Kinect depth cameras are used, placed at three different positions in a room, with 4 Kinect depth cameras at each position, forming four different viewing angles at each position; several men and women are photographed in different body postures, and the collected photos are organized into a picture library.
Step S212: a gesture data set is established with multiple depth cameras; this data set collects images of 39 different gesture motions of 20 people. The data set is divided into a training set and a test set, and the illumination intensity and background of the images are then randomly rendered to expand the diversity of the data.
Step S213: the picture libraries obtained in steps S211 and S212 are annotated with skeleton key points, and the key point coordinate information (x, y, d) is used as the label of each image; a shell script is written to dump the images and image labels into lmdb or hdf5 format files, where x and y are the horizontal and vertical coordinates of the key point in the depth image and d is the depth coordinate.
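As a purely illustrative sketch of this dumping step for the hdf5 variant (h5py stands in for the shell script; the file name and dataset keys are hypothetical):

```python
import h5py
import numpy as np

def dump_hdf5(images, keypoints, path="train.h5"):
    """images: (N, H, W, C) uint8 array; keypoints: (N, J, 3) array of (x, y, d)."""
    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=np.asarray(images, dtype=np.uint8),
                         compression="gzip")
        f.create_dataset("label", data=np.asarray(keypoints, dtype=np.float32))

# usage: dump_hdf5(train_images, train_labels, "train.h5")
```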
Step S22: the constructed data set is divided into a training set and a test set, and the training set is input into the preset human skeleton key point detection neural network for training.
Step S23: the trained human skeleton key point detection neural network is tested with the test set until the preset requirement is reached.
In step S2, the data set for training the human skeleton key point detection neural network is formed by annotating the skeleton key points in the acquired RGB color images; in this way, a human skeleton key point detection neural network meeting the preset requirement can be obtained quickly and accurately. The preset requirement is that the precision of the skeleton key points output by the network lies within a preset accuracy range.
As shown in Fig. 3, the human skeleton key point detection neural network can consist of a VGG-19 network followed by T stages (T is a positive integer greater than or equal to 1), where each stage is composed of 2 fully convolutional network branches.
VGG (Visual Geometry Group) is a research group of the Department of Engineering Science of the University of Oxford that has released a series of convolutional network models beginning with "VGG".
Specifically, the processing procedure of the human skeleton key point detection neural network in this example is as follows:
S222: the w×h 2D RGB image obtained by the Kinect is first used as input and passed through the first 10 layers of VGG-19 to obtain a feature map F, which serves as the input of each branch of the first stage of the model.
S223: in the first stage, each branch respectively generates a set of detection confidence maps S¹ = ρ¹(F) and a set of local relation fields L¹ = φ¹(F), where ρ¹(F) and φ¹(F) are the inferences of the two branch convolutional neural networks of stage one.
S224: the specific design of fully convolutional network branch 1 is as follows:
(a) because the present invention can perform 3D pose recognition for several people simultaneously, an independent confidence map S*_{j,k} is first generated for each person in the RGB image;
(b) let x_{j,k} ∈ R² denote the actual position of the j-th body part of the k-th person in the image, where j and k are positive integers greater than 0;
(c) a Gaussian distribution is used to highlight the detected body part key points:
S*_{j,k}(p) = exp(−‖p − x_{j,k}‖₂² / σ²)   (1)
(d) the key point with the maximum Gaussian value is taken in each confidence map:
S*_j(p) = max_k S*_{j,k}(p)   (2)
where p is the pixel coordinate.
S225: fully convolutional network branch 2 is used to detect the position and direction information of the lines connecting key points; its specific design is as follows:
(a) the ground-truth local association field L*_{c,k} is constructed for supervision, where c indexes the line segment connecting the c-th pair of key points on the k-th human body; the construction process is as follows:
(b) let x_{j1,k} and x_{j2,k} be the two key points of the c-th connecting line on the k-th human body in the image;
(c) the local association vector of the body limb on the c-th line is found with the following formula:
L*_{c,k}(p) = v if p lies on limb c of person k, otherwise 0   (3)
where v = (x_{j2,k} − x_{j1,k}) / ‖x_{j2,k} − x_{j1,k}‖₂ is the unit vector along the limb   (4)
(d) linear interpolation between the two key points on line c approximately gives the pixel coordinates of the points p lying on line c of person k:
p_u = (1 − u)·x_{j1} + u·x_{j2}, 0 ≤ u ≤ 1   (5)
(e) formula (5) is used to find the association field of all people whose limbs overlap on line c in the image:
L*_c(p) = (1 / n_c(p)) · Σ_k L*_{c,k}(p)   (6)
where n_c(p) is the number of non-zero vectors at point p.
(f) the predicted local relation field is sampled along line segment c, and L_c is used to measure the confidence of the association for person k:
E = ∫₀¹ L_c(p_u) · (x_{j2} − x_{j1}) / ‖x_{j2} − x_{j1}‖₂ du   (7)
S226: in every stage, each of the two branches consists of 3 convolutional layers of size 3×3 and 2 of size 2×2.
S227: the outputs of the two first-stage branches are concatenated with the original feature map F as the input of the second stage, and this iterates up to stage T.
S228: the two-branch model continually refines the targets of the two branches over the T stages; in order to effectively avoid vanishing gradients, an L2 loss is added at every stage as supervision. The loss function of a branch is defined as follows:
f_S^t = Σ_j Σ_p W(p) · ‖S_j^t(p) − S*_j(p)‖₂²   (9)
where S* is the true confidence map value annotated when building the database, S_j^t(p) denotes the predicted confidence map value, t ∈ [1, 2, …, T] denotes the stage of the branch model, p denotes a position coordinate in the map, and j denotes the j-th key point; W(p) is a binary mask with W(p) = 0 where the key point annotation data is missing and 1 otherwise, which avoids penalizing true predictions at unannotated positions during training.
S229: after stage T, the body part confidence maps and joint relation fields obtained from the two branches are parsed with a greedy algorithm to obtain the 2D key point image of the people. Formula (10) is the overall model formula of the entire key point detection network:
f = Σ_{t=1}^{T} (f_S^t + f_L^t)   (10)
It should be noted that the human skeleton key point detection neural network may also be another existing neural network model.
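For illustration, a small NumPy sketch of the ground-truth targets of formulas (1)-(6) for one image follows (the joint coordinates, image size, σ, and limb half-width are assumed values, not specified by the patent):

```python
import numpy as np

def confidence_maps(joints, h, w, sigma=7.0):
    """joints: (K, J, 2) array of (x, y) per person k and part j.
    Returns (J, h, w) maps; the max over people implements formula (2)."""
    ys, xs = np.mgrid[0:h, 0:w]
    K, J, _ = joints.shape
    maps = np.zeros((J, h, w))
    for j in range(J):
        for k in range(K):
            d2 = (xs - joints[k, j, 0]) ** 2 + (ys - joints[k, j, 1]) ** 2
            maps[j] = np.maximum(maps[j], np.exp(-d2 / sigma ** 2))  # formula (1)
    return maps

def association_field(p1, p2, h, w, limb_width=4.0):
    """Unit direction field L*_{c,k} of one limb from p1 to p2, formulas (3)-(4)."""
    ys, xs = np.mgrid[0:h, 0:w]
    length = np.linalg.norm(p2 - p1) + 1e-8
    v = (p2 - p1) / length                                # formula (4)
    d = np.stack([xs - p1[0], ys - p1[1]], axis=-1)
    along = d @ v                                         # projection along the limb
    across = np.abs(d[..., 0] * v[1] - d[..., 1] * v[0])  # distance across the limb
    field = np.zeros((h, w, 2))
    field[(along >= 0) & (along <= length) & (across <= limb_width)] = v  # formula (3)
    return field
```

Summing these per-person fields and dividing by the count of non-zero vectors at each pixel gives the averaged field of formula (6).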
S3: constructing a hand joint 2D-3D mapping network based on the corresponding RGB color images and key point annotation images.
In step S3, the constructed hand joint 2D-3D mapping network outputs hand segmentation images; the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max pooling layer + bilinear upsampling.
The loss function of the above hand joint 2D-3D mapping network uses softmax with a cross-entropy loss.
In the present invention, converting the 2D hand detection problem into a segmentation problem eliminates the influence of the size differences between different hands on network accuracy.
The detailed process of constructing the hand joint 2D-3D mapping network is as follows, as shown in Fig. 4:
S31: the original 2D RGB image is resized to 256×256×3 as the input of the hand image segmentation network; the network uses the (convolutional layer + ReLU activation layer) + max pooling layer + bilinear upsampling structure, the loss function uses softmax with a cross-entropy loss, and a 256×256×3 hand segmentation image is output.
S32: a neural network with the same structure as in S31 takes the output of S31 as input; this network generates bounding boxes for the 21 joints of the hand, Gaussian noise with mean 0 and variance 10 is added at the bounding box centers, and the network generates 21 joint heat maps of size 32×32×1.
S33: the estimates from the 21 2D joint heat maps to 3D are obtained; the specific method is as follows:
S34: first, a three-dimensional hand joint coordinate set w_i = (x_i, y_i, z_i), i ∈ [1, J], J = 21, is defined.
S35: using the 3D hand database obtained in step S212, one fully convolutional neural network is trained with an L2 loss; the network uses the (convolutional layer + ReLU activation layer) + fully connected layer structure.
S36: using the prior knowledge obtained by the fully convolutional neural network trained in S35, a regularized coordinate scale is established for each key point of the 2D hand image with the formula:
s = ‖w_{k+1} − w_k‖   (12)
where k ∈ [1, 20].
S37: a relative coordinate system is established to eliminate the distortion of joint positions caused by differences in hand size and similar factors. In this example the first joint of the index finger is taken as the root node, i.e., r = 1; furthermore, formula (13) is used to find the positions of the remaining nodes relative to the first index finger joint:
w_i^{rel} = (w_i − w_r) / s   (13)
where w_r is the first index finger joint node.
It should be noted that, besides the foregoing structure, the hand joint 2D-3D mapping network can also be realized with other existing neural network structures.
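As an illustration of the segmentation structure and the normalization of formulas (12)-(13), a minimal PyTorch sketch follows (channel counts, depth, and the number of segmentation classes are assumptions, not the patent's exact configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HandSegNet(nn.Module):
    """(conv + ReLU) blocks with max pooling, then bilinear upsampling to 256x256."""
    def __init__(self, n_classes=2):
        super().__init__()
        layers, ch = [], 3
        for out_ch in (64, 128, 256):
            layers += [nn.Conv2d(ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
            ch = out_ch
        self.encoder = nn.Sequential(*layers)
        self.head = nn.Conv2d(ch, n_classes, 1)

    def forward(self, x):                       # x: (N, 3, 256, 256)
        z = self.head(self.encoder(x))          # (N, n_classes, 32, 32)
        return F.interpolate(z, size=x.shape[2:], mode="bilinear",
                             align_corners=False)

def normalize_joints(w, r=0):
    """w: (21, 3) 3D joints. Formula (12) fixes a reference bone length s;
    formula (13) expresses all joints relative to the root joint r."""
    s = torch.linalg.norm(w[1] - w[0])          # formula (12)
    return (w - w[r]) / s                       # formula (13)

# training uses softmax cross entropy: nn.CrossEntropyLoss()(logits, mask)
```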
S4: calibrating the depth image and the key point annotation image taken at the same angle of the human body, then performing three-dimensional point cloud coloring conversion on the corresponding depth image to obtain a colored depth image.
In step S4, the step of obtaining the colored depth image specifically includes:
calibrating the depth image and the key point annotation image taken at the same angle of the human body using the chessboard method;
matching the key point annotation image and the depth image of the same angle;
adjusting the size of the matched depth image and performing volume rendering of the point cloud.
The present invention calibrates the depth image and the key point annotation image of the same angle with the chessboard method, so the coordinate information of the key points in the image can be obtained accurately.
S41: the RGB camera of the Kinect is calibrated with the chessboard method, and the RGB intrinsic parameters are computed with the Matlab Camera Calibration Toolbox.
S42: the depth camera of the Kinect is calibrated with the chessboard method, and the depth camera intrinsic parameters are computed with the Matlab Camera Calibration Toolbox.
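The patent computes the intrinsics with the Matlab Camera Calibration Toolbox; as an illustrative alternative only, an equivalent chessboard calibration in OpenCV might look as follows (the board geometry and square size are assumed values):

```python
import cv2
import numpy as np

def calibrate(images, board=(9, 6), square=0.025):
    """Chessboard intrinsic calibration; `board` is the inner-corner count
    and `square` the square size in meters."""
    objp = np.zeros((board[0] * board[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square
    obj_pts, img_pts, size = [], [], None
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, board)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
    _, H, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
    return H, dist  # H is the 3x3 intrinsic matrix (H_rgb or H_ir below)
```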
S43: the 2D RGB camera and the 3D depth camera are registered; the specific steps are as follows:
S44: the depth image space coordinate system is established with formula (14):
p_ir = H_ir · P_ir   (14)
where P_ir is the space coordinate of a point in the depth camera coordinate system, p_ir is the projection coordinate of the point on the image plane (x and y in pixels, z the depth value in millimeters), and H_ir is the intrinsic matrix of the depth camera.
S45: the space coordinates for the RGB camera are established with formulas (15) and (16):
P_rgb = R · P_ir + T   (15)
p_rgb = H_rgb · P_rgb   (16)
where P_rgb is the space coordinate of the same point in the RGB camera coordinate system, p_rgb is the projection coordinate of this point on the RGB image plane, H_rgb is the intrinsic matrix of the RGB camera, R is the rotation matrix, and T is the translation vector.
S46: the extrinsic camera matrices are used to transform points from the world coordinate system into the camera coordinate systems; the transformation is given by formula (17):
P_ir = R_ir · P + T_ir,  P_rgb = R_rgb · P + T_rgb   (17)
where the rotation matrix R_ir (R_rgb) and the translation vector T_ir (T_rgb) form the extrinsic matrix of the depth camera (RGB camera).
S47: the registered image is adjusted to a volume rendering point cloud matrix of size 64×64×64.
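A minimal NumPy sketch of the registration of formulas (14)-(16), back-projecting every depth pixel and reprojecting it into the RGB image so the point cloud can be colored (variable names mirror the formulas; the intrinsics and extrinsics are assumed to come from the calibration above):

```python
import numpy as np

def register_depth_to_rgb(depth, H_ir, H_rgb, R, T):
    """depth: (h, w) array in millimeters. Returns the (h*w, 3) points in RGB
    camera coordinates and their (u, v) pixel positions in the RGB image."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([us.ravel(), vs.ravel(), np.ones(h * w)])  # homogeneous pixels
    P_ir = np.linalg.inv(H_ir) @ (pix * depth.ravel())        # invert formula (14)
    P_rgb = R @ P_ir + T.reshape(3, 1)                        # formula (15)
    p_rgb = H_rgb @ P_rgb                                     # formula (16)
    uv = (p_rgb[:2] / p_rgb[2]).T                             # perspective divide
    return P_rgb.T, uv
```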
S5: predicting, based on the key point annotation images and the colored depth images, the positions in the depth image of the annotated human skeleton key points using a preset learning network.
In step S5, the preset learning network is a U-shaped reinforcement learning network.
A U-shaped reinforcement learning network learns a mapping from environment states to actions, so that the actions selected by the agent obtain the maximum reward from the environment; that is, the external environment's evaluation of the learning system (or the running performance of the whole system) is, in some sense, optimal.
The structure of the U-shaped reinforcement learning network is as follows: a preset number of convolution operations and a preset number of pooling operations (max-pool downsampling) are applied to the input, with each convolution followed by one ReLU activation layer; this is repeated several times, and the number of convolution filters is multiplied by a corresponding factor after each downsampling.
The result obtained after downsampling then undergoes a preset number of convolution operations with a preset stride and a preset number of deconvolution operations (upsampling), with each convolution followed by a ReLU activation layer; this is repeated several times, and the number of filters is reduced by a corresponding factor at each upsampling. The obtained result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again.
Finally, the corresponding result is output.
It should be noted that the preset learning network may also be a Q-type reinforcement learning network.
Specifically, the U-shaped reinforcement learning network structure is as shown in Fig. 5:
S52: the inputs obtained from steps S2 and S4 undergo 2 convolution operations of size 3×3×3 and 1 pooling operation of size 2×2×2 (max-pool downsampling), each convolution followed by one ReLU activation layer; this is repeated 4 times, and the number of convolution filters is doubled after each downsampling.
S53: the result obtained after downsampling undergoes 2 convolution operations of size 3×3 and 1 deconvolution operation with stride 2×2 (upsampling), each convolution followed by a ReLU activation layer; this is repeated 4 times, with the number of filters halved at each upsampling; the obtained result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again, at which point the number of convolution filters is halved.
S54: the key point confidence maps in the point cloud are output.
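For illustration, a minimal PyTorch sketch of such a U-shaped network over the 64×64×64 volume follows (input channels, base width, and the number of key point maps are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def block(in_ch, out_ch):
    """Two 3x3x3 convolutions, each followed by a ReLU (steps S52/S53)."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class UNet3D(nn.Module):
    def __init__(self, in_ch=4, n_joints=15, width=16, depth=4):
        super().__init__()
        self.inc = block(in_ch, width)
        self.down, self.up, self.dec = nn.ModuleList(), nn.ModuleList(), nn.ModuleList()
        ch = width
        for _ in range(depth):                       # repeated 4 times, filters doubled
            self.down.append(block(ch, ch * 2))
            ch *= 2
        for _ in range(depth):                       # stride-2 up-convolutions, filters halved
            self.up.append(nn.ConvTranspose3d(ch, ch // 2, 2, stride=2))
            self.dec.append(block(ch, ch // 2))      # after concatenation with the skip
            ch //= 2
        self.out = nn.Conv3d(ch, n_joints, 1)        # key point confidence maps (S54)

    def forward(self, x):
        skips, h = [], self.inc(x)
        for down in self.down:
            skips.append(h)
            h = down(F.max_pool3d(h, 2))             # 2x2x2 max-pool downsampling
        for up, dec, skip in zip(self.up, self.dec, reversed(skips)):
            h = dec(torch.cat([up(h), skip], dim=1)) # concatenate with left-path result
        return self.out(h)
```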
S6: fusing the outputs of steps S3 and S5 to achieve refined three-dimensional human body pose estimation.
The three-dimensional human body pose estimation method of the invention acquires depth images and RGB color images of the human body from different angles with a monocular camera, removing the restrictive multi-camera conditions in the field of human pose estimation; the method is easier to realize and can accurately recognize the three-dimensional human pose in the depth image.
As shown in Fig. 6, the technical solution of the three-dimensional human body pose estimation device of the invention is as follows:
A three-dimensional human body pose estimation device, comprising:
(1) an image acquisition unit, which acquires depth images and RGB color images of the human body from different angles with a monocular camera.
The monocular camera can be realized with a Kinect camera.
The Kinect is more intelligent than an ordinary camera. First, it emits infrared light to perform stereoscopic localization of the entire room, and its camera can then recognize human motion through the infrared light. In addition, combined with some high-end software on the Xbox 360, it can track 48 positions of the human body in real time.
It should be noted that besides the Kinect, the monocular camera can also be realized with other existing monocular cameras.
(2) a key point annotation unit, which constructs a human skeleton key point detection neural network based on the RGB color images to obtain key point annotation images.
The key point annotation unit comprises:
a data set construction subunit, which annotates the human skeleton key points in the RGB color images and constructs a data set;
a neural network training subunit, which divides the constructed data set into a training set and a test set and inputs the training set into a preset human skeleton key point detection neural network for training;
a neural network testing subunit, which tests the trained human skeleton key point detection neural network with the test set until the preset requirement is reached.
In the key point annotation unit, the data set for training the human skeleton key point detection neural network is formed by annotating the skeleton key points in the acquired RGB color images; in this way, a human skeleton key point detection neural network meeting the preset requirement can be obtained quickly and accurately. The preset requirement is that the precision of the skeleton key points output by the network lies within a preset accuracy range.
The human skeleton key point detection neural network can consist of a VGG-19 network followed by T stages (T is a positive integer greater than or equal to 1), where each stage is composed of 2 fully convolutional network branches.
VGG (Visual Geometry Group) is a research group of the Department of Engineering Science of the University of Oxford that has released a series of convolutional network models beginning with "VGG".
It should be noted that the human skeleton key point detection neural network may also be another existing neural network model.
(3) a hand recognition unit, which constructs a hand joint 2D-3D mapping network based on the corresponding RGB color images and key point annotation images.
In the hand recognition unit, the constructed hand joint 2D-3D mapping network outputs hand segmentation images; the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max pooling layer + bilinear upsampling.
The loss function of the above hand joint 2D-3D mapping network uses softmax with a cross-entropy loss.
In the present invention, converting the 2D hand detection problem into a segmentation problem eliminates the influence of the size differences between different hands on network accuracy.
It should be noted that, besides the foregoing structure, the hand joint 2D-3D mapping network can also be realized with other existing neural network structures.
(4) a depth image coloring unit, which calibrates the depth image and the key point annotation image taken at the same angle of the human body and then performs three-dimensional point cloud coloring conversion on the corresponding depth image to obtain a colored depth image.
The depth image coloring unit comprises:
a calibration subunit, which calibrates the depth image and the key point annotation image taken at the same angle of the human body using the chessboard method;
a matching subunit, which matches the key point annotation image and the depth image of the same angle;
a volume rendering point cloud subunit, which adjusts the size of the matched depth image and performs volume rendering of the point cloud.
The present invention calibrates the depth image and the key point annotation image of the same angle with the chessboard method, so the coordinate information of the key points in the image can be obtained accurately.
(5) a depth image key point prediction unit, which predicts, based on the key point annotation images and the colored depth images, the positions in the depth image of the annotated human skeleton key points using a preset learning network.
In the depth image key point prediction unit, the preset learning network is a U-shaped reinforcement learning network.
A U-shaped reinforcement learning network learns a mapping from environment states to actions, so that the actions selected by the agent obtain the maximum reward from the environment; that is, the external environment's evaluation of the learning system (or the running performance of the whole system) is, in some sense, optimal.
The structure of the U-shaped reinforcement learning network is as follows: a preset number of convolution operations and a preset number of pooling operations (max-pool downsampling) are applied to the input, with each convolution followed by one ReLU activation layer; this is repeated several times, and the number of convolution filters is multiplied by a corresponding factor after each downsampling.
The result obtained after downsampling then undergoes a preset number of convolution operations with a preset stride and a preset number of deconvolution operations (upsampling), with each convolution followed by a ReLU activation layer; this is repeated several times, and the number of filters is reduced by a corresponding factor at each upsampling. The obtained result is concatenated with the corresponding convolution result from the contracting (left) path and convolved again.
Finally, the corresponding result is output.
It should be noted that the preset learning network may also be a Q-type reinforcement learning network.
(6) a three-dimensional human body pose estimation unit, which fuses the outputs of the hand recognition unit and the depth image key point prediction unit to achieve refined three-dimensional human body pose estimation.
The three-dimensional human body pose estimation device of the invention acquires depth images and RGB color images of the human body from different angles with a monocular camera, removing the restrictive multi-camera conditions in the field of human pose estimation; the device is easier to realize and can accurately recognize the three-dimensional human pose in the depth image.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, a device, or a computer program product. Therefore, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that realizes the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be completed by instructing the relevant hardware through a computer program; the program can be stored in a computer-readable storage medium, and when executed, the program may include the processes of the embodiments of each of the above methods. The storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
Although the specific embodiments of the present invention are described above with reference to the accompanying drawings, they do not limit the protection scope of the present invention. Those skilled in the art should understand that, on the basis of the technical solutions of the present invention, various modifications or variations that can be made without creative labor still fall within the protection scope of the present invention.
Claims (10)
1. A three-dimensional human body pose estimation method, characterized by comprising:
S1: acquiring depth images and RGB color images of the human body from different angles with a monocular camera;
S2: constructing a human skeleton key point detection neural network based on the RGB color images to obtain key point annotation images;
S3: constructing a hand joint 2D-3D mapping network based on the corresponding RGB color images and key point annotation images;
S4: calibrating the depth image and the key point annotation image taken at the same angle of the human body, then performing three-dimensional point cloud coloring conversion on the corresponding depth image to obtain a colored depth image;
S5: predicting, based on the key point annotation images and the colored depth images, the positions in the depth image of the annotated human skeleton key points using a preset learning network;
S6: fusing the outputs of steps S3 and S5 to achieve refined three-dimensional human body pose estimation.
2. The three-dimensional human body pose estimation method according to claim 1, characterized in that the construction of the human skeleton key point detection neural network based on the RGB color images in step S2 specifically includes:
annotating the human skeleton key points in the RGB color images and constructing a data set;
dividing the constructed data set into a training set and a test set, and inputting the training set into a preset human skeleton key point detection neural network for training;
testing the trained human skeleton key point detection neural network with the test set until the preset requirement is reached.
3. The three-dimensional human body pose estimation method according to claim 1, characterized in that in step S3 the constructed hand joint 2D-3D mapping network outputs hand segmentation images, and the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max pooling layer + bilinear upsampling.
4. The three-dimensional human body pose estimation method according to claim 1, characterized in that in step S4 the step of obtaining the colored depth image specifically includes:
calibrating the depth image and the key point annotation image taken at the same angle of the human body using the chessboard method;
matching the key point annotation image and the depth image of the same angle;
adjusting the size of the matched depth image and performing volume rendering of the point cloud.
5. The three-dimensional human body pose estimation method according to claim 1, characterized in that in step S5 the preset learning network is a U-shaped reinforcement learning network.
6. A three-dimensional human body pose estimation device, characterized by comprising:
an image acquisition unit, which acquires depth images and RGB color images of the human body from different angles with a monocular camera;
a key point annotation unit, which constructs a human skeleton key point detection neural network based on the RGB color images to obtain key point annotation images;
a hand recognition unit, which constructs a hand joint 2D-3D mapping network based on the corresponding RGB color images and key point annotation images;
a depth image coloring unit, which calibrates the depth image and the key point annotation image taken at the same angle of the human body and then performs three-dimensional point cloud coloring conversion on the corresponding depth image to obtain a colored depth image;
a depth image key point prediction unit, which predicts, based on the key point annotation images and the colored depth images, the positions in the depth image of the annotated human skeleton key points using a preset learning network;
a three-dimensional human body pose estimation unit, which fuses the outputs of the hand recognition unit and the depth image key point prediction unit to achieve refined three-dimensional human body pose estimation.
7. The three-dimensional human body pose estimation device according to claim 6, characterized in that the key point annotation unit comprises:
a data set construction subunit, which annotates the human skeleton key points in the RGB color images and constructs a data set;
a neural network training subunit, which divides the constructed data set into a training set and a test set and inputs the training set into a preset human skeleton key point detection neural network for training;
a neural network testing subunit, which tests the trained human skeleton key point detection neural network with the test set until the preset requirement is reached.
8. The three-dimensional human body pose estimation device according to claim 6, characterized in that in the hand recognition unit the constructed hand joint 2D-3D mapping network outputs hand segmentation images, and the structure of the hand joint 2D-3D mapping network is: (convolutional layer + ReLU activation layer) + max pooling layer + bilinear upsampling.
9. The three-dimensional human body pose estimation device according to claim 6, characterized in that the depth image coloring unit comprises:
a calibration subunit, which calibrates the depth image and the key point annotation image taken at the same angle of the human body using the chessboard method;
a matching subunit, which matches the key point annotation image and the depth image of the same angle;
a volume rendering point cloud subunit, which adjusts the size of the matched depth image and performs volume rendering of the point cloud.
10. The three-dimensional human body pose estimation device according to claim 6, characterized in that in the depth image key point prediction unit the preset learning network is a U-shaped reinforcement learning network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810426144.1A CN108830150B (en) | 2018-05-07 | 2018-05-07 | Three-dimensional human body pose estimation method and device
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810426144.1A CN108830150B (en) | 2018-05-07 | 2018-05-07 | Three-dimensional human body pose estimation method and device
Publications (2)
Publication Number | Publication Date |
---|---|
CN108830150A CN108830150A (en) | 2018-11-16 |
CN108830150B true CN108830150B (en) | 2019-05-28 |
Family
ID=64147503
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810426144.1A Active CN108830150B (en) | Three-dimensional human body pose estimation method and device | 2018-05-07 | 2018-05-07 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108830150B (en) |
Families Citing this family (59)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111222379A (en) * | 2018-11-27 | 2020-06-02 | 株式会社日立制作所 | Hand detection method and device |
CN109684943B (en) * | 2018-12-07 | 2021-03-16 | 北京首钢自动化信息技术有限公司 | Athlete auxiliary training data acquisition method and device and electronic equipment |
CN109815813B (en) * | 2018-12-21 | 2021-03-05 | 深圳云天励飞技术有限公司 | Image processing method and related product |
CN109871123B (en) * | 2019-01-21 | 2022-08-16 | 广东精标科技股份有限公司 | Teaching method based on gesture or eye control |
CN109886986B (en) * | 2019-01-23 | 2020-09-08 | 北京航空航天大学 | Dermatoscope image segmentation method based on multi-branch convolutional neural network |
CN109920208A (en) * | 2019-01-31 | 2019-06-21 | 深圳绿米联创科技有限公司 | Tumble prediction technique, device, electronic equipment and system |
CN109934111B (en) * | 2019-02-12 | 2020-11-24 | 清华大学深圳研究生院 | Fitness posture estimation method and system based on key points |
CN109949368B (en) * | 2019-03-14 | 2020-11-06 | 郑州大学 | Human body three-dimensional attitude estimation method based on image retrieval |
CN110032992B (en) * | 2019-04-25 | 2023-05-23 | 沈阳图为科技有限公司 | Examination cheating detection method based on gestures |
CN110175528B (en) * | 2019-04-29 | 2021-10-26 | 北京百度网讯科技有限公司 | Human body tracking method and device, computer equipment and readable medium |
CN111914595B (en) * | 2019-05-09 | 2022-11-15 | 中国科学院软件研究所 | Human hand three-dimensional attitude estimation method and device based on color image |
CN110119148B (en) * | 2019-05-14 | 2022-04-29 | 深圳大学 | Six-degree-of-freedom attitude estimation method and device and computer readable storage medium |
CN110188633B (en) * | 2019-05-14 | 2023-04-07 | 广州虎牙信息科技有限公司 | Human body posture index prediction method and device, electronic equipment and storage medium |
CN110135375B (en) * | 2019-05-20 | 2021-06-01 | 中国科学院宁波材料技术与工程研究所 | Multi-person attitude estimation method based on global information integration |
CN110176016B (en) * | 2019-05-28 | 2021-04-30 | 招远市国有资产经营有限公司 | Virtual fitting method based on human body contour segmentation and skeleton recognition |
CN110197156B (en) * | 2019-05-30 | 2021-08-17 | 清华大学 | Single-image human hand action and shape reconstruction method and device based on deep learning |
CN112102223B (en) * | 2019-06-18 | 2024-05-14 | 通用电气精准医疗有限责任公司 | Method and system for automatically setting scan range |
CN110298916B (en) * | 2019-06-21 | 2022-07-01 | 湖南大学 | Three-dimensional human body reconstruction method based on synthetic depth data |
CN110472476B (en) * | 2019-06-24 | 2024-06-28 | 平安科技(深圳)有限公司 | Gesture matching degree acquisition method, device, computer and storage medium |
CN110472481B (en) * | 2019-07-01 | 2024-01-05 | 华南师范大学 | Sleeping gesture detection method, device and equipment |
CN110495889B (en) * | 2019-07-04 | 2022-05-27 | 平安科技(深圳)有限公司 | Posture evaluation method, electronic device, computer device, and storage medium |
CN110428493B (en) * | 2019-07-12 | 2021-11-02 | 清华大学 | Single-image human body three-dimensional reconstruction method and system based on grid deformation |
CN110348524B (en) * | 2019-07-15 | 2022-03-04 | 深圳市商汤科技有限公司 | Human body key point detection method and device, electronic equipment and storage medium |
CN110427917B (en) * | 2019-08-14 | 2022-03-22 | 北京百度网讯科技有限公司 | Method and device for detecting key points |
CN110555412B (en) * | 2019-09-05 | 2023-05-16 | 深圳龙岗智能视听研究院 | End-to-end human body gesture recognition method based on combination of RGB and point cloud |
CN110728739B (en) * | 2019-09-30 | 2023-04-14 | 杭州师范大学 | Virtual human control and interaction method based on video stream |
CN111079523B (en) * | 2019-11-05 | 2024-05-14 | 北京迈格威科技有限公司 | Object detection method, device, computer equipment and storage medium |
CN111027407B (en) * | 2019-11-19 | 2023-04-07 | 东南大学 | Color image hand posture estimation method for shielding situation |
CN111062326B (en) * | 2019-12-02 | 2023-07-25 | 北京理工大学 | Self-supervision human body 3D gesture estimation network training method based on geometric driving |
CN111028283B (en) * | 2019-12-11 | 2024-01-12 | 北京迈格威科技有限公司 | Image detection method, device, equipment and readable storage medium |
CN113012091A (en) * | 2019-12-20 | 2021-06-22 | 中国科学院沈阳计算技术研究所有限公司 | Impeller quality detection method and device based on multi-dimensional monocular depth estimation |
CN111179419B (en) * | 2019-12-31 | 2023-09-05 | 北京奇艺世纪科技有限公司 | Three-dimensional key point prediction and deep learning model training method, device and equipment |
CN111160375B (en) * | 2019-12-31 | 2024-01-23 | 北京奇艺世纪科技有限公司 | Three-dimensional key point prediction and deep learning model training method, device and equipment |
CN111429499B (en) * | 2020-02-24 | 2023-03-10 | 中山大学 | High-precision three-dimensional reconstruction method for hand skeleton based on single depth camera |
CN113382154A (en) * | 2020-02-25 | 2021-09-10 | 荣耀终端有限公司 | Human body image beautifying method based on depth and electronic equipment |
CN111046858B (en) * | 2020-03-18 | 2020-09-08 | 成都大熊猫繁育研究基地 | Image-based animal species fine classification method, system and medium |
CN113449565A (en) * | 2020-03-27 | 2021-09-28 | 海信集团有限公司 | Three-dimensional attitude estimation method, intelligent device and storage medium |
CN111582204A (en) * | 2020-05-13 | 2020-08-25 | 北京市商汤科技开发有限公司 | Attitude detection method and apparatus, computer device and storage medium |
CN111753669A (en) * | 2020-05-29 | 2020-10-09 | 广州幻境科技有限公司 | Hand data identification method, system and storage medium based on graph convolution network |
CN111753747B (en) * | 2020-06-28 | 2023-11-24 | 高新兴科技集团股份有限公司 | Violent motion detection method based on monocular camera and three-dimensional attitude estimation |
CN111753801A (en) * | 2020-07-02 | 2020-10-09 | 上海万面智能科技有限公司 | Human body posture tracking and animation generation method and device |
CN111968235B (en) * | 2020-07-08 | 2024-04-12 | 杭州易现先进科技有限公司 | Object attitude estimation method, device and system and computer equipment |
CN112076073A (en) * | 2020-07-27 | 2020-12-15 | 深圳瀚维智能医疗科技有限公司 | Automatic massage area dividing method and device, massage robot and storage medium |
CN112069933A (en) * | 2020-08-21 | 2020-12-11 | 董秀园 | Skeletal muscle stress estimation method based on posture recognition and human body biomechanics |
CN111881887A (en) * | 2020-08-21 | 2020-11-03 | 董秀园 | Multi-camera-based motion attitude monitoring and guiding method and device |
CN112107318B (en) * | 2020-09-24 | 2024-02-27 | 自达康(北京)科技有限公司 | Physical activity ability evaluation system |
CN112287866B (en) * | 2020-11-10 | 2024-05-31 | 上海依图网络科技有限公司 | Human body action recognition method and device based on human body key points |
CN112116653B (en) * | 2020-11-23 | 2021-03-30 | 华南理工大学 | Object posture estimation method for multiple RGB pictures |
CN112836594B (en) * | 2021-01-15 | 2023-08-08 | 西北大学 | Three-dimensional hand gesture estimation method based on neural network |
CN112766153B (en) * | 2021-01-19 | 2022-03-11 | 合肥工业大学 | Three-dimensional human body posture estimation method and system based on deep learning |
CN112836824B (en) * | 2021-03-04 | 2023-04-18 | 上海交通大学 | Monocular three-dimensional human body pose unsupervised learning method, system and medium |
CN113112583B (en) * | 2021-03-22 | 2023-06-20 | 成都理工大学 | 3D human body reconstruction method based on infrared thermal imaging |
CN113158910A (en) * | 2021-04-25 | 2021-07-23 | 北京华捷艾米科技有限公司 | Human skeleton recognition method and device, computer equipment and storage medium |
CN113362452B (en) * | 2021-06-07 | 2022-11-15 | 中南大学 | Hand posture three-dimensional reconstruction method and device and storage medium |
CN113609993A (en) * | 2021-08-06 | 2021-11-05 | 烟台艾睿光电科技有限公司 | Attitude estimation method, device and equipment and computer readable storage medium |
CN113762177A (en) * | 2021-09-13 | 2021-12-07 | 成都市谛视科技有限公司 | Real-time human body 3D posture estimation method and device, computer equipment and storage medium |
CN113689503B (en) * | 2021-10-25 | 2022-02-25 | 北京市商汤科技开发有限公司 | Target object posture detection method, device, equipment and storage medium |
CN114821819B (en) * | 2022-06-30 | 2022-09-23 | 南通同兴健身器材有限公司 | Real-time monitoring method for body-building action and artificial intelligence recognition system |
CN116797625B (en) * | 2023-07-20 | 2024-04-19 | 无锡埃姆维工业控制设备有限公司 | Monocular three-dimensional workpiece pose estimation method |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103597515A (en) * | 2011-06-06 | 2014-02-19 | 微软公司 | System for recognizing an open or closed hand |
CN102855470A (en) * | 2012-07-31 | 2013-01-02 | 中国科学院自动化研究所 | Estimation method of human posture based on depth image |
CN102982557A (en) * | 2012-11-06 | 2013-03-20 | 桂林电子科技大学 | Method for processing space hand signal gesture command based on depth camera |
CN104715493A (en) * | 2015-03-23 | 2015-06-17 | 北京工业大学 | Moving body posture estimating method |
CN105069423A (en) * | 2015-07-29 | 2015-11-18 | 北京格灵深瞳信息技术有限公司 | Human body posture detection method and device |
CN106570903A (en) * | 2016-10-13 | 2017-04-19 | 华南理工大学 | Visual identification and positioning method based on RGB-D camera |
CN107066935A (en) * | 2017-01-25 | 2017-08-18 | 网易(杭州)网络有限公司 | Hand gestures method of estimation and device based on deep learning |
Non-Patent Citations (1)
Title |
---|
Human Action Recognition Based on Kinect Skeleton Information; Liu Fei; China Master's Theses Full-text Database, Information Science and Technology; 2014-06-15 (No. 06); I138-955 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11809616B1 (en) | 2022-06-23 | 2023-11-07 | Qing Zhang | Twin pose detection method and system based on interactive indirect inference |
Also Published As
Publication number | Publication date |
---|---|
CN108830150A (en) | 2018-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108830150B (en) | Three-dimensional human body pose estimation method and device | |
CN111126272B (en) | Posture acquisition method, and training method and device of key point coordinate positioning model | |
CN105787439B (en) | Depth image human joint localization method based on convolutional neural networks | |
CN104715493B (en) | Method for estimating the pose of a moving human body | |
CN105069746B (en) | Video real-time face replacement method and its system based on local affine invariant and color transfer technology | |
CN105631861B (en) | Method for recovering 3D human body pose from an unmarked monocular image in combination with a height map | |
CN104978580B (en) | Insulator recognition method for UAV inspection of power transmission lines | |
CN102855470B (en) | Estimation method of human posture based on depth image | |
CN100543775C (en) | Method for 3D human motion tracking based on multi-view cameras | |
CN104794737B (en) | Depth-information-assisted particle filter tracking method | |
CN107767419A (en) | Human skeleton key point detection method and device | |
CN104036488B (en) | Binocular vision-based human body posture and action research method | |
CN107423730A (en) | Human gait behavior active detection and recognition system and method based on semantic folding | |
Nguyen et al. | Static hand gesture recognition using artificial neural network | |
CN106997605A (en) | Method for obtaining a three-dimensional foot shape by collecting foot video and sensor data with a smartphone | |
CN101520902A (en) | System and method for low cost motion capture and demonstration | |
CN106023211A (en) | Robot image positioning method and system based on deep learning | |
CN109087245A (en) | Unmanned aerial vehicle remote sensing image mosaic system based on neighbouring relations model | |
CN104537705A (en) | Augmented reality based mobile platform three-dimensional biomolecule display system and method | |
CN111160294A (en) | Gait recognition method based on graph convolution network | |
CN109000655A (en) | Bionic robot indoor positioning and navigation method | |
CN102289822A (en) | Method for tracking moving target collaboratively by multiple cameras | |
CN114036969A (en) | 3D human body action recognition algorithm under multi-view condition | |
CN117557755B (en) | Virtual scene secondary normal school biochemical body and clothing visualization method and system | |
Liu et al. | Key algorithm for human motion recognition in virtual reality video sequences based on hidden markov model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
2021-04-15 | TR01 | Transfer of patent right | Patentee after: Beijing Micro-Chain Daoi Technology Co.,Ltd., No.1 Qiaoyuan Road, Mentougou District, Beijing 102300; Patentee before: SHANDONG NORMAL University, No. 88 Wenhua East Road, Lixia District, Ji'nan, Shandong 250014 |