CN108108722A - Accurate three-dimensional hand and human pose estimation method based on a single depth image - Google Patents

Info

Publication number
CN108108722A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810046261.5A
Other languages
Chinese (zh)
Inventor
夏春秋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Vision Technology Co Ltd
Original Assignee
Shenzhen Vision Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Vision Technology Co Ltd
Priority to CN201810046261.5A
Publication of CN108108722A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm

Abstract

The invention proposes an accurate three-dimensional hand and human pose estimation method based on a single depth image. Its main components are a network model, refined target localization, the system input, and a voxel-to-voxel prediction network. The process is as follows: the overall architecture of the network is given first; the position of the target is then refined using a method based on a convolutional neural network; the system input is constructed by back-projection; finally, the voxel-to-voxel prediction network is built from four types of building blocks (the volumetric basic block, the volumetric residual block, the volumetric downsampling block, and the volumetric upsampling block) together with an encoder and a decoder. The invention addresses the problems of perspective distortion and nonlinear mapping, yields highly accurate three-dimensional hand and human pose estimates, and takes little time, so that human behavior can be predicted and estimated in real time.

Description

An accurate three-dimensional hand and human pose estimation method based on a single depth image
Technical field
The present invention relates to the field of three-dimensional hand and human pose estimation, and more particularly to an accurate three-dimensional hand and human pose estimation method based on a single depth image.
Background
In human behavior interaction, a computer locates and identifies people, tracks the motion of their limbs, and tracks their facial expressions, so that it can understand human actions and behavior and respond to them. The applications are broad and are concentrated in human-computer interaction, virtual reality, smart homes, intelligent security, intelligent video surveillance, patient monitoring systems, and assisted training for athletes; content-based video retrieval and intelligent image compression also make use of many human behavior interaction methods. For example, detecting and estimating suspicious hand movements or postures of people in public places such as railway stations and airports can help security staff judge whether a person is about to commit theft or some other dangerous act, which effectively reduces the incidence of theft and dangerous incidents. As another example, installing depth cameras in a clinic to monitor patients with serious illnesses, and detecting and estimating the patients' gestures and body poses, can help medical staff judge whether a patient needs help and respond in time. The main task in human-computer behavior interaction is three-dimensional hand and human pose estimation. With the advent of inexpensive depth cameras, three-dimensional hand and human pose estimation from a single depth image has attracted growing attention. Recently, methods based on convolutional neural networks have been applied to this problem and have achieved high accuracy. However, such methods still have limitations, particularly under severe self-occlusion or when the depth image is of poor quality. In addition, conventional three-dimensional hand and human pose estimation methods have two shortcomings. First, the two-dimensional depth image suffers from perspective distortion, which distorts the estimate. Second, the mapping between the depth image and the three-dimensional coordinates is highly nonlinear; this nonlinear mapping hinders learning and prevents the network from accurately estimating the three-dimensional coordinates of the target.
The present invention proposes an accurate three-dimensional hand and human pose estimation method based on a single depth image. The overall architecture of the network is given first; the position of the target is then refined using a method based on a convolutional neural network; the system input is constructed by back-projection; finally, a voxel-to-voxel prediction network is built from four types of building blocks (the volumetric basic block, the volumetric residual block, the volumetric downsampling block, and the volumetric upsampling block) together with an encoder and a decoder. The invention addresses the problems of perspective distortion and nonlinear mapping, yields highly accurate three-dimensional hand and human pose estimates, and takes little time, so that human behavior can be predicted and estimated in real time.
Summary of the invention
To address the problems of perspective distortion and nonlinear mapping, the object of the invention is to provide an accurate three-dimensional hand and human pose estimation method based on a single depth image. The method first gives the overall architecture of the network, then refines the position of the target using a method based on a convolutional neural network, then constructs the system input by back-projection, and finally builds a voxel-to-voxel prediction network from four types of building blocks (the volumetric basic block, the volumetric residual block, the volumetric downsampling block, and the volumetric upsampling block) together with an encoder and a decoder.
To solve the above problems, the present invention provides an accurate three-dimensional hand and human pose estimation method based on a single depth image, whose main components include:
(1) a network model;
(2) refined target localization;
(3) the system input;
(4) a voxel-to-voxel prediction network.
The task of the network model is to estimate the three-dimensional coordinates of all joints. It comprises three main steps. First, the depth points are back-projected into three-dimensional space and the continuous space is discretized, which converts the two-dimensional depth map into a three-dimensional volumetric representation. Second, the three-dimensional voxel data is fed to the voxel-to-voxel prediction network, which estimates, for each joint, the likelihood of every voxel. Third, for each joint, the position of the maximum likelihood value is found and converted to the real-world coordinate it represents; this is the final output of the model.
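The third step can be illustrated with a short NumPy sketch. It assumes the network output is an array of per-joint likelihood volumes, and that the cube origin `cube_min` and the voxel edge length `voxel_size` used during voxelization are known. These names, and the choice of reporting the voxel center, are illustrative assumptions rather than details given in the patent.

```python
import numpy as np

def heatmaps_to_joints(heatmaps, cube_min, voxel_size):
    """Convert per-joint likelihood volumes to 3D coordinates.

    heatmaps   : array of shape (N, D, D, D), one volume per joint
    cube_min   : (3,) corner of the voxelized cube in camera coordinates (assumed known)
    voxel_size : edge length of one voxel, same unit as cube_min (assumed known)
    """
    n_joints = heatmaps.shape[0]
    # Index of the maximum likelihood voxel in each flattened volume.
    flat_idx = heatmaps.reshape(n_joints, -1).argmax(axis=1)
    # Convert flat indices back to (i, j, k) voxel indices, shape (N, 3).
    ijk = np.stack(np.unravel_index(flat_idx, heatmaps.shape[1:]), axis=1)
    # Report the center of the winning voxel in camera coordinates.
    return np.asarray(cube_min, dtype=np.float64) + (ijk + 0.5) * voxel_size
```

For instance, with `cube_min = (-100, -100, 400)` and `voxel_size = 50`, a peak at voxel `(1, 2, 3)` maps to the coordinate `(-25, 25, 575)`. The result is quantized to the voxel grid, so its resolution is bounded by the voxel size.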
Refined target localization presupposes a three-dimensional bounding box that contains the hand or human body in three-dimensional space.
Further, the three-dimensional bounding box is generally placed around a reference point. The reference point can be taken from the annotated (ground-truth) joint positions, or it can be the centroid of the hand region obtained by applying a simple depth threshold.
Further, the annotated joint positions and the centroid have the following limitations:
First, annotated joint positions are not readily available in practical applications;
Second, in complex environments the centroid is subject to error, so there is no guarantee that the target lies precisely inside the resulting three-dimensional bounding box.
Further, these limitations can be overcome by training a simple two-dimensional convolutional neural network to estimate an accurate reference point.
Further, the two-dimensional convolutional neural network is used as follows. A simple depth threshold is applied to the hand region and the centroid is computed as the initial reference point. The network takes a depth image as input and outputs the three-dimensional offset between this computed reference point and the center of the annotated joint positions. The offset is then added to the computed reference point to obtain the refined reference point.
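A minimal sketch of the reference point computation, assuming a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy` and a hypothetical threshold `depth_max`. The trained two-dimensional network is not reproduced here; `predicted_offset` simply stands in for its output.

```python
import numpy as np

def centroid_reference_point(depth, fx, fy, cx, cy, depth_max):
    """Initial reference point: 3D centroid of pixels passing a depth threshold.

    depth is an (H, W) array; pixels with 0 < depth < depth_max are treated
    as the target region (an assumed, simple thresholding rule).
    """
    v, u = np.nonzero((depth > 0) & (depth < depth_max))
    if v.size == 0:
        raise ValueError("no pixel passes the depth threshold")
    z = depth[v, u].astype(np.float64)
    # Pinhole back-projection of each selected pixel.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1).mean(axis=0)

def refine_reference_point(centroid, predicted_offset):
    """Add the 3D offset (placeholder for the network output) to the centroid."""
    return np.asarray(centroid, dtype=np.float64) + np.asarray(predicted_offset, dtype=np.float64)
```

The refinement itself is a single vector addition; the value of the approach lies in how well the network predicts the offset, which this sketch does not model.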
The system input is constructed as follows. First, every pixel of the two-dimensional depth map is back-projected into three-dimensional space. Next, the three-dimensional space is discretized according to a predefined voxel size. Then a three-dimensional bounding box is drawn around the reference point to extract the target. Finally, each voxel that contains a depth point is set to 1, and all other voxels are set to 0.
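The four steps can be sketched in NumPy as below. The pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the cubic box with edge length `cube_size`, and the grid resolution `grid_dim` are assumptions made for illustration; the patent does not state specific values.

```python
import numpy as np

def voxelize_depth(depth, fx, fy, cx, cy, ref_point, cube_size, grid_dim):
    """Build a binary occupancy grid from a depth map.

    depth     : (H, W) depth map, 0 meaning "no measurement" (assumed convention)
    ref_point : (3,) center of the cubic bounding box in camera coordinates
    cube_size : edge length of the box, same unit as depth
    grid_dim  : number of voxels along each edge
    """
    # Step 1: back-project every valid pixel to a 3D point.
    v, u = np.nonzero(depth > 0)
    z = depth[v, u].astype(np.float64)
    points = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)
    # Step 2: discretize space with a fixed voxel size.
    voxel_size = cube_size / grid_dim
    # Step 3: place the box around the reference point and keep points inside it.
    cube_min = np.asarray(ref_point, dtype=np.float64) - cube_size / 2.0
    idx = np.floor((points - cube_min) / voxel_size).astype(np.int64)
    inside = np.all((idx >= 0) & (idx < grid_dim), axis=1)
    # Step 4: occupied voxels are 1, all others 0.
    grid = np.zeros((grid_dim, grid_dim, grid_dim), dtype=np.float32)
    grid[tuple(idx[inside].T)] = 1.0
    return grid
```

Points outside the box, such as distant background, are dropped by the `inside` mask, which is how the bounding box isolates the target.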
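The voxel-to-voxel prediction network comprises three main parts:
First, four types of building blocks are used: the volumetric basic block, the volumetric residual block, the volumetric downsampling block, and the volumetric upsampling block;
Second, the network is assembled: it starts with a volumetric basic block and a volumetric downsampling block, then extracts useful local features with three consecutive volumetric residual blocks, and then enters the encoder and decoder;
Third, three-dimensional heatmaps are constructed to supervise the per-voxel likelihood of each joint, where the mean of the Gaussian peak is placed at the annotated joint position, i.e.:
$$H_n^*(i,j,k) = \exp\left(-\frac{(i-i_n)^2+(j-j_n)^2+(k-k_n)^2}{2\sigma^2}\right) \qquad (1)$$
In addition,
$$L = \sum_{n=1}^{N}\sum_{i,j,k}\left\|H_n^*(i,j,k)-H_n(i,j,k)\right\|^2 \qquad (2)$$
Here $H_n^*$ and $H_n$ are the ground-truth and estimated heatmaps of the $n$-th joint, $(i_n, j_n, k_n)$ is the voxel coordinate of the annotated position of the $n$-th joint, $\sigma$ is the standard deviation of the Gaussian peak, and $N$ is the number of joints. The cost function is the mean squared error shown in equation (2).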
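The supervision target and cost can be written out directly: a Gaussian peak centered on the annotated joint voxel, and a squared error summed over all joints and voxels. The sketch below is a NumPy illustration; the function names and the example value of `sigma` are assumptions, and the loss is a plain sum (with no division by the number of voxels), following the formula as written.

```python
import numpy as np

def gaussian_heatmap(grid_dim, center_ijk, sigma):
    """Ground-truth heatmap: exp(-squared voxel distance / (2 * sigma^2))."""
    axis = np.arange(grid_dim)
    i, j, k = np.meshgrid(axis, axis, axis, indexing="ij")
    ci, cj, ck = center_ijk
    sq_dist = (i - ci) ** 2 + (j - cj) ** 2 + (k - ck) ** 2
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def heatmap_loss(target, predicted):
    """Squared error summed over joints and voxels.

    target, predicted : arrays of shape (N, D, D, D)
    """
    return float(np.sum((target - predicted) ** 2))
```

A typical use stacks one target volume per joint, for example `np.stack([gaussian_heatmap(8, c, 1.7) for c in centers])`, and compares it with the network output of the same shape. The peak value is exactly 1 at the annotated voxel and decays with distance at a rate set by `sigma`.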
Further, in the encoder, the volumetric downsampling block reduces the spatial size of the feature map and the volumetric residual block increases the number of channels. In the decoder, the volumetric upsampling block enlarges the spatial size of the feature map; during upsampling the network reduces the number of channels, which compresses the extracted features.
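The way feature map shapes move through the encoder and decoder can be traced with plain bookkeeping. The sketch below is not a network implementation. It assumes, purely for illustration, that each downsampling block halves the spatial size, each encoder residual block doubles the channel count, and each upsampling block doubles the spatial size while halving the channels; the patent does not give these factors.

```python
def trace_encoder_decoder(spatial, channels, depth=2):
    """Return a list of (stage, spatial size, channel count) tuples.

    spatial  : edge length of the cubic feature map entering the encoder
    channels : channel count entering the encoder
    depth    : number of downsampling / upsampling stages (assumed)
    """
    shapes = [("input", spatial, channels)]
    for level in range(1, depth + 1):
        spatial //= 2  # downsampling block: smaller feature map
        shapes.append((f"encoder down {level}", spatial, channels))
        channels *= 2  # residual block: more channels
        shapes.append((f"encoder residual {level}", spatial, channels))
    for level in range(1, depth + 1):
        spatial *= 2   # upsampling block: larger feature map
        channels //= 2  # fewer channels while upsampling
        shapes.append((f"decoder up {level}", spatial, channels))
    return shapes
```

Under these assumed factors, a 16-voxel, 32-channel input shrinks to 4 voxels with 128 channels at the bottleneck and returns to 16 voxels with 32 channels at the decoder output, so the output volume matches the input grid and can carry one likelihood value per voxel.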
Brief description of the drawings
Fig. 1 is the overall architecture diagram of the voxel-to-voxel prediction network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention.
Fig. 2 shows the combinations of different inputs and outputs of the three-dimensional pose estimation network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention.
Fig. 3 shows the reference point refinement network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention.
Fig. 4 is the encoder structure diagram of the voxel-to-voxel prediction network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention.
Fig. 5 is the decoder structure diagram of the voxel-to-voxel prediction network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention.
Detailed description of the embodiments
It should be noted that, where no conflict arises, the embodiments in this application and the features within them may be combined with one another. The present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is the overall architecture diagram of the voxel-to-voxel prediction network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention. First, the depth points are back-projected into three-dimensional space and the continuous space is discretized, which converts the two-dimensional depth map into a three-dimensional volumetric representation. Then the three-dimensional voxel data is fed to the voxel-to-voxel prediction network, which estimates, for each joint, the likelihood of every voxel. Finally, for each joint, the position of the maximum likelihood value is found and converted to the real-world coordinate it represents; this is the final output of the model.
Fig. 2 shows the combinations of different inputs and outputs of the three-dimensional pose estimation network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention. To address perspective distortion and the nonlinear mapping, the invention provides a voxel-to-voxel prediction network for pose estimation. Unlike previous methods, the voxel-to-voxel prediction network takes a voxelized grid as input and estimates, for each joint, the likelihood of every voxel.
Converting the two-dimensional depth image into a three-dimensional voxel representation as the network input lets the network see the actual shape of the target object without distortion. At the same time, estimating the per-voxel likelihood of each joint makes the desired task easier for the network to learn.
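A small worked example shows why the three-dimensional representation avoids perspective distortion. Under a pinhole camera model (an assumption; the focal length `fx` below is a hypothetical value), the same object appears with different pixel widths at different depths, while back-projection recovers the same metric width in both cases.

```python
def project_width(width_mm, depth_mm, fx):
    """Width in pixels of an object of the given metric width at the given depth."""
    return fx * width_mm / depth_mm

def backproject_width(width_px, depth_mm, fx):
    """Metric width recovered from a pixel width and the measured depth."""
    return width_px * depth_mm / fx

fx = 500.0              # hypothetical focal length in pixels
hand_width_mm = 100.0   # hypothetical object width

near_px = project_width(hand_width_mm, 500.0, fx)    # 100.0 pixels at 0.5 m
far_px = project_width(hand_width_mm, 1000.0, fx)    # 50.0 pixels at 1.0 m

near_mm = backproject_width(near_px, 500.0, fx)      # 100.0 mm
far_mm = backproject_width(far_px, 1000.0, fx)       # 100.0 mm
```

In the image plane the object halves in size when its distance doubles, so a network working on the two-dimensional depth image must cope with that variation; in the voxel grid the object occupies the same number of voxels at either distance.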
Fig. 3 shows the reference point refinement network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention. Localizing the joints presupposes a three-dimensional bounding box that contains the hand or human body in three-dimensional space. The three-dimensional bounding box is generally placed around a reference point. The reference point can be taken from the annotated (ground-truth) joint positions, or it can be the centroid of the hand region obtained by applying a simple depth threshold. However, both choices have limitations:
First, annotated joint positions are not readily available in practical applications;
Second, in complex environments the centroid is subject to error, so there is no guarantee that the target lies precisely inside the resulting three-dimensional bounding box.
Therefore, to overcome these limitations, a simple two-dimensional convolutional neural network can be trained to estimate an accurate reference point. Specifically, a simple depth threshold is applied to the hand region and the centroid is computed as the initial reference point. The network takes a depth image as input and outputs the three-dimensional offset between this computed reference point and the center of the annotated joint positions. The offset is then added to the computed reference point to obtain the refined reference point.
Fig. 4 is the encoder structure diagram of the voxel-to-voxel prediction network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention. The voxel-to-voxel prediction network comprises three main parts:
First, four types of building blocks are used: the volumetric basic block, the volumetric residual block, the volumetric downsampling block, and the volumetric upsampling block;
Second, the network is assembled: it starts with a volumetric basic block and a volumetric downsampling block, then extracts useful local features with three consecutive volumetric residual blocks, and then enters the encoder and decoder;
Third, three-dimensional heatmaps are constructed to supervise the per-voxel likelihood of each joint, where the mean of the Gaussian peak is placed at the annotated joint position, i.e.:
$$H_n^*(i,j,k) = \exp\left(-\frac{(i-i_n)^2+(j-j_n)^2+(k-k_n)^2}{2\sigma^2}\right) \qquad (1)$$
In addition,
$$L = \sum_{n=1}^{N}\sum_{i,j,k}\left\|H_n^*(i,j,k)-H_n(i,j,k)\right\|^2 \qquad (2)$$
The cost function is the mean squared error shown in equation (2).
In the encoder, the volumetric downsampling block reduces the spatial size of the feature map and the volumetric residual block increases the number of channels.
Fig. 5 is the decoder structure diagram of the voxel-to-voxel prediction network of the accurate three-dimensional hand and human pose estimation method based on a single depth image according to the present invention. In the decoder, the volumetric upsampling block enlarges the spatial size of the feature map; during upsampling the network reduces the number of channels, which compresses the extracted features.
For those skilled in the art, the present invention is not limited to the details of the above embodiments, and it can be realized in other specific forms without departing from its spirit and scope. In addition, those skilled in the art can make various modifications and variations to the invention without departing from its spirit and scope, and such improvements and variations shall also be regarded as falling within the scope of protection of the invention. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Claims (10)

1. An accurate three-dimensional hand and human pose estimation method based on a single depth image, characterized in that it mainly comprises: a network model (1); refined target localization (2); the system input (3); and a voxel-to-voxel prediction network (4).
2. The network model (1) according to claim 1, characterized in that the task of the model is to estimate the three-dimensional coordinates of all joints, in three main steps:
First, the depth points are back-projected into three-dimensional space and the continuous space is discretized, which converts the two-dimensional depth map into a three-dimensional volumetric representation;
Second, the three-dimensional voxel data is fed to the voxel-to-voxel prediction network, which estimates, for each joint, the likelihood of every voxel;
Third, for each joint, the position of the maximum likelihood value is found and converted to the real-world coordinate it represents, and this is taken as the final output of the model.
3. The refined target localization (2) according to claim 1, characterized in that it presupposes a three-dimensional bounding box that contains the hand or human body in three-dimensional space.
4. The three-dimensional bounding box according to claim 3, characterized in that it is generally placed around a reference point; the reference point can be taken from the annotated (ground-truth) joint positions, or it can be the centroid of the hand region obtained by applying a simple depth threshold.
5. The annotated joint positions and the centroid according to claim 4, characterized in that they have the following limitations:
First, annotated joint positions are not readily available in practical applications;
Second, in complex environments the centroid is subject to error, so there is no guarantee that the target lies precisely inside the resulting three-dimensional bounding box.
6. The limitations according to claim 5, characterized in that they can be overcome by training a simple two-dimensional convolutional neural network to estimate an accurate reference point.
7. The two-dimensional convolutional neural network according to claim 6, characterized in that a simple depth threshold is applied to the hand region and the centroid is computed as the initial reference point; the network takes a depth image as input and outputs the three-dimensional offset between this computed reference point and the center of the annotated joint positions; the offset is then added to the computed reference point to obtain the refined reference point.
8. The system input (3) according to claim 1, characterized in that, first, every pixel of the two-dimensional depth map is back-projected into three-dimensional space; next, the three-dimensional space is discretized according to a predefined voxel size; then a three-dimensional bounding box is drawn around the reference point to extract the target; finally, each voxel that contains a depth point is set to 1 and all other voxels are set to 0.
9. The voxel-to-voxel prediction network (4) according to claim 1, characterized in that it comprises three main parts:
First, four types of building blocks are used: the volumetric basic block, the volumetric residual block, the volumetric downsampling block, and the volumetric upsampling block;
Second, the network is assembled: it starts with a volumetric basic block and a volumetric downsampling block, then extracts useful local features with three consecutive volumetric residual blocks, and then enters the encoder and decoder;
Third, three-dimensional heatmaps are constructed to supervise the per-voxel likelihood of each joint, where the mean of the Gaussian peak is placed at the annotated joint position, i.e.:
$$H_n^*(i,j,k) = \exp\left(-\frac{(i-i_n)^2+(j-j_n)^2+(k-k_n)^2}{2\sigma^2}\right) \qquad (1)$$
In addition,
$$L = \sum_{n=1}^{N}\sum_{i,j,k}\left\|H_n^*(i,j,k)-H_n(i,j,k)\right\|^2 \qquad (2)$$
The cost function is the mean squared error shown in equation (2).
10. The encoder and decoder according to claim 9, characterized in that, in the encoder, the volumetric downsampling block reduces the spatial size of the feature map and the volumetric residual block increases the number of channels; in the decoder, the volumetric upsampling block enlarges the spatial size of the feature map, and during upsampling the network reduces the number of channels, which compresses the extracted features.
CN201810046261.5A 2018-01-17 2018-01-17 A kind of accurate three-dimensional hand and estimation method of human posture based on single depth image Withdrawn CN108108722A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810046261.5A CN108108722A (en) 2018-01-17 2018-01-17 A kind of accurate three-dimensional hand and estimation method of human posture based on single depth image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810046261.5A CN108108722A (en) 2018-01-17 2018-01-17 A kind of accurate three-dimensional hand and estimation method of human posture based on single depth image

Publications (1)

Publication Number Publication Date
CN108108722A true CN108108722A (en) 2018-06-01

Family

ID=62220174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810046261.5A Withdrawn CN108108722A (en) 2018-01-17 2018-01-17 A kind of accurate three-dimensional hand and estimation method of human posture based on single depth image

Country Status (1)

Country Link
CN (1) CN108108722A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111047548A (en) * 2020-03-12 2020-04-21 腾讯科技(深圳)有限公司 Attitude transformation data processing method and device, computer equipment and storage medium
CN111724414A (en) * 2020-06-23 2020-09-29 宁夏大学 Basketball movement analysis method based on 3D attitude estimation
CN111932678A (en) * 2020-08-13 2020-11-13 北京未澜科技有限公司 Multi-view real-time human motion, gesture, expression and texture reconstruction system
CN112446923A (en) * 2020-11-23 2021-03-05 中国科学技术大学 Human body three-dimensional posture estimation method and device, electronic equipment and storage medium
WO2021129569A1 (en) * 2019-12-25 2021-07-01 神思电子技术股份有限公司 Human action recognition method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8787663B2 (en) * 2010-03-01 2014-07-22 Primesense Ltd. Tracking body parts by combined color image and depth processing
CN105069423A (en) * 2015-07-29 2015-11-18 北京格灵深瞳信息技术有限公司 Human body posture detection method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GYEONGSIK MOON ET AL: ""V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map"", 《HTTPS://ARXIV.ORG/PDF/1711.07399V1》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021129569A1 (en) * 2019-12-25 2021-07-01 神思电子技术股份有限公司 Human action recognition method
CN111047548A (en) * 2020-03-12 2020-04-21 腾讯科技(深圳)有限公司 Attitude transformation data processing method and device, computer equipment and storage medium
CN111047548B (en) * 2020-03-12 2020-07-03 腾讯科技(深圳)有限公司 Attitude transformation data processing method and device, computer equipment and storage medium
CN111724414A (en) * 2020-06-23 2020-09-29 宁夏大学 Basketball movement analysis method based on 3D attitude estimation
CN111724414B (en) * 2020-06-23 2024-01-26 宁夏大学 Basketball motion analysis method based on 3D gesture estimation
CN111932678A (en) * 2020-08-13 2020-11-13 北京未澜科技有限公司 Multi-view real-time human motion, gesture, expression and texture reconstruction system
CN111932678B (en) * 2020-08-13 2021-05-14 北京未澜科技有限公司 Multi-view real-time human motion, gesture, expression and texture reconstruction system
CN112446923A (en) * 2020-11-23 2021-03-05 中国科学技术大学 Human body three-dimensional posture estimation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20180601