CN110135304A - Human body pose recognition method and device - Google Patents
Human body pose recognition method and device
- Publication number
- CN110135304A (application number CN201910363750.8A)
- Authority
- CN
- China
- Prior art keywords
- human body
- human
- pose
- recognition result
- coordinate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
Abstract
Disclosed are a human body pose recognition method and device, comprising: cropping a human body region containing a human body from a video to be recognized; performing pose recognition on the human body in the human body region to obtain a first pose recognition result and human skeleton data corresponding to the human body; recognizing the pose of the human body based on the human skeleton data to obtain a second pose recognition result; and determining the pose of the human body based on the first pose recognition result and the second pose recognition result. Because the present application, in determining the human body pose, considers not only the first pose recognition result obtained from the human body region but also analyzes it in combination with the second pose recognition result obtained from the human skeleton data, with both results jointly determining the pose, human skeleton noise can be reduced without overfitting to the scene and other objects in the video, effectively improving the accuracy of human body pose recognition.
Description
Technical field
The present application relates to the technical field of image recognition, and in particular to a human body pose recognition method and device.
Background art
Human body pose recognition has become a significant research hotspot in fields such as computer vision, pattern recognition, and artificial intelligence, with wide application in human-computer interaction areas including virtual reality, biomechanics, gaming, and healthcare.

The prior art offers the following two methods for recognizing human body poses in video:

In the first recognition method, a depth sensor is used to estimate the human skeleton in the video and thereby determine the pose of the human body. The skeleton obtained in this way contains considerable noise, making the recognized pose inaccurate, and because of the environmental limitations of depth sensors, the method is applicable only indoors.

In the second recognition method, images are randomly cropped from the video and pose recognition is performed on them. This method easily overfits to the scene and objects in the video, again yielding inaccurate poses.

In summary, whether the first or the second recognition method is used, there is the technical problem that human body pose recognition results are inaccurate.
Summary of the invention
To solve the above technical problem, the present application is proposed. Embodiments of the present application provide a human body pose recognition method and device.

According to a first aspect of the present application, a human body pose recognition method is provided, comprising:

cropping a human body region containing a human body from a video to be recognized;

performing pose recognition on the human body in the human body region to obtain a first pose recognition result and human skeleton data corresponding to the human body;

recognizing the pose of the human body based on the human skeleton data to obtain a second pose recognition result; and

determining the pose of the human body based on the first pose recognition result and the second pose recognition result.

According to a second aspect of the present application, a human body pose recognition device is provided, comprising:

a cropping module, configured to crop a human body region containing a human body from a video to be recognized;

a first output module, configured to perform pose recognition on the human body in the human body region to obtain a first pose recognition result and human skeleton data corresponding to the human body;

a second output module, configured to recognize the pose of the human body based on the human skeleton data to obtain a second pose recognition result; and

a determining module, configured to determine the pose of the human body based on the first pose recognition result and the second pose recognition result.

According to a third aspect of the present application, a computer-readable storage medium is provided, the storage medium storing a computer program for executing the human body pose recognition method of the first aspect above.

According to a fourth aspect of the present application, an electronic device is provided, the electronic device comprising:

a processor; and

a memory for storing instructions executable by the processor;

the processor being configured to execute the human body pose recognition method of the first aspect above.
According to the human body pose recognition method and device of the present application, a human body region containing a human body is cropped from a video to be recognized, and pose recognition is performed on the human body in that region to obtain a first pose recognition result and human skeleton data corresponding to the human body. The pose of the human body is then recognized based on the human skeleton data to obtain a second pose recognition result. Finally, the pose of the human body is determined by combining the first pose recognition result and the second pose recognition result. Because the process of determining the pose considers not only the first pose recognition result obtained from the human body region but also, on that basis, the second pose recognition result obtained from the human skeleton data, with both results jointly determining the pose, human skeleton noise can be reduced without overfitting to the scene and other objects in the video, effectively improving the accuracy of human body pose recognition.
Brief description of the drawings
The above and other objects, features, and advantages of the present application will become more apparent from the following detailed description of embodiments of the present application with reference to the accompanying drawings. The drawings are provided for a further understanding of the embodiments of the present application and constitute a part of the specification; together with the embodiments, they serve to explain the present application and do not limit it. In the drawings, the same reference numerals generally denote the same components or steps.
Fig. 1 is a system block diagram corresponding to the human body pose recognition method of the present application.
Fig. 2 is a neural network structure diagram corresponding to the human body pose recognition method of the present application.
Fig. 3 is a schematic flowchart of the human body pose recognition method provided by an embodiment of the present application.
Fig. 4 is a schematic flowchart of the process of performing pose recognition on the human body in the human body region using the I3D model and outputting human skeleton data corresponding to the human body.
Fig. 5 is a schematic flowchart of the process of obtaining human skeleton coordinates corresponding to a human body feature map using a 3D convolutional neural network.
Fig. 6 is a schematic flowchart of the process of determining the coordinates of a human body keypoint in the human body feature map according to a heatmap.
Fig. 7 is a schematic flowchart of the process of obtaining an offset vector.
Fig. 8 is a schematic flowchart of the process of recognizing the pose of the human body based on the human skeleton data using a 2D convolutional neural network to obtain the second pose recognition result.
Fig. 9 is a schematic structural diagram of the human body pose recognition device of an embodiment of the present application.
Fig. 10 is a schematic structural diagram of the second output module 903 of the present application.
Fig. 11 is a schematic structural diagram of the first output module 902 of the present application.
Fig. 12 is a schematic structural diagram of the human skeleton coordinate acquisition submodule 9022 of the present application.
Fig. 13 is a schematic structural diagram of the human body keypoint determination unit 90222 of the present application.
Fig. 14 is a schematic structural diagram of the human body pose recognition device of another embodiment of the present application.
Fig. 15 is a schematic structural diagram of the pose recognition submodule 9031 of the present application.
Fig. 16 is a schematic structural diagram of the human body pose recognition device of yet another embodiment of the present application.
Fig. 17 is a schematic structural diagram of the determining module 904 of the present application.
Fig. 18 is a structural diagram of the electronic device provided by an exemplary embodiment of the present application.
In the figures: 11 is the human detection box, 12 is the I3D model, 13 is a deconvolution layer, 14 is the 2D convolutional neural network, 21 is the video image, 22 is the first pose recognition result, 23 is a human body feature map, 24 is the first intermediate data, 25 is the second intermediate data, 26 is the third intermediate data, 27 is the fourth intermediate data, 28 is the fifth intermediate data, and 29 is the second pose recognition result.
Specific embodiment
Hereinafter, exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present application rather than all of them, and it should be understood that the present application is not limited by the exemplary embodiments described herein.
Application overview

As described above, the prior art has the technical problem that recognition results are inaccurate when recognizing human body poses in video.

In view of this technical problem, the human body pose recognition method and device provided by the present application first crop a human body region containing a human body from a video to be recognized. Next, pose recognition is performed on the human body in the human body region to obtain a first pose recognition result and human skeleton data corresponding to the human body. The pose of the human body is then recognized based on the human skeleton data to obtain a second pose recognition result. Finally, the pose of the human body is determined based on the first pose recognition result and the second pose recognition result.

In this way, because the process of determining the pose considers not only the first pose recognition result obtained from the human body region but also, on that basis, the second pose recognition result obtained from the human skeleton data, with both results jointly determining the pose, human skeleton noise can be reduced without overfitting to the scene and other objects in the video, effectively improving the accuracy of human body pose recognition.

Having described the basic principle of the present application, various non-limiting embodiments of the present application will now be introduced in detail with reference to the accompanying drawings.
Exemplary system
Fig. 1 shows the system block diagram of the human body pose recognition method of the present application. According to this block diagram: first, the video to be recognized is cropped to obtain the human body region, which is the region inside the human detection box 11. Then pose recognition is performed on the human body using the I3D model 12; this recognition specifically comprises RGB-based action recognition, human body pose estimation, and skeleton-based (pose-based) action recognition. The pose estimation is performed using the deconvolution layers 13, and the skeleton-based classification is performed using the 2D convolutional neural network 14. After the first pose recognition result and the second pose recognition result are obtained through the above recognition process, the two results are fused to finally determine the pose of the human body.
Fig. 2 shows the structure of the neural network corresponding to each pose recognition stage in the human body pose recognition method of the present application. According to this structure diagram: after the human body region is cropped out, suppose the data structure of the video image 21 contained in the human body region is 8 × 7 × 7 × 2048, where 8 is the time dimension, 7 the height dimension, 7 the width dimension, and 2048 the number of data features. In processing the video image 21, first, the video image 21 is input to the I3D model 12 for pose recognition, yielding the first pose recognition result 22 and the human skeleton data; the first pose recognition result 22 has 2048 data features. The process of obtaining the human skeleton data comprises the following steps. First, 8 human body feature maps 23 are split out, each with data structure 7 × 7 × 2048 (height 7, width 7, 2048 data features). Then each human body feature map 23 is upsampled by two deconvolutions. The first deconvolution yields the first intermediate data 24 with data structure 14 × 14 × 256 (height 14, width 14, 256 data features). The second deconvolution yields the second intermediate data 25 with data structure 28 × 28 × 256 (height 28, width 28, 256 data features). Next, the third intermediate data 26, containing the heatmap and offset data corresponding to each human body keypoint, is obtained from the second intermediate data; its data structure is 28 × 28 × (3 × 17), where the height is 28, the width is 28, the heatmap and offset data together occupy 3 channels, and the number of human body keypoints is 17. The human skeleton data can then be obtained by analyzing the third intermediate data 26. The human skeleton data are next converted into a human skeleton tensor with a target confidence added, giving the fourth intermediate data 27 with data structure 8 × 17 × 3 (time dimension 8, 17 keypoints, and 3 channels in total for the x and y coordinates plus the target confidence). The fourth intermediate data 27 is then converted into the fifth intermediate data 28, which the 2D convolutional neural network 14 can process, with data structure 8 × 17 × 512 (time dimension 8, 17 keypoints, 512 data features). Finally, the fifth intermediate data 28 is input to the 2D convolutional neural network 14 for pose recognition, yielding the second pose recognition result 29.
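The doubling of spatial resolution at each deconvolution step (7 → 14 → 28 in Fig. 2) is consistent with the 4 × 4 kernel and stride 2 described later for the deconvolution layers 13, assuming a padding of 1 (the padding is not stated in the text and is an assumption here). A minimal sketch of the shape bookkeeping:

```python
def deconv_out(size: int, kernel: int = 4, stride: int = 2, padding: int = 1) -> int:
    """Output spatial size of a transposed convolution (standard formula)."""
    return (size - 1) * stride - 2 * padding + kernel

# Feature map 23 is 7x7; two deconvolutions upsample it to the 28x28 heatmap grid.
h1 = deconv_out(7)    # first intermediate data 24: 14
h2 = deconv_out(h1)   # second intermediate data 25: 28
print(h1, h2)  # 14 28
```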
Exemplary method
Fig. 3 is a schematic flowchart of the human body pose recognition method provided by an exemplary embodiment of the present application. This embodiment can be applied to an electronic device, which is a terminal device with visual analysis and processing capability, for example a mobile phone, a tablet computer, or a personal computer. As shown in Fig. 3, the method includes the following steps:

Step 301: crop a human body region containing a human body from a video to be recognized.

Step 302: perform pose recognition on the human body in the human body region to obtain a first pose recognition result 22 and human skeleton data corresponding to the human body.

Step 303: recognize the pose of the human body based on the human skeleton data to obtain a second pose recognition result 29.

Step 304: determine the pose of the human body based on the first pose recognition result 22 and the second pose recognition result 29.
According to the human body pose recognition method and device of the present application, a human body region containing a human body is cropped from the video to be recognized, and pose recognition is performed on the human body in that region to obtain the first pose recognition result 22 and human skeleton data corresponding to the human body. The pose of the human body is then recognized based on the human skeleton data to obtain the second pose recognition result 29. Finally, the pose of the human body is determined by combining the first pose recognition result 22 and the second pose recognition result 29. Because the process of determining the pose considers not only the first pose recognition result 22 obtained from the human body region but also, on that basis, the second pose recognition result 29 obtained from the human skeleton data, with both results jointly determining the pose, human skeleton noise can be reduced without overfitting to the scene and other objects in the video, effectively improving the accuracy of human body pose recognition.
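The text does not specify how the two recognition results are fused in step 304; a common and minimal choice, shown here purely as an illustrative assumption, is to average the two per-class score vectors and take the class with the highest fused score:

```python
def fuse_poses(scores_rgb, scores_skeleton, w: float = 0.5):
    """Fuse two per-class score vectors by weighted averaging (illustrative only;
    the fusion rule is left unspecified in the text)."""
    fused = [w * a + (1.0 - w) * b for a, b in zip(scores_rgb, scores_skeleton)]
    # The determined pose is the class index with the highest fused score.
    return max(range(len(fused)), key=fused.__getitem__)

# Example: result 22 favors class 1, result 29 is ambiguous; fusion picks class 1.
print(fuse_poses([0.1, 0.7, 0.2], [0.2, 0.4, 0.4]))  # 1
```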
For step 301, in the present application, human detection is performed on the video to be recognized using a human body detector. The detector marks the human region with the human detection box 11; see Fig. 1. The human detection box 11 is the minimum bounding rectangle containing the human body in the image. The region inside the human detection box 11 is the human body region containing the human body that is cropped from the video to be recognized. How the human body detector crops the human body region from the video belongs to the prior art, and the present application does not limit it.
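As a minimal sketch of the cropping itself (the detector is outside the scope of the text, and the (x, y, width, height) box format is an assumption for illustration), cropping the region inside a detection box amounts to slicing the frame:

```python
def crop_region(frame, box):
    """Crop the region inside a detection box from a frame.
    `frame` is a row-major grid (list of rows); `box` is (x, y, w, h)."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]

# A 4x4 "frame" of pixel labels; crop the 2x2 box at (1, 1).
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
print(crop_region(frame, (1, 1, 2, 2)))  # [[5, 6], [9, 10]]
```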
For step 302, the present application is implemented using the I3D (Inflated 3D ConvNet) model 12. The I3D model 12 is obtained by training on a human body video set. In the present application, the human body region obtained in step 301 is used as the input of the I3D model 12, which outputs the first pose recognition result 22 and the human skeleton data. For each frame image in the video to be recognized, one first pose recognition result 22 and one set of human skeleton data are obtained. The RGB-based classification in Fig. 1 corresponds to the process of obtaining the first pose recognition result 22, and the human body pose estimation in Fig. 1 corresponds to the process of obtaining the human skeleton data.

How the I3D model 12 performs pose recognition on the human body in the human body region and outputs the first pose recognition result 22 is described in detail below.

The I3D model 12 of the present application is obtained by inflating the 2D convolution kernels of ResNet-50 into 3D convolution kernels; the resulting network containing 3D convolution kernels is also referred to as a 3D convolutional neural network. A 2D convolution kernel is height (H) × width (W) and contains only the height and width dimensions. Referring to Fig. 2, a 3D convolution kernel is time (T) × H × W, containing the time dimension in addition to the height and width dimensions. Compared with a 2D kernel, a 3D convolution kernel can capture not only human body pose features but also temporal features, thereby establishing the relationship between pose features and temporal features, finally recognizing the pose of the human body at each moment of the video and improving the accuracy of the pose recognition result.

Further, the I3D model 12 of the present application is composed of two deconvolution layers 13 and two 1 × 1 convolutional layers, the last convolutional layer being followed by a global pooling layer and a fully connected layer as output. Each of the two deconvolution layers 13 consists of 256 filters with a 4 × 4 convolution kernel and a stride of 2, followed by a normalization layer and a ReLU activation function. The two 1 × 1 convolutional layers are placed in parallel.

Specifically, suppose the human body region data set obtained in step 301 contains N frame images and n classes, and y_i ∈ {1, …, n} is the label of the i-th sample. The first pose recognition result 22 is then obtained according to the following formula (1):

where Y_rgb(y_i) is the y_i-th dimension of the I3D action-class prediction vector Y_rgb, and L_r is the first pose recognition result 22.
Referring to Fig. 2, after the human body region is cropped out, suppose the data structure of the video image 21 contained in the human body region is 8 × 7 × 7 × 2048 (time dimension 8, height 7, width 7, 2048 data features). The video image 21 is then input to the I3D model 12 for pose recognition, yielding the first pose recognition result 22, whose number of data features is 2048.
How the I3D model 12 performs pose recognition on the human body in the human body region and outputs the human skeleton data corresponding to the human body is shown in Fig. 4 and, on the basis of the embodiment of Fig. 3, includes the following steps:

Step 401: split the human body region into multiple human body feature maps 23.

Step 402: obtain the human skeleton coordinates corresponding to each human body feature map 23 using a 3D convolutional neural network.

Step 403: take the human skeleton coordinates corresponding to each human body feature map 23 as the human skeleton data.

In the present application, after the split yields multiple human body feature maps 23, the human skeleton coordinates corresponding to each feature map are obtained using the 3D convolutional neural network and taken as the human skeleton data for subsequent pose recognition. Because the 3D convolution kernels of the 3D convolutional neural network can capture temporal features in addition to human body pose features, the accuracy of the pose recognition result can be improved.
Specifically, in step 401, the human body region of the video to be recognized is split into several human body feature maps 23, where each frame image in the video corresponds to one human body feature map 23, namely the image contained in the human body region of that frame. In step 402, after the multiple human body feature maps 23 are obtained, the human skeleton coordinates corresponding to each feature map are obtained through the processing of the 3D convolutional neural network. Finally, in step 403, the human skeleton coordinates corresponding to each human body feature map 23 are taken as the human skeleton data. Here, the human skeleton consists of several human body keypoints, also known as skeleton keypoints, which reflect human bone information. In general, human body keypoints include positions such as the head, neck, shoulders, elbows, hands, hips, knees, and feet. The human skeleton coordinates are the set of the coordinates of the several human body keypoints that the skeleton contains. For example, if the preset keypoints include the head, neck, and shoulder, the human skeleton coordinates contain the coordinate of the head, the coordinate of the neck, and the coordinate of the shoulder.
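As a minimal sketch of the data structure just described (the keypoint names and coordinate values here are illustrative, not from the text), the skeleton coordinates are simply a keypoint-to-coordinate mapping, one such set per feature map:

```python
# Illustrative skeleton coordinates for a 3-keypoint preset (head, neck, shoulder).
skeleton_coords = {
    "head": (12.0, 3.5),
    "neck": (12.0, 6.0),
    "shoulder": (9.5, 7.0),
}

# The human skeleton data for a video is one such set per feature map (per frame).
skeleton_data = [skeleton_coords]  # one frame shown
print(len(skeleton_data[0]))  # 3 keypoints
```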
For step 402, how the 3D convolutional neural network obtains the human skeleton coordinates corresponding to one human body feature map 23 is shown in Fig. 5 and, on the basis of the embodiment of Fig. 4, includes the following steps:

Step 501: obtain a heatmap corresponding to each human body keypoint in the human body feature map 23 using the 3D convolutional neural network.

Step 502: determine the coordinates of each human body keypoint in the human body feature map 23 according to each heatmap.

Step 503: obtain the human skeleton coordinates corresponding to the human body feature map 23 according to the coordinates of each keypoint in the feature map.

In the present application, a heatmap corresponding to each human body keypoint is obtained, the heatmaps are analyzed to determine the coordinates of the keypoints in the human body feature map 23, and the human skeleton coordinates are then obtained from those coordinates. Because a heatmap intuitively and accurately reflects the features of a human body keypoint, the human skeleton coordinates determined using heatmaps are more accurate.
Specifically, in the present application, the heatmap corresponding to each human body keypoint in the human body feature map 23 is obtained using the 3D convolutional neural network. Each keypoint corresponds to one heatmap, each heatmap indicates the likelihood of the position of the corresponding keypoint, and one heatmap is one channel. For example, if a human body feature map 23 contains P human body keypoints, then P heatmaps will be obtained, and there are P channels. Referring to Fig. 2, in the process of obtaining the heatmaps, first, 8 human body feature maps 23 are split out, each with data structure 7 × 7 × 2048 (height 7, width 7, 2048 data features). Then each human body feature map 23 is upsampled by two deconvolutions. The first deconvolution yields the first intermediate data 24 with data structure 14 × 14 × 256 (height 14, width 14, 256 data features). The second deconvolution yields the second intermediate data 25 with data structure 28 × 28 × 256 (height 28, width 28, 256 data features). Next, the third intermediate data 26, containing the heatmap and offset data corresponding to each human body keypoint, is obtained from the second intermediate data; its data structure is 28 × 28 × (3 × 17), where the height is 28, the width is 28, the heatmap and offset data together occupy 3 channels, and the number of human body keypoints is 17. Finally, the heatmap corresponding to each human body keypoint is extracted from the third intermediate data 26.
After the heatmaps are obtained, for step 502, how the coordinates of a human body keypoint in the human body feature map 23 are determined according to a heatmap is shown in Fig. 6 and, on the basis of the embodiment of Fig. 5, may include the following steps:

Step 601: using a trained pixel target corresponding to the heatmap, obtain the probability that each pixel in the heatmap belongs to the human body keypoint.

Step 602: determine the coordinate of the pixel with the maximum probability as the first coordinate of the human body keypoint.

Step 603: superimpose an offset vector on the first coordinate to obtain the second coordinate of the human body keypoint.

Step 604: amplify the second coordinate based on the proportional relationship between the heatmap and the human body feature map 23 to obtain the coordinates of the human body keypoint in the human body feature map 23.

In the present application, a pixel target is obtained by training in advance; the probability that each pixel in the heatmap belongs to the human body keypoint is determined according to this trained pixel target, and the coordinate of the pixel with the highest probability is taken as the first coordinate of the keypoint. Taking the coordinate offset into account, an offset vector is superimposed on the first coordinate to obtain the second coordinate, which is taken as the coordinate of the keypoint. This improves the accuracy of the keypoint coordinates and thereby indirectly improves the accuracy of the final pose recognition.
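Steps 603 and 604 above — superimposing the offset vector and amplifying by the heatmap-to-feature-map proportional ratio — can be sketched as follows; the offset values and the scale factor used here are illustrative assumptions, not values from the text:

```python
def refine_keypoint(first_coord, offset, scale):
    """Step 603: add the offset vector to the first coordinate to get the second
    coordinate; step 604: amplify the second coordinate by the proportional ratio."""
    x, y = first_coord
    dx, dy = offset
    second = (x + dx, y + dy)
    return (second[0] * scale, second[1] * scale)

# First coordinate (3, 1) on a 28x28 heatmap, offset (0.25, -0.5),
# mapped into a (hypothetical) 224x224 crop => scale 8.
print(refine_keypoint((3, 1), (0.25, -0.5), 8))  # (26.0, 4.0)
```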
In a specific implementation, suppose the two-dimensional coordinate of each pixel i on the heatmap is x_i, with i ∈ {1, …, Q}, where i is the pixel index and Q is the total number of pixels on the heatmap, and the coordinate of the k-th human body keypoint on the heatmap is l_k. Let M be a preset confidence radius corresponding to the confidence region of the k-th keypoint. When ||x_i − l_k|| ≤ M, the pixel x_i lies inside the circular region of radius M centered on the k-th keypoint, that is, inside the confidence region of that keypoint, and the expected probability output of the neural network for pixel x_i is h_k(x_i) = 1. Here h_k(x_i) reflects the likelihood that the k-th keypoint is at pixel x_i; the larger the value of h_k(x_i), the higher the likelihood.

Further, in the present application, a pixel target is trained in advance for each heatmap; the trained pixel target is a binary (0/1) map h̄_k, with h̄_k(x_i) = 1 when ||x_i − l_k|| ≤ M, and h̄_k(x_i) = 0 otherwise. By specifying h̄_k(x_i), the assignment of pixel x_i to the keypoint is solved as a binary classification problem: minimizing the absolute difference between h_k(x_i) and h̄_k(x_i) supervises the network output h_k(x_i) toward h̄_k(x_i) during training, so that the network learns the likelihood that the k-th keypoint is at pixel x_i, namely the probability that the pixel belongs to the keypoint. The pixel with the maximum probability value is then determined as the human body keypoint, and the keypoint coordinate is determined from the coordinate of that pixel. Specifically, an argmax operation may be used to obtain the coordinate of the pixel with the maximum probability value, according to the following formula (2):

argmax_{x_i, i ∈ {1, …, Q}} h_k(x_i)    (2)
The coordinate obtained by formula (2) above is the first coordinate. After the first coordinate is obtained, there are two embodiments for determining the keypoint coordinate. In the first embodiment, the first coordinate is used directly as the keypoint coordinate, but this often carries a large error. The application therefore provides a second embodiment, in which a bias vector is added to the first coordinate to obtain the second coordinate, and the second coordinate is used as the keypoint coordinate. Since the second embodiment accounts for the bias of the coordinate, it improves the accuracy of the keypoint coordinates and thereby indirectly improves the accuracy of the final pose recognition.
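The two-step decoding (argmax for the first coordinate, then a superimposed bias vector for the second) can be sketched as follows. This is a minimal NumPy sketch with hypothetical names, assuming the offset tensor stores the x/y bias channels per pixel.

```python
import numpy as np

def decode_keypoint(heatmap, offset):
    """First coordinate: the argmax pixel of the heatmap (formula (2)).
    Second coordinate: first coordinate plus the bias vector stored in
    the offset channels at that pixel."""
    h, w = heatmap.shape
    y, x = divmod(int(np.argmax(heatmap)), w)   # first coordinate
    dx, dy = offset[:, y, x]                    # bias vector at that pixel
    return np.array([x + dx, y + dy])           # second coordinate

heatmap = np.zeros((28, 28)); heatmap[12, 10] = 0.9
offset = np.zeros((2, 28, 28)); offset[:, 12, 10] = (0.3, -0.2)
coord = decode_keypoint(heatmap, offset)        # second coordinate
```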
Further, the bias vector is obtained as shown in Fig. 7, which comprises the following steps:
Step 701: obtain offset data corresponding to the human body keypoint using the 3D convolutional neural network.
Step 702: normalize the offset data based on the difference between the predicted keypoint coordinate and the pixel coordinates on the heatmap, obtaining the bias vector.
In a specific implementation, while the 3D convolutional neural network is used to obtain the heatmap for each human body keypoint in human feature map 23, it can also be used to obtain the offset data (Offset) for each keypoint. Each keypoint has its own offset data, which indicates the bias of each pixel position relative to that keypoint. Each offset datum has two channels, corresponding to the x and y directions of the coordinate system; with P keypoints the offset data therefore has 2P channels. In the case of Fig. 2, the offset data is acquired simultaneously with the heatmap, so the process mirrors the heatmap acquisition: first, eight human feature maps 23 are split out, each with data structure 7 × 7 × 2048 (height 7, width 7, 2048 feature channels). Each feature map then goes through two deconvolutions to raise its dimensions. The first deconvolution yields first intermediate data 24 with structure 14 × 14 × 256 (height 14, width 14, 256 feature channels); the second yields second intermediate data 25 with structure 28 × 28 × 256 (height 28, width 28, 256 feature channels). From the second intermediate data, third intermediate data 26 is obtained, containing the heatmap and offset data for each keypoint, with structure 28 × 28 × (3 × 17): height 28, width 28, three channels in total for the heatmap and offset data, and 17 keypoints. Finally, the offset data corresponding to each keypoint is extracted from third intermediate data 26.
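The shape bookkeeping of this pipeline can be checked with a toy sketch. Nearest-neighbour upsampling and channel slicing stand in for the learned deconvolutions here, so only the tensor shapes (not the values) match the text; all names are illustrative, not from the patent.

```python
import numpy as np

feat = np.zeros((7, 7, 2048))                 # one human feature map 23

def upsample2x(x, out_channels):
    """Stand-in for a learned deconvolution: double H and W, then
    project to the target channel count (here by slicing)."""
    x = x.repeat(2, axis=0).repeat(2, axis=1)
    return x[..., :out_channels]

mid1 = upsample2x(feat, 256)                  # first intermediate data 24: 14x14x256
mid2 = upsample2x(mid1, 256)                  # second intermediate data 25: 28x28x256
head = np.zeros((28, 28, 3 * 17))             # third intermediate data 26: heatmap + x/y offsets, 17 keypoints
```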
After the offset data is obtained, it is normalized based on the difference between the predicted keypoint coordinate and the pixel coordinates on the heatmap; that is, for each pixel position a 2D offset vector F_k(x_i) = l_k − x_i relative to the keypoint is predicted. During training, a regression problem is solved for each pixel position and keypoint by minimizing the absolute difference between F_k(x_i) and l_k − x_i, and the solution yields the bias vector F_k(x_i). In step 603, the bias vector F_k(x_i) is then superimposed on the first coordinate to obtain the second coordinate.
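The regression target for the offset channels, F_k(x_i) = l_k − x_i inside the trusted region, can be written down directly. This is a NumPy sketch with assumed names and shapes, not the patented implementation.

```python
import numpy as np

def offset_target(heatmap_shape, l_k, M):
    """Per-pixel regression target: the 2D vector F_k(x_i) = l_k - x_i
    for pixels within radius M of the keypoint, zero elsewhere (the
    bias loss is only computed inside that circle)."""
    h, w = heatmap_shape
    ys, xs = np.mgrid[0:h, 0:w]
    fx, fy = l_k[0] - xs, l_k[1] - ys
    mask = (fx ** 2 + fy ** 2) <= M ** 2
    return np.stack([fx * mask, fy * mask]).astype(np.float32)

F = offset_target((28, 28), l_k=(10, 12), M=2.0)
```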
Further, the heatmap is an image obtained by compressing the resolution of human feature map 23: typically the spatial resolution of human feature map 23 is 224 × 224, while that of the heatmap and offset data is 28 × 28. Therefore, in step 604, the second coordinate is amplified according to the proportional relationship between the heatmap and human feature map 23, obtaining the coordinate of the keypoint in human feature map 23; this coordinate accurately reflects the position of the keypoint in human feature map 23.
After the coordinates of the keypoints in human feature map 23 are obtained by the above process, step 503 is executed. In step 503, the human skeleton coordinates corresponding to human feature map 23 consist of the coordinate of each keypoint in human feature map 23.
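Step 604 then reduces to a single scaling. The sketch below assumes the 28 → 224 resolutions from the text; the function name is illustrative.

```python
import numpy as np

def to_feature_map(coord, heatmap_size=28, feature_size=224):
    """Amplify the second coordinate by the resolution ratio between
    the heatmap (28x28) and human feature map 23 (224x224), i.e. x8."""
    return np.asarray(coord, dtype=float) * (feature_size / heatmap_size)

pt = to_feature_map([10.3, 11.8])   # coordinate in the 224x224 feature map
```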
After the human skeleton data is obtained, the application executes step 303, which corresponds to the skeleton-based classification in Fig. 1. In step 303, the application uses 2D convolutional neural network 14 to recognize the pose of the human body from the skeleton data, obtaining second pose recognition result 29; since the pose is captured from the skeleton data, the human pose can be recognized accurately. As shown in Fig. 8, on the basis of the embodiment in Fig. 3, step 303 may include the following steps:
Step 801: convert the human skeleton data into a human skeleton tensor.
Step 802: add an objective confidence to the skeleton tensor, where the objective confidence is obtained by max pooling each heatmap.
Step 803: input the skeleton tensor with the objective confidence into 2D convolutional neural network 14, obtaining second pose recognition result 29.
By converting the skeleton data into a skeleton tensor and combining it with the objective confidence obtained by max pooling the heatmaps, the application feeds the confidence-augmented skeleton tensor into 2D convolutional neural network 14 to obtain second pose recognition result 29, which improves the accuracy of the pose recognized from the skeleton data.
In a specific implementation, the skeleton data containing the skeleton coordinates is first converted into a 2 × T × K skeleton tensor, where K is the number of keypoints and T is the number of image frames in the video to be recognized. An additional objective confidence, obtained by max pooling each heatmap, is then added to the skeleton tensor. The confidence-augmented skeleton tensor is input to 2D convolutional neural network 14 (i.e., a 2D CNN) for pose recognition. Since the input pose sequence has small dimensions, the pooling operations normally used in a 2D CNN are removed, stride-2 convolutional layers are replaced with stride-1 layers, and a global pooling layer and a fully connected layer are attached at the end. The final prediction is optimized with a cross-entropy loss, and second pose recognition result 29 is obtained by the following formula (3):
where Y^p_action(y_i) is the y_i-th dimension of the class prediction vector Y^p_action of the 2D CNN, and L^p_action is second pose recognition result 29.
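Steps 801 to 803 amount to stacking the 2 × T × K coordinates with a max-pooled confidence channel and classifying the result. The NumPy sketch below replaces the real classifier (2D CNN 14) with a fixed logit vector; all names and the random inputs are assumptions for illustration only.

```python
import numpy as np

T, K = 8, 17                                  # frames, keypoints
coords = np.random.rand(2, T, K)              # the 2 x T x K skeleton tensor
heatmaps = np.random.rand(T, K, 28, 28)

# Step 802: objective confidence = global max pooling of each heatmap.
conf = heatmaps.max(axis=(2, 3))[None]        # 1 x T x K
skeleton = np.concatenate([coords, conf])     # 3 x T x K input to the 2D CNN

# Step 803 / formula (3) in spirit: softmax over the class prediction
# vector; the second pose recognition result is the argmax class.
def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

logits = np.array([0.1, 2.0, -0.5])           # stand-in 2D CNN output
probs = softmax(logits)
pose_class = int(np.argmax(probs))
```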
In the case of Fig. 2, converting the skeleton data into a skeleton tensor and adding the objective confidence yields fourth intermediate data 27 with data structure 8 × 17 × 3: time dimension 8, 17 keypoints, and three channels in total for the x/y coordinates plus the objective confidence. Fourth intermediate data 27 is then converted into fifth intermediate data 28, which 2D convolutional neural network 14 can process, with data structure 8 × 17 × 512: time dimension 8, 17 keypoints, 512 feature channels. Finally, fifth intermediate data 28 is input to 2D convolutional neural network 14 for pose recognition, obtaining second pose recognition result 29.
To measure the gap between the predicted output of the neural network and the actual value, and to correct the prediction according to that gap so that it approaches the actual value, the human body pose recognition method of the application further includes the following steps:
obtaining the first task loss incurred while performing pose recognition on the human body in the human figure region; and obtaining the second task loss incurred while recognizing the pose of the human body from the skeleton data.
Specifically, while step 302 performs pose recognition on the human body in the human figure region, the first task loss is obtained by the following formula (4):
where L_h(θ) is the first task loss, θ denotes the learnable parameters of the 3D convolutional neural network, R is the smooth L1 loss, and K is the number of keypoints. smooth L1 is given by the following formula (5):
Meanwhile, while step 303 recognizes the pose of the human body from the skeleton data, the second task loss is obtained by the following formula (6):
where L_o(θ) is the second task loss. The bias loss of formula (6) is the sum of the smooth L1 losses at each pixel position; the bias loss is only computed at positions within the circle of radius M around each keypoint.
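The smooth L1 loss named in formulas (4) to (6) has a standard closed form. The translated text does not spell the formula out, so the sketch below uses the conventional definition as an assumption.

```python
import numpy as np

def smooth_l1(diff):
    """Conventional smooth L1: 0.5*d^2 for |d| < 1, |d| - 0.5 otherwise.
    Quadratic near zero, linear for large errors (robust to outliers)."""
    d = np.abs(diff)
    return np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)
```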
Further, after the first task loss and the second task loss are obtained, step 304 determines the pose of the human body based on first pose recognition result 22, second pose recognition result 29, the first task loss, and the second task loss.
Specifically, an objective pose estimation loss can be obtained from the first and second task losses by the following formula (7):
L_p = λ_h · L_h(θ) + λ_o · L_o(θ)    formula (7)
where L_p is the objective pose estimation loss, and λ_h and λ_o are balancing weights used to balance the first task loss and the second task loss; both λ_h and λ_o take the value 0.5. The pose of the human body is then obtained by the following formula (8):
L = λ_1 · L_r + λ_2 · L_p + λ_3 · L^p_action    formula (8)
where L is the overall loss from which the pose of the human body is determined, and λ_1, λ_2, and λ_3 are the weights of the three task losses, each defaulting to 1.
By using the first task loss and the second task loss to correct first pose recognition result 22 and second pose recognition result 29, the application makes the finally determined human pose more accurate.
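Formulas (7) and (8) are plain weighted sums and can be sketched directly with the default weights from the text (λ_h = λ_o = 0.5, λ_1 = λ_2 = λ_3 = 1); the loss names and sample values are illustrative assumptions.

```python
def pose_estimation_loss(L_h, L_o, lam_h=0.5, lam_o=0.5):
    """Formula (7): objective pose estimation loss from the first task
    (heatmap) loss L_h and the second task (offset) loss L_o."""
    return lam_h * L_h + lam_o * L_o

def total_loss(L_r, L_p, L_action, lam1=1.0, lam2=1.0, lam3=1.0):
    """Formula (8): weighted sum of the three task losses."""
    return lam1 * L_r + lam2 * L_p + lam3 * L_action

L_p = pose_estimation_loss(0.4, 0.2)   # 0.5*0.4 + 0.5*0.2 = 0.3
L = total_loss(0.6, L_p, 0.5)          # 0.6 + 0.3 + 0.5 = 1.4
```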
Exemplary apparatus
Based on the same inventive concept, an embodiment of the application further provides a human body pose recognition device. As shown in Fig. 9, the device includes:
cutting module 901, for cutting out the human figure region containing a human body from the video to be recognized;
first output module 902, for performing pose recognition on the human body in the human figure region, obtaining the first pose recognition result and the human skeleton data corresponding to the human body;
second output module 903, for recognizing the pose of the human body based on the human skeleton data, obtaining the second pose recognition result;
determining module 904, for determining the pose of the human body based on the first pose recognition result and the second pose recognition result.
As shown in Fig. 10, second output module 903 includes:
pose recognition submodule 9031, for recognizing the pose of the human body based on the human skeleton data using a 2D convolutional neural network, obtaining the second pose recognition result.
As shown in Fig. 11, first output module 902 includes:
splitting submodule 9021, for splitting the human figure region into multiple human feature maps;
human skeleton coordinate acquisition submodule 9022, for obtaining, using the 3D convolutional neural network, the human skeleton coordinates corresponding to each human feature map;
determining submodule 9023, for taking the human skeleton coordinates corresponding to each human feature map as the human skeleton data.
As shown in Fig. 12, human skeleton coordinate acquisition submodule 9022 includes:
heatmap acquisition unit 90221, for obtaining, using the 3D convolutional neural network, the heatmap corresponding to each human body keypoint in the human feature map;
human body keypoint determination unit 90222, for determining, from each heatmap, the coordinate of each human body keypoint in the human feature map;
skeleton coordinate acquisition unit 90223, for obtaining the human skeleton coordinates corresponding to the human feature map from the coordinate of each human body keypoint in the human feature map.
As shown in Fig. 13, human body keypoint determination unit 90222 includes:
probability acquisition subunit 902221, for obtaining, using the trained pixel target corresponding to the heatmap, the probability that each pixel in the heatmap belongs to the human body keypoint;
coordinate determination subunit 902222, for determining the coordinate of the pixel with the maximum probability as the first coordinate of the human body keypoint;
superposition subunit 902223, for superimposing a bias vector on the first coordinate, obtaining the second coordinate of the human body keypoint;
amplification subunit 902224, for amplifying the second coordinate based on the proportional relationship between the heatmap and the human feature map, obtaining the coordinate of the human body keypoint in the human feature map.
As shown in Fig. 14, the device further includes:
offset data acquisition module 905, for obtaining, using the 3D convolutional neural network, the offset data corresponding to the human body keypoint;
data processing module 906, for normalizing the offset data based on the difference between the predicted coordinate of the human body keypoint and the pixel coordinates on the heatmap, obtaining the bias vector.
As shown in Fig. 15, pose recognition submodule 9031 includes:
data conversion unit 90311, for converting the human skeleton data into a human skeleton tensor;
adding unit 90312, for adding an objective confidence to the human skeleton tensor, where the objective confidence is obtained by max pooling each heatmap;
pose recognition unit 90313, for inputting the human skeleton tensor with the objective confidence into the 2D convolutional neural network, obtaining the second pose recognition result.
As shown in Fig. 16, the device further includes:
loss acquisition module 907, for obtaining the first task loss incurred while performing pose recognition on the human body in the human figure region, and the second task loss incurred while recognizing the pose of the human body based on the human skeleton data.
As shown in Fig. 17, determining module 904 includes:
determining submodule 9041, for determining the pose of the human body based on the first pose recognition result, the second pose recognition result, the first task loss, and the second task loss.
Example electronic device
An electronic device according to an embodiment of the application is described below with reference to Fig. 18, which illustrates its block diagram.
As shown in Fig. 18, electronic device 1801 includes one or more processors 18011 and a memory 18012.
Processor 18011 may be a central processing unit (CPU) or another form of processing unit with data processing and/or instruction execution capability, and may control other components of electronic device 1801 to perform the desired functions.
Memory 18012 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and/or cache memory; non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and processor 18011 may run the program instructions to implement the human body pose recognition method of the embodiments of the application described above and/or other desired functions. Various contents such as input signals, signal components, and noise components may also be stored in the computer-readable storage medium.
In one example, electronic device 1801 may further include input device 18013 and output device 18014, interconnected by a bus system and/or another form of connection mechanism (not shown).
For example, input device 18013 may be the microphone or microphone array described above, for capturing the input signal of a sound source. When the electronic device is a stand-alone device, input device 18013 may be a communication network connector. Input device 18013 may also include, for example, a keyboard and a mouse.
Output device 18014 may output various information, including the determined range information and direction information, and may include, for example, a display, a loudspeaker, a printer, a communication network, and remote output devices connected to it.
Of course, for simplicity, Fig. 18 shows only some of the components of electronic device 1801 that are relevant to the application; components such as buses and input/output interfaces are omitted. Depending on the specific application, electronic device 1801 may also include any other appropriate components.
Illustrative computer program product and computer-readable storage medium
In addition to the above method and device, embodiments of the application may also be a computer program product comprising computer program instructions that, when run by a processor, cause the processor to execute the steps of the human body pose recognition method according to the various embodiments of the application described in the "Illustrative Methods" section of this specification.
The computer program product may be written in any combination of one or more programming languages to produce program code for performing the operations of the embodiments of the application, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
In addition, embodiments of the application may also be a computer-readable storage medium on which computer program instructions are stored; when run by a processor, the instructions cause the processor to execute the steps of the human body pose recognition method according to the various embodiments of the application described in the "Illustrative Methods" section of this specification.
The computer-readable storage medium may be any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above.
The basic principles of the application have been described above in conjunction with specific embodiments. However, it should be noted that the merits, advantages, and effects mentioned in the application are merely examples and not limitations; it must not be assumed that they are required by every embodiment of the application. Furthermore, the specific details disclosed above serve only the purpose of example and ease of understanding, not limitation; the application is not required to be implemented using the specific details above.
The block diagrams of devices, apparatuses, equipment, and systems involved in the application are only illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the blocks. As those skilled in the art will recognize, these devices, apparatuses, equipment, and systems may be connected, arranged, and configured in any manner. Words such as "include", "comprise", and "have" are open-ended terms meaning "including but not limited to" and may be used interchangeably with that phrase. The words "or" and "and" as used herein mean "and/or" and may be used interchangeably with it, unless the context clearly indicates otherwise. The phrase "such as" as used herein means "such as, but not limited to" and may be used interchangeably with that phrase.
It should also be noted that the components or steps in the devices, apparatuses, and methods of the application may be decomposed and/or recombined; such decompositions and/or recombinations shall be regarded as equivalent schemes of the application.
The above description of the disclosed aspects is provided so that any person skilled in the art can make or use the application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the application. Therefore, the application is not intended to be limited to the aspects shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Moreover, this description is not intended to restrict the embodiments of the application to the forms disclosed herein. Although a number of exemplary aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.
Claims (11)
1. A human body pose recognition method, comprising:
cutting out a human figure region containing a human body from a video to be recognized;
performing pose recognition on the human body in the human figure region, obtaining a first pose recognition result and human skeleton data corresponding to the human body;
recognizing the pose of the human body based on the human skeleton data, obtaining a second pose recognition result;
determining the pose of the human body based on the first pose recognition result and the second pose recognition result.
2. The human body pose recognition method according to claim 1, wherein recognizing the pose of the human body based on the human skeleton data and obtaining the second pose recognition result comprises:
recognizing the pose of the human body based on the human skeleton data using a 2D convolutional neural network, obtaining the second pose recognition result.
3. The human body pose recognition method according to claim 1, wherein performing pose recognition on the human body in the human figure region and obtaining the human skeleton data corresponding to the human body comprises:
splitting the human figure region into multiple human feature maps;
obtaining the human skeleton coordinates corresponding to each human feature map respectively using a 3D convolutional neural network;
taking the human skeleton coordinates corresponding to each human feature map as the human skeleton data.
4. The human body pose recognition method according to claim 3, wherein obtaining the human skeleton coordinates corresponding to one human feature map using the 3D convolutional neural network comprises:
obtaining a heatmap corresponding to each human body keypoint in the human feature map using the 3D convolutional neural network;
determining, from each heatmap, the coordinate of each human body keypoint in the human feature map;
obtaining the human skeleton coordinates corresponding to the human feature map from the coordinate of each human body keypoint in the human feature map.
5. The human body pose recognition method according to claim 4, wherein determining the coordinate of the human body keypoint in the human feature map from the heatmap comprises:
obtaining, using a trained pixel target corresponding to the heatmap, the probability that each pixel in the heatmap belongs to the human body keypoint;
determining the coordinate of the pixel with the maximum probability as the first coordinate of the human body keypoint;
superimposing a bias vector on the first coordinate, obtaining the second coordinate of the human body keypoint;
amplifying the second coordinate based on the proportional relationship between the heatmap and the human feature map, obtaining the coordinate of the human body keypoint in the human feature map.
6. The human body pose recognition method according to claim 5, wherein before superimposing the bias vector on the first coordinate, the method further comprises:
obtaining offset data corresponding to the human body keypoint using the 3D convolutional neural network;
normalizing the offset data based on the difference between the predicted coordinate of the human body keypoint and the pixel coordinates on the heatmap, obtaining the bias vector.
7. The human body pose recognition method according to claim 4, wherein recognizing the pose of the human body based on the human skeleton data and obtaining the second pose recognition result comprises:
converting the human skeleton data into a human skeleton tensor;
adding an objective confidence to the human skeleton tensor, wherein the objective confidence is obtained by max pooling each heatmap;
inputting the human skeleton tensor with the objective confidence into a 2D convolutional neural network, obtaining the second pose recognition result.
8. The human body pose recognition method according to claim 1, further comprising:
obtaining a first task loss incurred while performing pose recognition on the human body in the human figure region; and
obtaining a second task loss incurred while recognizing the pose of the human body based on the human skeleton data;
wherein determining the pose of the human body based on the first pose recognition result and the second pose recognition result comprises:
determining the pose of the human body based on the first pose recognition result, the second pose recognition result, the first task loss, and the second task loss.
9. A human body pose recognition device, comprising:
a cutting module, for cutting out a human figure region containing a human body from a video to be recognized;
a first output module, for performing pose recognition on the human body in the human figure region, obtaining a first pose recognition result and human skeleton data corresponding to the human body;
a second output module, for recognizing the pose of the human body based on the human skeleton data, obtaining a second pose recognition result;
a determining module, for determining the pose of the human body based on the first pose recognition result and the second pose recognition result.
10. A computer-readable storage medium storing a computer program for executing the human body pose recognition method of any one of claims 1-8.
11. An electronic device, comprising:
a processor;
a memory for storing instructions executable by the processor;
the processor being configured to execute the human body pose recognition method of any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910363750.8A CN110135304A (en) | 2019-04-30 | 2019-04-30 | Human body method for recognizing position and attitude and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110135304A true CN110135304A (en) | 2019-08-16 |
Family
ID=67576034
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910363750.8A Pending CN110135304A (en) | 2019-04-30 | 2019-04-30 | Human body method for recognizing position and attitude and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110135304A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105069423A (en) * | 2015-07-29 | 2015-11-18 | 北京格灵深瞳信息技术有限公司 | Human body posture detection method and device |
CN109376663A (en) * | 2018-10-29 | 2019-02-22 | 广东工业大学 | A kind of human posture recognition method and relevant apparatus |
CN109460707A (en) * | 2018-10-08 | 2019-03-12 | 华南理工大学 | A kind of multi-modal action identification method based on deep neural network |
CN109670474A (en) * | 2018-12-28 | 2019-04-23 | 广东工业大学 | A kind of estimation method of human posture based on video, device and equipment |
Non-Patent Citations (1)
Title |
---|
JIAGANG ZHU et al.: "Action Machine: Rethinking Action Recognition in Trimmed Videos", arXiv:1812.05770v1 [cs.CV] * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111191599A (en) * | 2019-12-27 | 2020-05-22 | 平安国际智慧城市科技股份有限公司 | Gesture recognition method, device, equipment and storage medium |
CN111191599B (en) * | 2019-12-27 | 2023-05-30 | 平安国际智慧城市科技股份有限公司 | Gesture recognition method, device, equipment and storage medium |
JP7480302B2 (en) | 2019-12-27 | 2024-05-09 | ヴァレオ・シャルター・ウント・ゼンゾーレン・ゲーエムベーハー | Method and apparatus for predicting the intentions of vulnerable road users |
CN113743293A (en) * | 2021-09-02 | 2021-12-03 | 泰康保险集团股份有限公司 | Fall behavior detection method and device, electronic equipment and storage medium |
CN113743293B (en) * | 2021-09-02 | 2023-11-24 | 泰康保险集团股份有限公司 | Fall behavior detection method and device, electronic equipment and storage medium |
US20230377192A1 (en) * | 2022-05-23 | 2023-11-23 | Dell Products, L.P. | System and method for detecting postures of a user of an information handling system (ihs) during extreme lighting conditions |
US11836825B1 (en) * | 2022-05-23 | 2023-12-05 | Dell Products L.P. | System and method for detecting postures of a user of an information handling system (IHS) during extreme lighting conditions |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102266529B1 (en) | Method, apparatus, device and readable storage medium for image-based data processing | |
US10902056B2 (en) | Method and apparatus for processing image | |
CN110135304A (en) | Human body method for recognizing position and attitude and device | |
US9349076B1 (en) | Template-based target object detection in an image | |
CN113362382A (en) | Three-dimensional reconstruction method and three-dimensional reconstruction device | |
CN111160269A (en) | Face key point detection method and device | |
CN111931764B (en) | Target detection method, target detection frame and related equipment | |
JP6571108B2 (en) | Real-time 3D gesture recognition and tracking system for mobile devices | |
CN103324938A (en) | Method for training attitude classifier and object classifier and method and device for detecting objects | |
CN111680678B (en) | Target area identification method, device, equipment and readable storage medium | |
JP2020518051A (en) | Face posture detection method, device and storage medium | |
CN111597884A (en) | Facial action unit identification method and device, electronic equipment and storage medium | |
WO2023151237A1 (en) | Face pose estimation method and apparatus, electronic device, and storage medium | |
CN104463240A (en) | Method and device for controlling list interface | |
Fei et al. | Flow-pose Net: An effective two-stream network for fall detection | |
CN111126515A (en) | Model training method based on artificial intelligence and related device | |
Shi et al. | DSFNet: a distributed sensors fusion network for action recognition | |
Zhang et al. | A posture detection method for augmented reality–aided assembly based on YOLO-6D | |
Zhang | Innovation of English teaching model based on machine learning neural network and image super resolution | |
US20210097377A1 (en) | Method and apparatus for image recognition | |
CN110969138A (en) | Human body posture estimation method and device | |
CN113239915B (en) | Classroom behavior identification method, device, equipment and storage medium | |
Mao et al. | A deep learning approach to track Arabidopsis seedlings’ circumnutation from time-lapse videos | |
Rawat et al. | Indian Sign Language Recognition System for Interrogative Words Using Deep Learning | |
CN113642565B (en) | Object detection method, device, equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190816 |