CN112131979A - Continuous action identification method based on human skeleton information - Google Patents


Publication number
CN112131979A
Authority
CN
China
Prior art keywords
node
distance
nodes
action
motion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010941604.1A
Other languages
Chinese (zh)
Inventor
黎张子康
周小舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202010941604.1A
Publication of CN112131979A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Abstract

The invention discloses a continuous action recognition method based on human skeleton information, comprising the following steps: (1) extracting the human skeleton to obtain position information for a plurality of body nodes; (2) distinguishing key nodes from non-key nodes in the action, and taking the distance information of the key nodes during the continuous action as the observation feature sequence for action recognition; (3) normalizing the observation feature sequence; (4) computing the probability of the observation sequence corresponding to the continuous action using a trained HMM model; and (5) taking a weighted average of the resulting probabilities according to the weight each node carries in each action, and outputting the final result. The method reduces the influence of non-key node information on action recognition, applies a reasoned weighted average to the probabilities computed from the different node sequences, reduces the amount of computation, improves recognition accuracy, and improves the user experience.

Description

Continuous action identification method based on human skeleton information
Technical Field
The invention relates to the technical field of human body action recognition, in particular to a continuous action recognition method based on human body skeleton information.
Background
At present, human action recognition is a popular research topic in computer vision, with applications across machine learning, image processing and related fields.
Many devices can now acquire human skeleton data, such as the Kinect, RealSense and other RGB-D depth-sensing cameras, which mainly comprise a microphone array, an infrared emitter and infrared receiver used for depth imaging, and an RGB camera. A depth-sensing camera of this kind separates the human body from the background using a segmentation strategy and feeds the separated body image into body-part recognition models trained on a cluster system with terabytes of data, producing a human skeleton model with up to 32 joint points. It outputs skeleton data at about 30 frames per second and can recognize the skeletons of up to 6 people.
The recognition method of the invention is based on human skeleton data. In existing body action recognition algorithms, researchers mostly use the directions or distances of many nodes, acquired simultaneously, as the recognition features. This covers as much action information as possible in one sequence, but for a specific action it has problems: not every node changes position or direction significantly during the action, the relative amplitude of change of each node differs from action to action, and recognizing directly on all the information increases the influence of non-key node information on the recognition rate.
Disclosure of Invention
To solve the above problems, the invention discloses a continuous action recognition method based on human skeleton information that reduces the influence of non-key node information on action recognition, normalizes the distance features, reduces the amount of computation, improves recognition stability, and improves the user experience.
The invention relates to a continuous action identification method based on human skeleton information, which comprises the following steps:
(1) extracting a skeleton of a human body to obtain position information of a plurality of nodes of the human body;
(2) distinguishing key nodes from non-key nodes in the action, and taking the distance information of the key nodes during the continuous action as the observation feature sequence for action recognition;
(3) normalizing the observed value characteristic sequence;
(4) performing probability calculation on the observation sequence corresponding to the continuous action by adopting an HMM model;
(5) and carrying out weighted average on the obtained probability according to different weights occupied by the corresponding nodes of different actions, and outputting a final result.
A further improvement of the invention is that:
in step (1), the plurality of body nodes comprise the neck, spine, chest, left shoulder, left elbow, left wrist, right shoulder, right elbow and right wrist (NECK, SPINE_NAVAL, SPINE_CHEST, SHOULDER_LEFT, ELBOW_LEFT, WRIST_LEFT, SHOULDER_RIGHT, ELBOW_RIGHT, WRIST_RIGHT).
A further improvement of the invention is that:
in step (2), the spatial position of an acquired node may not change significantly, meaning the node is not performing a deliberate movement. Collection of a node's distance data sequence therefore starts only when the distance between two consecutively recorded spatial positions of the node exceeds the distance threshold, and collection of the node's ratio sequence stops when that distance falls below the threshold. For a node with no significant change the sequence is empty and the model output probability is 0. Before recognition starts, certain nodes can also be manually excluded for certain actions; their data are then not collected and their output probability is directly set to 0.
A further improvement of the invention is that:
in step (3), the observation feature sequence for action recognition consists of six ratios, each taking the distance from a node to the neck node and dividing it by the distance from the chest node to the spine node: O1 for the left shoulder node, O2 for the right shoulder node, O3 for the left elbow node, O4 for the right elbow node, O5 for the left wrist node, and O6 for the right wrist node. The data for one static frame is O = {O1, O2, O3, O4, O5, O6}.
A further improvement of the invention is that:
in step (4), HMM models are trained iteratively with the Baum-Welch algorithm on a training database, with one HMM model per key node per action, so that n actions yield 6n output probabilities. When the length of a ratio sequence exceeds the length threshold λ, the forward-algorithm output probability of that node's ratio sequence is computed under each action model.
A further improvement of the invention is that:
in step (5), the output probabilities of the key nodes under each action model are weighted and averaged according to the share each key node has in that action, giving one total probability per action. The total probabilities are compared to find the maximum. If the maximum is too small, the action performed by the user does not belong to any action in the database: when the maximum probability is below the threshold of the corresponding action, the action is judged to be undefined; otherwise the action corresponding to the maximum probability is the final recognition result.
Advantageous effects: compared with the prior art, the feature extraction method for action recognition extracts only the key node data relevant to the action and sets the output probability of non-key nodes to 0, which greatly reduces the amount of computation, reduces the influence of irrelevant quantities on the recognition rate, and improves the user experience. The feature distances are normalized to account for differences in body size between individuals, which improves recognition stability. When the final output probability is computed, the probabilities obtained from the key node sequences are weighted and averaged, which improves recognition accuracy.
Drawings
FIG. 1 is a Kinect depth camera coordinate system according to an embodiment of the present invention;
FIG. 2 shows a human joint position and connection mode obtained by Kinect according to an embodiment of the present invention;
FIG. 3 is a graph of node distance ratios according to the present invention;
FIG. 4 is a diagram of HMM model parameters according to the present invention;
FIG. 5 is a flow chart of a continuous motion recognition method based on human skeleton information according to the present invention.
Detailed Description
The present invention will be further illustrated with reference to the accompanying drawings and specific embodiments, which are to be understood as merely illustrative of the invention and not as limiting the scope of the invention. It should be noted that the terms "front," "back," "left," "right," "upper" and "lower" used in the following description refer to directions in the drawings, and the terms "inner" and "outer" refer to directions toward and away from, respectively, the geometric center of a particular component.
The invention relates to a continuous action identification method based on human skeleton information, which comprises the following steps:
step 1, extracting a skeleton of a human body to obtain position information of a plurality of nodes of the human body;
As shown in FIGS. 1 and 2, this example uses a Kinect device to obtain the position information of the body skeleton nodes: neck, spine, chest, left shoulder, left elbow, left wrist, right shoulder, right elbow and right wrist (NECK, SPINE_NAVAL, SPINE_CHEST, SHOULDER_LEFT, ELBOW_LEFT, WRIST_LEFT, SHOULDER_RIGHT, ELBOW_RIGHT, WRIST_RIGHT).
Step 2, distinguishing key nodes from non-key nodes in the action, and taking the distance information of the key nodes during the continuous action as the observation feature sequence for action recognition;
The spatial position of an acquired node may not change significantly, meaning the node is not performing a deliberate movement. Collection of a node's distance data sequence therefore starts only when the distance between two consecutively recorded spatial positions of the node exceeds the distance threshold, and collection of the node's ratio sequence stops when that distance falls below the threshold. For the right wrist, training showed that a distance threshold of 10 records the start and end of an action best. For a node with no significant change the sequence is empty and the model output probability is 0. Before recognition starts, certain nodes can also be manually excluded for certain actions; their data are then not collected and their output probability is directly set to 0.
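The start and stop rule of step 2 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `collect_segments` is hypothetical, positions are assumed to be (x, y, z) tuples in whatever unit the threshold of 10 refers to (the text does not state it), and a step exactly equal to the threshold is treated as "not moving", which the text leaves unspecified.

```python
import math

def collect_segments(positions, distance_threshold=10.0):
    """Split one node's trajectory into motion segments.

    positions: list of (x, y, z) tuples, one per recorded frame.
    Returns a list of segments, each a list of frame indices during
    which the node moved more than the threshold between frames.
    """
    segments, current = [], []
    for i in range(1, len(positions)):
        step = math.dist(positions[i - 1], positions[i])
        if step > distance_threshold:
            current.append(i)          # node is moving: keep collecting
        elif current:
            segments.append(current)   # movement stopped: close the segment
            current = []
    if current:                        # trajectory ended while still moving
        segments.append(current)
    return segments
```

A node that never moves more than the threshold yields an empty list, which corresponds to the empty sequence with output probability 0 described above.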
Step 3, normalizing the observed value characteristic sequence;
As shown in FIG. 3, the observation feature sequence for action recognition consists of six ratios, each taking the distance from a node to the neck node and dividing it by the distance from the chest node to the spine node: O1 for the left shoulder node, O2 for the right shoulder node, O3 for the left elbow node, O4 for the right elbow node, O5 for the left wrist node, and O6 for the right wrist node. The data for one static frame is O = {O1, O2, O3, O4, O5, O6}.
During action recognition, bone lengths differ greatly between users, so the observation feature sequence used for gesture recognition must accommodate these differences while retaining as much of the original information as possible, so that recognition accuracy is not degraded by information loss.
In terms of adaptability across users, the normalized distances between the neck node and the shoulder, elbow and wrist nodes adapt well to the upper limbs of different users. Bone node distances differ from person to person, but the ratios obtained after dividing by the spine-to-chest distance differ little, which makes them well suited as the observation feature sequence for action recognition. As shown in FIG. 2, the neck-to-wrist distance varies from person to person and is not suitable as a recognition feature on its own, so it must be normalized: the neck-to-wrist distance is divided by the chest-to-spine distance to give the wrist's ratio in a given frame. This ratio differs very little when different people perform the same action, so it can be used as an observation feature, and the ratios obtained over the whole action form the wrist's feature sequence.
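The normalization of step 3 can be sketched as below. The joint names follow the identifiers listed in step 1; the function name `frame_features` and the dictionary input format are assumptions made for illustration only.

```python
import math

# Order matches O1..O6 in the text: shoulders, elbows, wrists (left then right).
ARM_NODES = [
    "SHOULDER_LEFT", "SHOULDER_RIGHT",
    "ELBOW_LEFT", "ELBOW_RIGHT",
    "WRIST_LEFT", "WRIST_RIGHT",
]

def frame_features(joints):
    """Return [O1, ..., O6] for one frame.

    joints: dict mapping joint name to an (x, y, z) position.
    Each feature is the node-to-neck distance divided by the
    chest-to-spine distance, so it is independent of body size.
    """
    torso = math.dist(joints["SPINE_CHEST"], joints["SPINE_NAVAL"])
    if torso == 0:
        raise ValueError("chest and spine nodes coincide; cannot normalize")
    neck = joints["NECK"]
    return [math.dist(joints[name], neck) / torso for name in ARM_NODES]
```

Scaling every joint position by the same factor leaves the six ratios unchanged, which is the property the paragraph above relies on.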
Step 4, performing probability calculation on the observation sequence corresponding to the continuous action by adopting an HMM model;
For a feature sequence input to the model, the sequence length reflects the extent of the movement. A movement that is too short may be an accidentally triggered gesture, and its probability value is not meaningful, so the forward-algorithm output probability of a node's ratio sequence under each action model is computed only when the length of the ratio sequence exceeds a set threshold λ. Training showed that the sequences input to the model are most informative when the length threshold λ is 70% of the average sequence length. For each action, each key node has one HMM model; the HMM models for each action are trained iteratively with the Baum-Welch algorithm on a training database, and each model outputs one probability, so defining n actions yields 6n output probabilities.
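A minimal sketch of the scoring in step 4 follows. It assumes a discrete-observation HMM, that is, ratio values already quantized to integer symbols; the text does not say whether the observations are discrete or continuous. The function names are hypothetical, and returning 0 for a sequence that is too short is also an assumption, since the text only says the probability is not computed in that case.

```python
def length_threshold(training_lengths, fraction=0.7):
    """Length threshold λ: 70% of the average training sequence length."""
    return fraction * sum(training_lengths) / len(training_lengths)

def forward_probability(obs, start, trans, emit):
    """Forward algorithm: P(obs | model) for a discrete HMM.

    start[s]    : initial probability of state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability of emitting symbol o in state s
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)

def score_sequence(obs, model, lam):
    """Score one node's ratio sequence under one action model."""
    if len(obs) <= lam:
        return 0.0  # assumption: too-short sequences contribute probability 0
    start, trans, emit = model
    return forward_probability(obs, start, trans, emit)
```

For long sequences the raw forward probability becomes very small, so a practical implementation would typically use scaling or log-probabilities; that refinement is omitted here for brevity.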
Step 5, taking a weighted average of the resulting probabilities according to the weight each node carries in each action, and outputting the final result.
In the training stage of an action, the forward-algorithm output values of 30 groups of training data are computed in each training iteration and used as the basis for judging convergence. The forward-algorithm output values of the 30 groups computed in the last iteration are then extracted and used to define the probability threshold of that action, where h > 0 is a tuning parameter, μ is the mean and σ is the standard deviation of those values. The threshold is stored as a model parameter in the corresponding action object.
threshold = μ - h * σ
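The threshold μ - h * σ can be computed from the final training outputs as in the sketch below. The function name is hypothetical, and the use of the population standard deviation is an assumption; the text does not say whether the population or sample form is intended.

```python
import statistics

def probability_threshold(outputs, h=1.0):
    """Per-action rejection threshold: mean minus h standard deviations.

    outputs: forward-algorithm output values of the training groups
             from the last training iteration (30 groups in the text).
    h:       tuning parameter, must be positive.
    """
    if h <= 0:
        raise ValueError("h must be positive")
    mu = statistics.mean(outputs)
    sigma = statistics.pstdev(outputs)  # population form, by assumption
    return mu - h * sigma
```

A larger h lowers the threshold, so fewer performed actions are rejected as undefined.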
Once the output probabilities of the key node sequences have been obtained, the key nodes whose probabilities are not 0 are weighted. Even the same key node, for example the right hand node, changes by different amounts in different actions: in an action where its change is larger the node should be assigned more weight, and where its change is smaller the weight is smaller too, so that the resulting total probability is more representative of the action.
After the output probabilities of the key nodes under each action model have been weighted and averaged, each action has one total probability. The total probabilities are compared to find the maximum. If the maximum is too small, the action performed by the user does not belong to any action in the database: when the maximum probability is below the threshold of the corresponding action, the action is judged to be undefined; otherwise the action corresponding to the maximum probability is the final recognition result.
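The weighted average and the final decision of step 5 can be sketched as follows. The function name, the dictionary layout and the example weights are illustrative assumptions; the text does not give concrete weight values.

```python
def recognize(node_probs, weights, thresholds):
    """Combine per-node probabilities and pick the recognized action.

    node_probs[action] : six per-node output probabilities (0 for
                         nodes that did not move or were excluded)
    weights[action]    : six non-negative node weights for that action,
                         not all zero
    thresholds[action] : rejection threshold for that action
    Returns (action or None, totals); None means "undefined action".
    """
    totals = {}
    for action, probs in node_probs.items():
        w = weights[action]
        totals[action] = sum(p * wi for p, wi in zip(probs, w)) / sum(w)
    best = max(totals, key=totals.get)
    if totals[best] < thresholds[best]:
        return None, totals  # maximum is too small: undefined action
    return best, totals
```

For example, an action dominated by the right wrist would carry most of its weight on the sixth entry, so a high right-wrist probability drives that action's total.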
The technical means disclosed in the scheme of the invention are not limited to those disclosed in the above embodiments, but also include technical schemes formed by any combination of the above technical features.

Claims (6)

1. A continuous action recognition method based on human skeleton information is characterized in that,
the method comprises the following steps:
(1) extracting a skeleton of a human body to obtain position information of a plurality of nodes of the human body;
(2) distinguishing key nodes from non-key nodes in the action, and taking the distance information of the key nodes during the continuous action as the observation feature sequence for action recognition;
(3) normalizing the observed value characteristic sequence;
(4) performing probability calculation on the observation sequence corresponding to the continuous action by adopting an HMM model;
(5) and carrying out weighted average on the obtained probability according to different weights occupied by the corresponding nodes of different actions, and outputting a final result.
2. The method for recognizing continuous motion based on human skeletal information as claimed in claim 1, wherein in the step (1): the multiple joints of the body comprise neck, spine, chest, left shoulder, left elbow, left wrist, right shoulder, right elbow and right wrist.
3. The method for continuous motion recognition based on human skeletal information of claim 1, wherein in step (2), when the distance between two previous and subsequent recorded spatial positions of a node is greater than a distance threshold, the collection of the distance data sequence of the node is started, and when the distance between two previous and subsequent recorded spatial positions of the node is less than the distance threshold, the collection of the ratio sequence of the node is stopped; before the identification is started, data of certain nodes are not collected by artificially setting certain actions, and the output probability is directly assigned to be 0.
4. The method for continuous motion recognition based on human skeletal information of claim 1, wherein in the step (3), the observed feature sequence of motion recognition is a ratio O1 of a distance from a left shoulder node to a neck node to a distance from a chest node to a spine node, a ratio O2 of a distance from a right shoulder node to a neck node to a distance from a chest node to a spine node, a ratio O3 of a distance from a left elbow node to a neck node to a distance from a chest node to a spine node, a ratio O4 of a distance from a right elbow node to a neck node to a distance from a chest node to a spine node, a ratio O5 of a distance from a left wrist node to a neck node to a distance from a chest node to a spine node, and a ratio O6 of a distance from a right wrist node to a neck node to a distance from a chest node to a spine node; the static frame data is O = {O1, O2, O3, O4, O5, O6}.
5. The method for recognizing continuous motion based on human skeletal information as claimed in claim 1, wherein in step (4), when the length of the ratio sequence is greater than the length threshold λ, the forward algorithm output probability value of the node ratio sequence under each motion model is calculated; and (3) iteratively training an HMM model by using a Baum-Welch algorithm according to a training database, wherein each key node has one HMM model for each action, and the n actions obtain 6n output probabilities.
6. The method according to claim 1, wherein in the step (5), the probabilities of the nodes obtained by each motion model are weighted and averaged according to the different ratios of the nodes in different motions, so that each motion finally obtains one total probability; the total probabilities are compared to obtain a maximum value; if the maximum value is too small, the motion performed by the user does not belong to any existing motion in a database, that is, if the maximum probability is smaller than a threshold corresponding to the motion, the motion is determined to be an undefined motion; otherwise, the motion corresponding to the maximum probability is the final recognition result.
CN202010941604.1A 2020-09-09 2020-09-09 Continuous action identification method based on human skeleton information Pending CN112131979A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010941604.1A CN112131979A (en) 2020-09-09 2020-09-09 Continuous action identification method based on human skeleton information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010941604.1A CN112131979A (en) 2020-09-09 2020-09-09 Continuous action identification method based on human skeleton information

Publications (1)

Publication Number Publication Date
CN112131979A true CN112131979A (en) 2020-12-25

Family

ID=73846258

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010941604.1A Pending CN112131979A (en) 2020-09-09 2020-09-09 Continuous action identification method based on human skeleton information

Country Status (1)

Country Link
CN (1) CN112131979A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837112A (en) * 2021-09-27 2021-12-24 联想(北京)有限公司 Video data processing method and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573665A (en) * 2015-01-23 2015-04-29 北京理工大学 Continuous motion recognition method based on improved viterbi algorithm
CN106022213A (en) * 2016-05-04 2016-10-12 北方工业大学 Human body motion recognition method based on three-dimensional bone information
WO2018011336A1 (en) * 2016-07-13 2018-01-18 Naked Labs Austria Gmbh Skeleton estimation from body mesh

Similar Documents

Publication Publication Date Title
CN108764031B (en) Method, device, computer equipment and storage medium for recognizing human face
CN108319930B (en) Identity authentication method, system, terminal and computer readable storage medium
US8467571B2 (en) Ordered recognition of connected objects
KR101901591B1 (en) Face recognition apparatus and control method for the same
WO2008007471A1 (en) Walker tracking method and walker tracking device
CN114067358A (en) Human body posture recognition method and system based on key point detection technology
CN110490109B (en) Monocular vision-based online human body rehabilitation action recognition method
CN112633196A (en) Human body posture detection method and device and computer equipment
CN105373810B (en) Method and system for establishing motion recognition model
CN110796101A (en) Face recognition method and system of embedded platform
CN113516005B (en) Dance action evaluation system based on deep learning and gesture estimation
Yan et al. Human-object interaction recognition using multitask neural network
CN114821786A (en) Gait recognition method based on human body contour and key point feature fusion
CN111860117A (en) Human behavior recognition method based on deep learning
CN112131979A (en) Continuous action identification method based on human skeleton information
JP6875058B2 (en) Programs, devices and methods for estimating context using multiple recognition engines
CN106778576A (en) A kind of action identification method based on SEHM feature graphic sequences
CN108256578B (en) Gray level image identification method, device, equipment and readable storage medium
CN110766093A (en) Video target re-identification method based on multi-frame feature fusion
Arowolo et al. Development of a human posture recognition system for surveillance application
KR20160113966A (en) Method and apparatus for recognizing action
CN111444374B (en) Human body retrieval system and method
CN113378799A (en) Behavior recognition method and system based on target detection and attitude detection framework
TW201824020A (en) Analysis system of humanity action
CN113052087A (en) Face recognition method based on YOLOV5 model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination