CN108764107B - Behavior and identity combined identification method and device based on human body skeleton sequence - Google Patents


Info

Publication number
CN108764107B
Authority
CN
China
Prior art keywords
sequence
human body
behavior
identity
skeleton sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810499463.5A
Other languages
Chinese (zh)
Other versions
CN108764107A (en
Inventor
王亮
王洪松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201810499463.5A
Publication of CN108764107A
Application granted
Publication of CN108764107B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of visual recognition and provides a behavior and identity joint recognition method based on a human body skeleton sequence, aiming to solve the problem that identity information and behavior cannot be recognized simultaneously in human body data recognition. The method comprises the following steps: acquiring a human body skeleton sequence of a human body to be identified; and recognizing the identity information and the behavior of the human body from the human body skeleton sequence by using a pre-constructed recognition model. The training method of the recognition model comprises the following steps: converting the coordinates of the human body skeleton sequence used for training into a reference coordinate system to obtain a reference skeleton sequence; comparing the coordinates of each joint node of each reference skeleton in the reference skeleton sequence with the coordinates of a pre-specified central point to obtain the relative coordinates of each joint node of each reference skeleton; and performing three-dimensional coordinate transformation on the reference skeleton sequence and training the initial recognition model to obtain the optimized recognition model. The invention can quickly and accurately identify the identity information and the behavior of the human body from the human body skeleton sequence.

Description

Behavior and identity combined identification method and device based on human body skeleton sequence
Technical Field
The invention relates to the technical field of computer vision, in particular to the field of vision based on deep learning, and specifically relates to a behavior and identity joint identification method and device based on a human body skeleton sequence.
Background
With the development of computer graphics, computer vision and human-computer interaction technology, it is becoming increasingly important to present the behavior and identity information of a detected or monitored person in a timely and accurate manner. Behavior recognition and identity recognition are applied in fields such as automatic driving, human-computer interaction, smart cities, intelligent transportation and intelligent monitoring.
With the development in recent years of depth cameras (e.g., Kinect) and of accurate and efficient human pose estimation algorithms, behavior recognition based on human skeleton sequences is becoming increasingly popular. A skeleton sequence directly reflects the motion of the human body and has advantages such as small input data size and freedom from background interference. Methods based on deep neural networks can automatically learn features and recognize behaviors from a raw skeleton sequence; however, identity recognition based on human skeleton sequences has been neglected.
A person's motion sequence over time reflects both the person's behavior and the person's identity; for example, gait recognition research determines a person's identity from the way the person walks. However, behavior and identity have so far been recognized separately, and the action and the identity of a pedestrian cannot be recognized simultaneously from the same motion sequence.
Disclosure of Invention
The invention aims to solve the technical problem that identity information and behavior cannot be recognized simultaneously in human body skeleton data recognition. To this end, the invention provides a behavior and identity joint identification method and device based on a human skeleton sequence.
In a first aspect, the behavior and identity joint identification method based on the human skeleton sequence provided by the invention comprises the following steps: acquiring a human body skeleton sequence of a human body to be identified; predicting, by using a pre-constructed recognition model, the probability of each preset identity category and the probability of each preset behavior category according to the human body skeleton sequence; determining the identity category of the human body to be identified according to the predicted identity category probabilities; and determining the behavior category of the human body to be identified according to the predicted behavior category probabilities. The recognition model is an identity category and behavior category probability prediction model constructed based on a deep recurrent neural network.
Further, in a preferred technical solution provided by the present invention, before the step of "predicting the probability of each preset identity category and the probability of each preset behavior category according to the human skeleton sequence based on a pre-constructed recognition model", the method further includes: performing coordinate conversion on a preset human body skeleton sequence training sample based on a preset reference coordinate system to obtain a first reference skeleton sequence; acquiring the position coordinates of a preset human body central point at each moment corresponding to the first reference skeleton sequence; subtracting the corresponding human body skeleton coordinate mean value from the position coordinates of the joint points at each moment in the first reference skeleton sequence to obtain a second reference skeleton sequence; performing three-dimensional coordinate transformation on the second reference skeleton sequence according to a preset rotation angle to obtain a third reference skeleton sequence; acquiring the coordinate change feature of each joint point according to the third reference skeleton sequence; fusing the obtained coordinate change features to obtain a feature sequence; and training the recognition model according to the feature sequence based on a preset model loss function.
Further, in a preferred embodiment of the present invention, before the step of subtracting the corresponding human body skeleton coordinate mean value from the position coordinates of the joint points at each moment in the first reference skeleton sequence to obtain the second reference skeleton sequence, the method includes: acquiring the coordinates of a plurality of preset central points of the human skeleton; and calculating the coordinate mean value of the plurality of central points from the acquired coordinates. In this case, the subtraction step consists of subtracting the corresponding central point coordinate mean value from the position coordinates of the joint points at each moment in the first reference skeleton sequence to obtain the second reference skeleton sequence.
Further, in a preferred technical solution provided by the present invention, the step of performing three-dimensional coordinate transformation on the second reference skeleton sequence according to a preset rotation angle to obtain a third reference skeleton sequence includes performing three-dimensional coordinate transformation on each joint node by using the following transformation formula:
$$R = R_z(\gamma)\, R_y(\beta)\, R_x(\alpha)$$
where $R$ is the three-dimensional rotation transformation matrix, and $R_x(\alpha)$, $R_y(\beta)$ and $R_z(\gamma)$ are the rotation matrices about the x, y and z coordinate axes, which take the form:
$$R_x(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix}$$
$$R_y(\beta) = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}$$
$$R_z(\gamma) = \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
and $\alpha$, $\beta$ and $\gamma$ are the rotation angles about the x, y and z coordinate axes.
Further, in a preferred technical solution provided by the present invention, the step of "fusing the obtained coordinate change features to obtain a feature sequence" includes: concatenating the coordinates of the joint points at each moment after the coordinate transformation into a feature vector to obtain a feature sequence.
Further, in a preferred embodiment of the present invention, the model loss function is given by the following formula:
$$L = \lambda L^{(1)} + (1 - \lambda) L^{(2)}$$
where $\lambda$ is a preset weighting coefficient with $0 \le \lambda \le 1$, and $L^{(1)}$ and $L^{(2)}$ are the loss functions for behavior recognition and identity recognition, respectively, as follows:
Figure BDA0001669904090000034
where
Figure BDA0001669904090000035
are the category labels of the behavior and identity of the n-th sample, and N is the total number of samples;
the step of training the recognition model according to the feature sequence based on a preset model loss function comprises: training the recognition model with the BPTT algorithm according to the third reference skeleton sequence.
Further, in a preferred embodiment of the present invention, the center point includes a center point of a left hip joint, a center point of a right hip joint, and a center point of a hip, or the center point includes a center point of a left shoulder joint, a center point of a right shoulder joint, and a center point of a chest.
Further, in a preferred embodiment provided by the present invention, the deep recurrent neural network is a multi-layer bidirectional recurrent neural network or a unidirectional recurrent neural network; the multi-layer bidirectional recurrent neural network comprises a plurality of long short-term memory (LSTM) networks.
Further, in a preferred technical solution provided by the present invention, the fully connected layer in the network structure of the recognition model includes a first fully connected layer and a second fully connected layer; the first fully connected layer is used for predicting the probability of each preset behavior category according to the human body skeleton sequence; the second fully connected layer is used for predicting the probability of each preset identity category according to the human body skeleton sequence.
In a second aspect, the present invention provides a storage device storing one or more programs, where the programs are adapted to be loaded and executed by a processor to implement the behavior and identity joint recognition method based on the human skeleton sequence according to the above technical solution.
In a third aspect, the present invention provides a processing apparatus comprising a processor adapted to execute programs; and a storage device adapted to store a plurality of programs; the program is suitable for being loaded and executed by a processor to realize the behavior and identity joint identification method based on the human body skeleton sequence.
Compared with the closest prior art, the technical scheme has at least the following beneficial effects:
In the behavior and identity joint recognition method based on the human body skeleton sequence provided by the invention, a pre-constructed recognition model predicts the identity category probabilities and the behavior category probabilities for the human body skeleton sequence to be recognized, and the identity and the behavior of the human body corresponding to the skeleton sequence are determined from these predicted probabilities, thereby realizing joint recognition of identity and behavior from the human body skeleton sequence; the use of the multi-layer bidirectional recurrent neural network improves the prediction accuracy of the identity category and behavior category probabilities.
Drawings
FIG. 1 is a schematic diagram illustrating the main steps of behavior and identity joint identification based on human skeleton sequence in an embodiment of the present invention;
FIG. 2 is a schematic diagram of the network structure of a neuron of the recognition model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the main structure of the bidirectional recurrent neural network of the recognition model in the embodiment of the present invention;
FIG. 4 is a schematic diagram of recognizing the behavior and identity information of the human body corresponding to a human body skeleton sequence by using the recognition model in the embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that the embodiments and features of the embodiments of the present invention may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
Referring to fig. 1, fig. 1 illustrates the main steps of behavior and identity joint identification based on human skeleton sequence in this embodiment. The behavior and identity combined identification method based on the human body skeleton sequence comprises the following steps:
Step 1, obtaining a human body skeleton sequence of a human body to be identified.
In this embodiment, an electronic device or application platform that applies the behavior and identity joint identification method obtains the human skeleton sequence on which behavior recognition and identity verification are to be performed. The human skeleton sequence is obtained from a terminal device connected to the electronic device or application platform; specifically, the terminal device can obtain skeleton data of a person within the recognition area through a Kinect sensor connected to it. The human body skeleton sequence is the time-ordered sequence of skeleton data of the same person.
The skeleton data may be image data of a human body detected by the Kinect sensor, and each frame of image data detected by the Kinect sensor may be data representing the trunk and each joint of the human body; the skeleton data includes the joint point coordinates of the human skeleton.
Step 2, predicting the probability of each preset identity category and the probability of each preset behavior category according to the human body skeleton sequence by using a pre-constructed recognition model.
In this embodiment, the electronic device or application platform applies a pre-established recognition model to the human skeleton sequence obtained in step 1 and predicts the probability of each preset identity category and the probability of each preset behavior category. The recognition model may be an identity category and behavior category probability prediction model constructed based on a deep recurrent neural network, for example a Siamese network model, which is used to complete identity verification and behavior recognition for the human skeleton sequence to be identified. The input of the recognition model is a sequence of human body skeleton data, and the output is the identity category probabilities and the behavior category probabilities of the human body corresponding to the input skeleton sequence. The identity information and behavior information of human bodies are stored in advance in a storage unit or database of the electronic device or application platform. Specifically, the recognition model predicts the probability that the human skeleton sequence corresponds to each identity category in the pre-stored identity information, and the probability that the human skeleton sequence corresponds to each behavior category in the pre-stored behavior information.
Step 3, determining the identity category of the human body to be identified according to the predicted identity category probabilities, and determining the behavior category of the human body to be identified according to the predicted behavior category probabilities.
In this embodiment, the identity category and the behavior category of the human body corresponding to the human body skeleton sequence are determined from the magnitudes of the identity category probabilities and behavior category probabilities predicted in step 2. The identity category may be information for distinguishing human identities, and the behavior category may be information for distinguishing human behaviors.
Further, in a preferred technical solution provided in this embodiment, before the step of "predicting the probability of each preset identity class and the probability of each preset behavior class according to the human skeleton sequence based on a pre-constructed recognition model", the method further includes: performing coordinate conversion on a preset human body skeleton sequence training sample based on a preset reference coordinate system to obtain a first reference skeleton sequence; acquiring the position coordinates of a preset human body central point at each moment corresponding to the first reference skeleton sequence; subtracting the corresponding human body skeleton coordinate mean value from the position coordinates of the joint points at each moment in the first reference skeleton sequence to obtain a second reference skeleton sequence; performing three-dimensional coordinate transformation on the second reference skeleton sequence according to a preset rotation angle to obtain a third reference skeleton sequence; acquiring the coordinate change feature of each joint point according to the third reference skeleton sequence; fusing the obtained coordinate change features to obtain a feature sequence; and training the recognition model according to the feature sequence based on a preset model loss function.
The training method of the pre-constructed recognition model comprises the following steps: converting the coordinates of the human skeleton sequence used for training into a reference coordinate system to obtain a reference skeleton sequence; comparing the coordinates of each joint node of each reference skeleton in the reference skeleton sequence with the coordinates of a pre-specified central point to obtain the relative coordinates of each joint node of each reference skeleton; and performing three-dimensional coordinate transformation on the relative coordinates of each joint node, taking the transformed reference skeleton sequence as training data, and training the initial recognition model to obtain the optimized recognition model.
The preprocessing of the sample data also comprises converting the absolute coordinates of each skeleton in the human skeleton sequence into relative coordinates; that is, at each time step, the mean coordinate of that time step is subtracted from the coordinates of every key point of the skeleton to obtain the relative coordinates of each joint node.
Specifically, in the data preprocessing, if the human skeleton sequence is given in an image plane coordinate system and the camera parameters are known, the coordinate system conversion can be performed by calculating a camera transformation matrix; if the camera parameters are unknown, a third dimension with the value 1 is appended to the two-dimensional plane coordinates, and the resulting three-dimensional coordinates are rescaled so that the x, y and z values lie within a preset range, preferably [-3, 3].
The three-dimensional coordinate transformation may be performed on the second reference skeleton sequence by using a preset rotation transformation matrix to obtain a third reference skeleton sequence.
Further, in a preferred technical solution provided in this embodiment, before the step of subtracting the corresponding human skeleton coordinate mean value from the position coordinates of the joint points at each moment in the first reference skeleton sequence to obtain the second reference skeleton sequence, the method includes: acquiring the coordinates of a plurality of preset central points of the human skeleton; and calculating the coordinate mean value of the plurality of central points from the acquired coordinates. In this case, the subtraction step consists of subtracting the corresponding central point coordinate mean value from the position coordinates of the joint points at each moment in the first reference skeleton sequence to obtain the second reference skeleton sequence.
Specifically, the center point includes a center point of a left hip, a center point of a right hip, and a center point of a hip, or the center point includes a center point of a left shoulder, a center point of a right shoulder, and a center point of a chest.
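As an illustration of the centering step, the following minimal Python sketch subtracts, at each time step, the mean of a set of center joints from every joint. The joint indices, the function names `center_frame` and `center_sequence`, and the list-of-tuples data layout are assumptions made for this example and do not come from the patent.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Frame = List[Point]  # one skeleton: the (x, y, z) coordinates of every joint


def center_frame(frame: Frame, center_ids: Sequence[int]) -> Frame:
    """Subtract the mean of the chosen center joints from every joint."""
    k = len(center_ids)
    # Mean coordinate of the preset center points (e.g. left hip, right hip, hip).
    cx = sum(frame[i][0] for i in center_ids) / k
    cy = sum(frame[i][1] for i in center_ids) / k
    cz = sum(frame[i][2] for i in center_ids) / k
    return [(x - cx, y - cy, z - cz) for (x, y, z) in frame]


def center_sequence(seq: List[Frame], center_ids: Sequence[int]) -> List[Frame]:
    """Apply the centering independently at every time step."""
    return [center_frame(frame, center_ids) for frame in seq]


# Toy frame with 4 joints; joints 0..2 play the role of the three center points.
frame = [(1.0, 2.0, 3.0), (3.0, 2.0, 3.0), (2.0, 2.0, 3.0), (2.0, 5.0, 3.0)]
centered = center_sequence([frame], center_ids=[0, 1, 2])[0]
```

After centering, the coordinates describe the pose relative to the body center, so the absolute position of the person in the scene no longer affects the features.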
Further, in a preferred technical solution provided in this embodiment, the step of performing three-dimensional coordinate transformation on the second reference skeleton sequence according to a preset rotation angle to obtain a third reference skeleton sequence, that is, the step of "performing three-dimensional coordinate transformation on the relative coordinates of each joint node", includes performing three-dimensional coordinate transformation on each joint node by using the following transformation formula:
$$R = R_z(\gamma)\, R_y(\beta)\, R_x(\alpha) \tag{1}$$
where $R_x(\alpha)$, $R_y(\beta)$ and $R_z(\gamma)$ are the rotation matrices about the x, y and z coordinate axes, which take the form:
$$R_x(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix}$$
$$R_y(\beta) = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}$$
$$R_z(\gamma) = \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
In the above formulas, $R$ is the three-dimensional rotation transformation matrix, and $\alpha$, $\beta$ and $\gamma$ are the rotation angles about the x, y and z coordinate axes. The three-dimensional transformation is thus a pure rotation, and the matrix $R$ depends only on the three parameters $\alpha$, $\beta$ and $\gamma$. When $\alpha$, $\beta$ and $\gamma$ are all 0, $R$ is the identity matrix, which means that no coordinate transformation is performed. During training of the recognition model, the values of $\alpha$, $\beta$ and $\gamma$ are generated randomly, and the sampling range depends on the task; for example, for cross-view recognition, one may set $\alpha \in [-\pi/2, \pi/2]$, $\beta \in [-\pi/2, \pi/2]$ and $\gamma = 0$.
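The rotation R = Rz(γ)Ry(β)Rx(α) and the random sampling of angles described above can be sketched as follows. The sketch assumes the standard right-handed rotation matrices about the x, y and z axes; the helper names (`rotation`, `apply_rotation`, `random_view_rotation`) are hypothetical, and plain Python lists are used only to keep the example free of dependencies.

```python
import math
import random
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def rot_x(a: float) -> Matrix:
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(b: float) -> Matrix:
    c, s = math.cos(b), math.sin(b)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(g: float) -> Matrix:
    c, s = math.cos(g), math.sin(g)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation(alpha: float, beta: float, gamma: float) -> Matrix:
    """Compose R = Rz(gamma) Ry(beta) Rx(alpha)."""
    return matmul(rot_z(gamma), matmul(rot_y(beta), rot_x(alpha)))


def apply_rotation(r: Matrix, p: Sequence[float]) -> Tuple[float, ...]:
    """Rotate one joint coordinate p = (x, y, z) by R."""
    return tuple(sum(r[i][k] * p[k] for k in range(3)) for i in range(3))


def random_view_rotation(rng: random.Random) -> Matrix:
    """Sample alpha, beta in [-pi/2, pi/2] and fix gamma = 0 (cross-view setting)."""
    half_pi = math.pi / 2
    return rotation(rng.uniform(-half_pi, half_pi),
                    rng.uniform(-half_pi, half_pi),
                    0.0)
```

In a training loop one would presumably draw a single rotation per sequence and apply it to every joint of every frame, so that the whole motion is viewed from one consistent simulated viewpoint; that choice is an assumption of this sketch.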
Further, in a preferred embodiment, the step of "fusing the obtained coordinate change features to obtain a feature sequence" includes: concatenating the coordinate change features of the different joint points to obtain a feature sequence.
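The concatenation of joint coordinates into a per-frame feature vector can be sketched as below. This is a minimal illustration; the function names and the list-based layout are assumptions of the example rather than part of the patent.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Frame = List[Point]


def frame_to_feature(frame: Frame) -> List[float]:
    """Concatenate the (x, y, z) coordinates of all joints into one vector."""
    return [coord for joint in frame for coord in joint]


def sequence_to_features(seq: List[Frame]) -> List[List[float]]:
    """Build the feature sequence: one feature vector per time step."""
    return [frame_to_feature(frame) for frame in seq]


# Two time steps with two joints each give two feature vectors of length 6.
seq = [[(0.0, 1.0, 2.0), (3.0, 4.0, 5.0)],
       [(0.5, 1.5, 2.5), (3.5, 4.5, 5.5)]]
features = sequence_to_features(seq)
```

For a skeleton with J joints, each feature vector has length 3J, and the sequence length equals the number of frames.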
The motion features that the model learns from the coordinate-transformed feature sequence are then fused along the time dimension to obtain a single vector describing the motion, which is used as the input of the two fully connected layers in the network. This fusion can be implemented by max pooling or mean pooling.
Further, in a preferred embodiment, the preset model loss function is given by the following formula:
$$L = \lambda L^{(1)} + (1 - \lambda) L^{(2)} \tag{5}$$
where $\lambda$ is a preset weighting coefficient with $0 \le \lambda \le 1$, and $L^{(1)}$ and $L^{(2)}$ are the loss functions for behavior recognition and identity recognition, respectively, which can be expressed as:
Figure RE-GDA0001769284040000083
where
Figure BDA0001669904090000084
are the category labels of the behavior and identity of the n-th sample, and N is the total number of samples.
The step of training the recognition model according to the feature sequence based on a preset model loss function comprises: training the recognition model with the BPTT algorithm according to the third reference skeleton sequence. BPTT stands for Back-Propagation Through Time, a back-propagation algorithm for time sequences.
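To make the weighting in the joint loss concrete, the sketch below combines two per-task losses with the coefficient λ. Because the exact expressions of the two task losses appear only as formula images, the sketch assumes a mean cross-entropy (negative log-likelihood) for each task; that choice, and the names `cross_entropy` and `joint_loss`, are assumptions of this example.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean negative log-probability of the true class (assumed task loss)."""
    n = len(labels)
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / n


def joint_loss(behavior_probs: Sequence[Sequence[float]],
               behavior_labels: Sequence[int],
               identity_probs: Sequence[Sequence[float]],
               identity_labels: Sequence[int],
               lam: float) -> float:
    """L = lam * L1 + (1 - lam) * L2 with 0 <= lam <= 1."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    l1 = cross_entropy(behavior_probs, behavior_labels)  # behavior recognition
    l2 = cross_entropy(identity_probs, identity_labels)  # identity recognition
    return lam * l1 + (1.0 - lam) * l2
```

Setting λ = 1 trains only the behavior branch and λ = 0 only the identity branch; intermediate values trade the two tasks off against each other.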
Further, in a preferred technical solution of this embodiment, the deep recurrent neural network is a multi-layer bidirectional recurrent neural network or a unidirectional recurrent neural network; the multi-layer bidirectional recurrent neural network comprises a plurality of long short-term memory (LSTM) networks.
In some optional implementations of the present embodiment, the recognition model is constructed based on a deep recurrent neural network. The recognition model may employ a multi-layer bidirectional recurrent neural network, in which the recurrent units may be Long Short-Term Memory (LSTM) networks.
Referring to FIG. 2, FIG. 2 illustrates the network structure of a neuron of the recognition model in the present embodiment. As shown in FIG. 2, given an input sequence $\{x_t\}$, the output sequence of the long short-term memory network is $\{h_t\}$, and the iterative process of the long short-term memory network is as follows:
$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \tag{7}$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f) \tag{8}$$
$$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{9}$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) \tag{10}$$
$$h_t = o_t \tanh(c_t) \tag{11}$$
where $i_t$, $f_t$, $o_t$ and $c_t$ respectively denote the states of the input gate, the forget gate, the output gate and the memory cell at time t, and $W$ and $b$ respectively denote the connection weights and the bias vectors.
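One iteration of equations (7) to (11) can be written out as the following dependency-free Python sketch. The dictionary keys (`"xi"`, `"hi"`, `"ci"`, and so on) mirror the subscripts of the weights in the equations; treating the cell-to-gate weights as full matrices, and the function name `lstm_step`, are assumptions of this example.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, v: Vector) -> Vector:
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def add(*vs: Vector) -> Vector:
    return [sum(xs) for xs in zip(*vs)]


def mul(a: Vector, b: Vector) -> Vector:
    """Element-wise product."""
    return [x * y for x, y in zip(a, b)]


def sigmoid(v: Vector) -> Vector:
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def tanh(v: Vector) -> Vector:
    return [math.tanh(x) for x in v]


def lstm_step(x: Vector, h_prev: Vector, c_prev: Vector,
              W: Dict[str, Matrix], b: Dict[str, Vector]) -> Tuple[Vector, Vector]:
    """One LSTM iteration with cell-to-gate connections, eqs. (7)-(11)."""
    i = sigmoid(add(matvec(W["xi"], x), matvec(W["hi"], h_prev),
                    matvec(W["ci"], c_prev), b["i"]))            # (7) input gate
    f = sigmoid(add(matvec(W["xf"], x), matvec(W["hf"], h_prev),
                    matvec(W["cf"], c_prev), b["f"]))            # (8) forget gate
    g = tanh(add(matvec(W["xc"], x), matvec(W["hc"], h_prev), b["c"]))
    c = add(mul(f, c_prev), mul(i, g))                           # (9) memory cell
    o = sigmoid(add(matvec(W["xo"], x), matvec(W["ho"], h_prev),
                    matvec(W["co"], c), b["o"]))                 # (10) output gate
    h = mul(o, tanh(c))                                          # (11) output
    return h, c
```

Note that the output gate in (10) reads the updated cell state c_t, while the input and forget gates read c_{t-1}; the sketch follows that ordering by computing `c` before `o`.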
Referring to FIG. 3, FIG. 3 illustrates the main structure of the bidirectional recurrent neural network of the recognition model in the present embodiment. As shown in FIG. 3, for an input human skeleton sequence, the network has two hidden layers, a forward layer and a backward layer, which learn the variation characteristics of the input sequence in the two opposite temporal directions. The output of the bidirectional recurrent neural network at each time step is the concatenation of the outputs of the forward layer and the backward layer at that time step, and these concatenated outputs form a new time sequence.
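The forward and backward passes and their per-time-step concatenation can be sketched generically as follows. The recurrent cell is abstracted as a `step(x, state)` function; the toy `running_sum` cell merely stands in for an LSTM, and all names here are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]
# A recurrent cell: (input, state) -> (output, new_state).
Step = Callable[[Vector, Vector], Tuple[Vector, Vector]]


def run_layer(seq: List[Vector], step: Step, state: Vector) -> List[Vector]:
    """Run one recurrent layer over the sequence in the given order."""
    outputs = []
    for x in seq:
        out, state = step(x, state)
        outputs.append(out)
    return outputs


def bidirectional(seq: List[Vector], fwd_step: Step, bwd_step: Step,
                  fwd_state: Vector, bwd_state: Vector) -> List[Vector]:
    """Concatenate forward and backward outputs belonging to the same time step."""
    fwd = run_layer(seq, fwd_step, fwd_state)
    # Process the reversed sequence, then restore the original time order.
    bwd = run_layer(seq[::-1], bwd_step, bwd_state)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


def running_sum(x: Vector, state: Vector) -> Tuple[Vector, Vector]:
    """Toy cell standing in for an LSTM: the state accumulates the inputs."""
    new_state = [s + xi for s, xi in zip(state, x)]
    return new_state, new_state


outputs = bidirectional([[1.0], [2.0], [3.0]], running_sum, running_sum, [0.0], [0.0])
```

With the toy cell, the forward half of each output summarizes the past and the backward half summarizes the future, which is the property the bidirectional structure is meant to provide.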
In some optional implementation manners of this embodiment, the full connection layer in the network structure of the recognition model includes a first full connection layer and a second full connection layer, where the first full connection layer is configured to predict, according to the human skeleton sequence, a probability of each preset behavior category, so as to recognize a human action behavior, and the second full connection layer is configured to predict, according to the human skeleton sequence, a probability of each preset identity category, so as to recognize a human identity.
Here, the fully connected part used for classification comprises two fully connected layers. Before classification, the features learned by the deep recurrent neural network need to be fused along the time dimension to obtain a single representation of the sequence. The fusion uses either max pooling or mean pooling. Denote the network outputs by $\{o_t \mid t \in \{1, 2, \ldots, T\}\}$, where T is the sequence length; the max-pooled output is $\max_t \{o_t\}$ and the mean-pooled output is $\sum_t o_t / T$.
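The two temporal fusion options can be illustrated with a short sketch. It assumes the network output is stored as a list of T feature vectors of equal length and that pooling is applied per feature dimension; the function names are hypothetical.

```python
def max_pool(outputs):
    """Max pooling over time: per feature dimension, keep the largest value across all t."""
    return [max(column) for column in zip(*outputs)]

def mean_pool(outputs):
    """Mean pooling over time: per feature dimension, average the values across all t."""
    T = len(outputs)
    return [sum(column) / T for column in zip(*outputs)]

# Two time steps (T = 2), two features per step.
sequence_outputs = [[1.0, 4.0], [3.0, 2.0]]
fused_max = max_pool(sequence_outputs)
fused_mean = mean_pool(sequence_outputs)
```

Either function turns a variable-length sequence into one fixed-length vector, which is what the two fully connected layers take as input.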
The number of hidden-layer nodes of the first fully connected layer equals the number of behaviors to be recognized, and the behavior category of the input sequence is determined by the maximum of the following class-membership probabilities given by the activation function:
$p_i = \frac{e^{a_i}}{\sum_{k=1}^{m} e^{a_k}}$
where $a_i$ is the output of the fully connected layer, m is the number of behavior categories, and $p_i$ is the predicted probability of the i-th behavior category.
The number of hidden-layer nodes of the second fully connected layer equals the number of identities to be recognized, and the identity category of the input sequence is determined by the maximum of the following class-membership probabilities given by the activation function:
$q_j = \frac{e^{b_j}}{\sum_{k=1}^{n} e^{b_k}}$
where $b_j$ is the output of the fully connected layer, n is the number of identity categories, and $q_j$ is the predicted probability of the j-th identity category.
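A minimal sketch of the two classification heads follows. It assumes that the activation function is the standard softmax and that the fully connected outputs a (behavior) and b (identity) are already available as plain lists of scores; `softmax`, `classify` and `joint_predict` are hypothetical names.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtracting the maximum avoids overflow and leaves the result unchanged
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Return (index of the most probable class, all class probabilities)."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

def joint_predict(a, b):
    """Predict behavior from scores a and identity from scores b, each head independently."""
    behavior, p = classify(a)
    identity, q = classify(b)
    return behavior, identity, p, q

behavior, identity, p, q = joint_predict([1.0, 3.0, 2.0], [0.5, 0.1])
```

Both heads read the same pooled sequence representation, so a single forward pass yields a behavior label and an identity label together.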
It can be understood that the behaviors to be recognized may be preset, and the number of behavior categories may be determined by an actual task; wherein each action corresponds to a behavior class. The information of the identities can be preset, and the number of the identity categories can be determined by the number of the human bodies to be identified in the actual task; wherein each person corresponds to an identity class.
By way of example, fig. 4 is a schematic diagram of using the recognition model of this embodiment to recognize the behavior and identity information of the human body corresponding to a human skeleton sequence. As shown in fig. 4, after the human skeleton sequence is input into the recognition model, the behavior and the identity of the human body are recognized. The recognition model jointly recognizes the behavior and the identity information of the human body through data preprocessing, three-dimensional coordinate transformation, the deep recurrent neural network and classification prediction. Here, 60 behavior categories and 40 identity categories are preset, so that, from a human skeleton sequence, the recognition model can distinguish 60 different behaviors and 40 different persons.
The present invention also provides a storage device carrying one or more programs adapted to be loaded and executed by a processor, which when executed by the device is operable to carry out any of the methods of the embodiments described above.
The invention also provides a processing device comprising a processor adapted to execute various programs; and a storage device adapted to store a plurality of programs; wherein the program is adapted to be loaded and executed by a processor to implement any of the methods in the above embodiments.
The method provided by the embodiment of the invention recognizes a human skeleton sequence with a pre-established recognition model and thereby identifies the behavior and the identity information of the human body. In the invention, the fully connected layers of the recognition model comprise a fully connected layer for identity recognition and a fully connected layer for behavior recognition, and the features learned by the recurrent neural network of the recognition model are fused along the time dimension. The recognition model can therefore simultaneously predict the probability of each behavior category and the probability of each identity category for the human skeleton sequence, and the identity category and the behavior category of the human body are judged according to the predicted probabilities. As a result, the method provided by the invention can quickly and accurately identify the identity information and the behavior of the human body corresponding to the human skeleton sequence.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Without departing from the principle of the invention, one skilled in the art can make equivalent changes or substitutions on the related technical features, and the technical solutions after the changes or substitutions will fall within the protection scope of the invention.

Claims (9)

1. A behavior and identity joint identification method based on a human skeleton sequence is characterized by comprising the following steps:
performing coordinate conversion on a preset human body skeleton sequence training sample based on a preset reference coordinate system to obtain a first reference skeleton sequence;
acquiring position coordinates of a preset human body central point at each moment corresponding to the first reference skeleton sequence;
acquiring coordinates of a plurality of preset central points of the human skeleton;
calculating a coordinate mean value of a plurality of central points according to the acquired coordinates;
subtracting the coordinate mean value of the corresponding central points from the position coordinate of each joint point at each moment in the first reference skeleton sequence to obtain a second reference skeleton sequence;
performing three-dimensional coordinate transformation on the second reference skeleton sequence according to a preset rotation angle to obtain a third reference skeleton sequence;
acquiring the coordinate change characteristic of each joint point according to the third reference skeleton sequence;
fusing the obtained coordinate change characteristics to obtain a characteristic sequence;
performing model training on the recognition model according to the characteristic sequence based on a preset model loss function;
acquiring a human body skeleton sequence of a human body to be identified;
predicting the probability of each preset identity category and the probability of each preset behavior category according to the recognition model and the human body skeleton sequence;
judging the identity type of the human body to be recognized according to the predicted probability of the identity type; judging the behavior category of the human body to be recognized according to the predicted probability of the behavior category;
the identification model is an identity class and behavior class probability prediction model constructed based on a deep recurrent neural network.
2. The behavior and identity joint identification method based on the human body skeleton sequence according to claim 1, wherein the step of performing three-dimensional coordinate transformation on the second reference skeleton sequence according to a preset rotation angle to obtain a third reference skeleton sequence comprises:
performing three-dimensional coordinate transformation on each joint point by using the following transformation formula:
$R = R_z(\gamma) R_y(\beta) R_x(\alpha)$
wherein R is the three-dimensional rotation transformation matrix, and $R_x(\alpha)$, $R_y(\beta)$ and $R_z(\gamma)$ are the rotation matrices about the x, y and z coordinate axes, respectively, of the following form:
Figure FDA0002541427640000021
Figure FDA0002541427640000022
Figure FDA0002541427640000023
and $\alpha$, $\beta$ and $\gamma$ are the rotation angles about the x, y and z coordinate axes, respectively.
3. The behavior and identity joint recognition method based on the human body skeleton sequence according to claim 1, wherein the step of fusing the obtained coordinate change features to obtain the feature sequence comprises: and connecting the coordinates of the joint points at each moment after the coordinate transformation into a feature vector to obtain a feature sequence.
4. The human skeleton sequence-based behavior and identity joint recognition method according to any one of claims 1-3, wherein the model loss function is represented by the following formula:
$L = \lambda L^{(1)} + (1-\lambda) L^{(2)}$
wherein $\lambda$ is a preset weighting coefficient with $0 \le \lambda \le 1$, and $L^{(1)}$ and $L^{(2)}$ are the loss functions corresponding to behavior recognition and identity recognition, respectively:
Figure FDA0002541427640000024
wherein (Figure FDA0002541427640000025) are the category labels of the behavior and the identity of the n-th sample, and N is the total number of samples; (Figure FDA0002541427640000026) is the predicted probability corresponding to the behavior category (Figure FDA0002541427640000027), and (Figure FDA0002541427640000028) is the predicted probability corresponding to the identity category (Figure FDA0002541427640000029);
the step of performing model training on the recognition model according to the feature sequence based on a preset model loss function comprises: performing model training on the recognition model according to the third reference skeleton sequence by using the backpropagation through time (BPTT) algorithm.
5. The human skeletal sequence-based behavior and identity joint recognition method according to any one of claims 1 to 3, wherein the central point comprises a central point of a left hip, a central point of a right hip and a central point of a hip, or the central points comprise a central point of a left shoulder, a central point of a right shoulder and a central point of a chest.
6. The human skeleton sequence-based behavior and identity joint recognition method according to any one of claims 1-3, wherein the deep recurrent neural network is a multi-layer bidirectional recurrent neural network or a unidirectional recurrent neural network; the multi-layer bidirectional recurrent neural network comprises a plurality of long short-term memory (LSTM) networks.
7. The human skeleton sequence-based behavior and identity joint recognition method according to any one of claims 1 to 3, wherein fully connected layers in a network structure of the recognition model comprise a first fully connected layer and a second fully connected layer;
the first fully connected layer is used for predicting the probability of each preset behavior category according to the human body skeleton sequence;
the second fully connected layer is used for predicting the probability of each preset identity category according to the human body skeleton sequence.
8. A storage device having a plurality of programs stored therein, wherein the programs are adapted to be loaded and executed by a processor to implement the method for joint human skeletal sequence based behavior and identity recognition according to any of claims 1 to 7.
9. A processing apparatus, comprising:
a processor adapted to execute various programs; and
a storage device adapted to store a plurality of programs;
wherein the program is adapted to be loaded and executed by a processor to perform:
the behavior and identity joint identification method based on the human body skeleton sequence according to any one of claims 1 to 7.
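For illustration only, the preprocessing and loss steps recited in claims 1, 2 and 4 can be sketched as below. The sketch makes several assumptions that are not confirmed by the text: a frame is a list of [x, y, z] joint coordinates, the rotation matrices use the common right-handed convention (the matrix images are not reproduced here), and each per-task loss is an averaged negative log-likelihood of the true label. All function names are hypothetical.

```python
import math

def center_frame(joints, center_ids):
    """Subtract the mean of the chosen central points from every joint of one frame."""
    n = len(center_ids)
    mean = [sum(joints[i][k] for i in center_ids) / n for k in range(3)]
    return [[p[k] - mean[k] for k in range(3)] for p in joints]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def rotation_matrix(alpha, beta, gamma):
    """R = Rz(gamma) Ry(beta) Rx(alpha); the sign convention is an assumption."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    Rx = [[1, 0, 0], [0, ca, -sa], [0, sa, ca]]
    Ry = [[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]]
    Rz = [[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]]
    return matmul(Rz, matmul(Ry, Rx))

def rotate_frame(joints, R):
    """Apply rotation R to every joint of one frame."""
    return [[sum(R[i][k] * p[k] for k in range(3)) for i in range(3)] for p in joints]

def joint_loss(p_behavior, q_identity, lam):
    """L = lam * L1 + (1 - lam) * L2.

    p_behavior[n] and q_identity[n] are the probabilities the model assigns to the
    true behavior and identity labels of sample n (assumed cross-entropy form).
    """
    N = len(p_behavior)
    L1 = -sum(math.log(p) for p in p_behavior) / N
    L2 = -sum(math.log(q) for q in q_identity) / N
    return lam * L1 + (1 - lam) * L2
```

Setting lam to 1 trains for behavior only and setting it to 0 trains for identity only; intermediate values trade the two tasks off against each other.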
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810499463.5A CN108764107B (en) 2018-05-23 2018-05-23 Behavior and identity combined identification method and device based on human body skeleton sequence

Publications (2)

Publication Number Publication Date
CN108764107A CN108764107A (en) 2018-11-06
CN108764107B true CN108764107B (en) 2020-09-11

Family

ID=64005031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810499463.5A Active CN108764107B (en) 2018-05-23 2018-05-23 Behavior and identity combined identification method and device based on human body skeleton sequence

Country Status (1)

Country Link
CN (1) CN108764107B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382306B (en) * 2018-12-28 2023-12-01 杭州海康威视数字技术股份有限公司 Method and device for inquiring video frame
CN109902729B (en) * 2019-02-18 2020-10-16 清华大学 Behavior prediction method and device based on sequence state evolution
CN110197116B (en) * 2019-04-15 2023-05-23 深圳大学 Human behavior recognition method, device and computer readable storage medium
CN110070029B (en) * 2019-04-17 2021-07-16 北京易达图灵科技有限公司 Gait recognition method and device
CN110363131B (en) * 2019-07-08 2021-10-15 上海交通大学 Abnormal behavior detection method, system and medium based on human skeleton
CN110717381A (en) * 2019-08-28 2020-01-21 北京航空航天大学 Human intention understanding method facing human-computer cooperation and based on deep stacking Bi-LSTM
CN111079535B (en) * 2019-11-18 2022-09-16 华中科技大学 Human skeleton action recognition method and device and terminal
CN111274937B (en) * 2020-01-19 2023-04-28 中移(杭州)信息技术有限公司 Tumble detection method, tumble detection device, electronic equipment and computer-readable storage medium
CN113269008B (en) * 2020-02-14 2023-06-30 宁波吉利汽车研究开发有限公司 Pedestrian track prediction method and device, electronic equipment and storage medium
CN111353447B (en) * 2020-03-05 2023-07-04 辽宁石油化工大学 Human skeleton behavior recognition method based on graph convolution network
CN111783711B (en) * 2020-07-09 2022-11-08 中国科学院自动化研究所 Skeleton behavior identification method and device based on body component layer
CN112966628A (en) * 2021-03-17 2021-06-15 广东工业大学 Visual angle self-adaptive multi-target tumble detection method based on graph convolution neural network
US11854305B2 (en) 2021-05-09 2023-12-26 International Business Machines Corporation Skeleton-based action recognition using bi-directional spatial-temporal transformer
CN113239819B (en) * 2021-05-18 2022-05-03 西安电子科技大学广州研究院 Visual angle normalization-based skeleton behavior identification method, device and equipment
CN113688790A (en) * 2021-09-22 2021-11-23 武汉工程大学 Human body action early warning method and system based on image recognition

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103729614A (en) * 2012-10-16 2014-04-16 上海唐里信息技术有限公司 People recognition method and device based on video images
US8929600B2 (en) * 2012-12-19 2015-01-06 Microsoft Corporation Action recognition based on depth maps
US20160042227A1 (en) * 2014-08-06 2016-02-11 BAE Systems Information and Electronic Systems Integraton Inc. System and method for determining view invariant spatial-temporal descriptors for motion detection and analysis
CN107301370B (en) * 2017-05-08 2020-10-16 上海大学 Kinect three-dimensional skeleton model-based limb action identification method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
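A new identity recognition method based on human infrared radiation; 薛召军 et al.; 《天津职业技术师范大学学报》 (Journal of Tianjin University of Technology and Education); 2012-03-31; Vol. 22, No. 1; pp. 1-5 *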

Also Published As

Publication number Publication date
CN108764107A (en) 2018-11-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant