CN112749585A - Skeleton action identification method based on graph convolution - Google Patents

Skeleton action identification method based on graph convolution


Publication number
CN112749585A
CN112749585A (application CN201911041763.XA)
Authority
CN
China
Prior art keywords
skeleton
graph
convolution
human body
component combination
Prior art date
Legal status
Withdrawn
Application number
CN201911041763.XA
Other languages
Chinese (zh)
Inventor
崔振 (Zhen Cui)
刘蓉 (Rong Liu)
许春燕 (Chunyan Xu)
张桐 (Tong Zhang)
杨健 (Jian Yang)
Current Assignee
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN201911041763.XA
Publication of CN112749585A
Status: Withdrawn

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods


Abstract

The invention discloses a graph convolution-based skeleton action recognition method whose basic unit is a space-time graph convolution module. The module operates as follows: a skeleton video is acquired and a skeleton graph is constructed from each frame; human body part combinations of different scales are defined on the skeleton graph and a joint point relationship graph is built for each combination, yielding a multi-dimensional relationship interaction graph with a part combination interaction dimension and a joint point interaction dimension; graph convolutions are applied to the multi-dimensional interaction graph in the joint point interaction dimension and in the part combination interaction dimension; the spatial features obtained from the two graph convolutions are then fed into a local convolution network over time slices to obtain temporal dynamic features. The network model stacks multiple space-time graph convolution modules to build a neural network, and a softmax classifier performs the classification.

Description

Skeleton action identification method based on graph convolution
Technical Field
The invention belongs to the field of motion recognition, and particularly relates to a graph convolution-based skeleton action recognition method.
Background
Human motion recognition is a popular research direction in the field of computer vision. Its main purpose is to correctly classify the human actions in a video. The technology can be applied to intelligent video surveillance, natural human-machine interaction, sports video analysis, autonomous driving, and other fields. With the development of hardware, multi-modal human motion data including RGB, depth, and infrared data can be collected easily. Skeleton videos obtained from depth data are robust to changes in appearance, lighting, and surroundings, and are therefore highly valuable as input data for motion recognition.
Deep learning is an important means of skeleton-based motion recognition. Yan et al. proposed the spatial-temporal graph convolutional network (ST-GCN) to adaptively learn the spatial and temporal patterns of human motion from skeleton video samples. Li et al. combined graph local convolution filtering with recursive learning in a spatio-temporal graph convolution (STGC) method that recursively performs multi-scale local graph convolution on the skeleton graph. However, these existing skeleton-based methods generally represent human motion with joint positions or sequence information alone, and do not adequately account for local/global information or for the correspondence between specific actions and human body parts.
Disclosure of Invention
The invention aims to provide a method for recognizing skeleton video actions based on graph convolution.
The technical solution realizing the purpose of the invention is as follows: a graph convolution-based skeleton action recognition method, comprising the following steps:
Step 1, obtaining a skeleton video and constructing a graph sequence based on the skeleton sequence;
Step 2, constructing a joint point interaction graph representing each human body part combination according to human body part combinations of different scales;
Step 3, constructing a part combination interaction graph representing the overall structure of the human body, with each part combination as a node;
Step 4, performing a K-order graph convolution on the part combinations of each frame in the joint point interaction dimension to obtain the corresponding part combination features;
Step 5, performing a K-order graph convolution on the part combinations of each frame in the part combination interaction dimension to obtain the corresponding spatial features;
Step 6, stacking the spatial features of all frames along the time axis and performing a time-dimension convolution to obtain temporal dynamic features;
Step 7, constructing a joint point interaction graph and a part combination interaction graph for the temporal dynamic features by the method of steps 2-6, computing the corresponding part combination features and spatial features, and updating the temporal dynamic features to obtain the representation feature vector of the skeleton video;
Step 8, classifying the representation feature vector of the skeleton video with a softmax classifier to complete the action recognition.
In step 1, the following substeps are included:
(1.1) For the t-th frame of the skeleton video, a skeleton graph $S_t = \{V_t, E_t\}$ is constructed based on the natural connections of the human joints, where $V_t$ denotes all nodes in the graph, consisting of the joint points of the human skeleton, and $E_t$ denotes all edges in the graph. The skeleton graph $S_t$ is undirected: if a bone connection exists between two joint points, an edge exists between the two nodes; otherwise, no edge exists between the two nodes.
(1.2) For a skeleton sequence of length T frames, the corresponding graph sequence $\{S_1, S_2, \dots, S_T\}$ is constructed.
In step 2, the following substeps are included:
(2.1) For each skeleton graph $S_t$, limb parts with salient human motion characteristics are defined as different human body part combinations; in particular, the four limbs are taken as the four basic part combinations.
(2.2) The four basic combinations are pairwise combined to construct six combinations of a higher first-order scale: right hand and right leg, left hand and left leg, right hand and left leg, left hand and right leg, the upper body, and the lower body; body parts with relatively small motion amplitude are assigned to the closest of these combinations.
(2.3) The whole-body skeleton forms the part combination of the highest scale: the whole body.
(2.4) For each part combination, each joint point is taken as a node of the graph and the edges of the graph are established according to the natural connections of the human body, yielding the joint point interaction graph.
(2.5) According to spectral graph theory, the Laplacian matrix of the joint point interaction graph, i.e., the joint point adjacency relation matrix $L_{t,i}$, is obtained.
In step 3, the following substeps are included:
(3.1) For the t-th frame, the h constructed human body part combinations are taken as nodes; each node randomly selects several other nodes to connect with as edges, constructing the part combination interaction graph.
(3.2) According to spectral graph theory, the Laplacian matrix of the part combination interaction graph, i.e., the part combination adjacency relation matrix $L^{(p)}_t$, is obtained.
In step 4, the graph convolution in the joint point interaction dimension is:

$$Y_{t,i} = \sum_{k=0}^{K_1} W_{ik}\, T_k\big(\tilde{L}_{t,i}\big)\, X_{t,i}, \qquad \tilde{L}_{t,i} = \frac{2}{\lambda_{\max}}\, L_{t,i} - I$$

where $X_{t,i}$ denotes the features of part combination $S_{t,i}$, specifically the 3-dimensional coordinates of all joint points inside the combination, provided by the acquired skeleton video; $L_{t,i}$ is the joint point adjacency relation matrix and $\lambda_{\max}$ is its largest eigenvalue; $T_k(\tilde{L}_{t,i})$ is the Chebyshev polynomial expansion of the matrix $\tilde{L}_{t,i}$; $Y_{t,i}$ is the graph convolution response of part combination $S_{t,i}$ in the joint interaction dimension; $K_1$ means that nodes and edges within the $K_1$-neighborhood of a convolved node participate in the convolution operation; and $W_{ik}$ are the model parameters of the graph convolution.
In step 5, the graph convolution in the part combination interaction dimension is:

$$Z_t = \sum_{k=0}^{K_2} W_k\, T_k\big(\tilde{L}^{(p)}_t\big)\, Y_t, \qquad \tilde{L}^{(p)}_t = \frac{2}{\lambda_{\max}}\, L^{(p)}_t - I$$

where $Y_t$ denotes the input features of the part combination interaction-dimension graph convolution, i.e., the part combination features computed by the joint point interaction-dimension graph convolution; $L^{(p)}_t$ is the part combination adjacency relation matrix and $\lambda_{\max}$ is its largest eigenvalue; $T_k(\tilde{L}^{(p)}_t)$ is the Chebyshev polynomial expansion of $\tilde{L}^{(p)}_t$; $W_k$ are the model parameters of the k-order convolution; $Z_t$ is the spatial graph convolution response of the t-th frame of the skeleton video; and $K_2$ means that nodes and edges within the $K_2$-neighborhood of a convolved node participate in the convolution operation.
In step 6, the time-dimension convolution over the stacked spatial features is:

$$Y = L * f$$

where $L$ (here denoting the stacked feature tensor rather than a Laplacian) is the 3-D tensor feature matrix obtained by stacking the spatial features of all frames, $f$ is a $1 \times 9$ convolution kernel with window size 9, and $Y$ is the result of local convolution filtering performed only in the time dimension.
In step 7, the temporal dynamic features are repeatedly updated to obtain a higher-dimensional temporal dynamic feature representation of the skeleton video.
In step 8, the specific method for classifying the representation feature vectors of the skeleton video is as follows:
Pooling and fully connected operations are applied to the representation feature vector of the skeleton video to reduce the feature dimensionality; the softmax classifier computes the classification probability of the skeleton video for each action category, and the category with the largest probability is selected as the skeleton action.
Compared with the prior art, the invention has the following notable advantages: 1) following the natural division of the human body, a multi-dimensional relationship interaction graph representing the overall structure of the human body is constructed for the input human skeleton, which comprehensively describes the global interaction relations contained in the motion between part combinations and between joint points; 2) representations of human skeleton dynamics at different levels are learned through the space-time graph convolution network framework, which captures both the overall structural relations of the human body at a single moment and the dynamic changes of the skeleton in the time domain, improving the accuracy of skeleton action recognition.
Drawings
Fig. 1 is a schematic flow chart of the graph convolution-based skeleton action recognition method of the invention.
FIG. 2 is a diagram of the human body part combinations and their joint connection relationships.
FIG. 3 is a diagram of the adjacency relations, in the part combination interaction dimension, of the multi-dimensional relationship interaction graph constructed by the invention.
Detailed Description
The invention is further illustrated by the following examples in conjunction with the accompanying drawings.
The invention provides a graph convolution-based skeleton action recognition method for the single-video, single-label scenario, which, as shown in FIG. 1, specifically comprises the following steps:
Step 1, obtaining a skeleton video and constructing a graph sequence based on the skeleton sequence, comprising the following substeps:
(1.1) For the t-th frame of the skeleton video, a skeleton graph $S_t = \{V_t, E_t\}$ is constructed based on the natural connections of the human joints, where $V_t$ denotes all nodes in the graph, consisting of the joint points of the human skeleton, and $E_t$ denotes all edges in the graph. The skeleton graph $S_t$ is undirected: if a bone connection exists between two joint points, an edge exists between the two nodes; otherwise, no edge exists between the two nodes.
(1.2) For a skeleton sequence of length T frames, the corresponding graph sequence $\{S_1, S_2, \dots, S_T\}$ is constructed.
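As an illustration of this construction, the sketch below builds per-frame adjacency matrices in Python/NumPy. The five-joint edge list is a hypothetical toy skeleton used only for demonstration; a real capture (e.g., the 25-joint NTU RGB+D layout) would substitute its own bone list.

```python
import numpy as np

# Hypothetical toy skeleton: 5 joints, edges follow natural bone connections.
TOY_EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]

def skeleton_adjacency(num_joints, edges):
    """Undirected adjacency matrix of one frame's skeleton graph S_t."""
    A = np.zeros((num_joints, num_joints))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0  # an edge exists iff a bone connects the joints
    return A

# Graph sequence for a T-frame clip: the topology is shared by all frames.
T = 300
graph_sequence = [skeleton_adjacency(5, TOY_EDGES) for _ in range(T)]
```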
Step 2, defining h human body part combinations of different scales for the skeleton graph constructed in step 1, and constructing a joint point interaction graph for each part combination, where h = 11, comprising the following substeps:
(2.1) For each skeleton graph $S_t$, limb parts with salient human motion characteristics are defined as different human body part combinations; in particular, the four limbs are taken as the four basic part combinations.
(2.2) Based on the four basic combinations, and guided by intuitive human understanding of the skeleton, a multi-scale division of the human body is proposed. First, the four basic (lowest-scale) combinations constructed in (2.1) are pairwise combined, yielding six combinations of a relatively higher first-order scale: right hand and right leg, left hand and left leg, right hand and left leg, left hand and right leg, the upper body, and the lower body; during this division, body parts with relatively small motion amplitude (such as the neck and trunk) are assigned to the closest of these combinations. The whole-body skeleton then forms the part combination of the highest scale: the whole body. The concrete construction results and the assignment of the neck, trunk, etc. are shown in FIG. 2.
(2.3) For each part combination, a joint point interaction graph is constructed: each joint point serves as a node of the graph, and each node establishes edges according to the natural connections of the human body. According to spectral graph theory, a Laplacian matrix $L_{t,i}$ can be constructed from the connecting edges between the joint points, representing the adjacency relations among all joint points within the part combination.
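A minimal sketch of this Laplacian computation for one part combination follows, reusing `skeleton_adjacency` and `TOY_EDGES` from the sketch above. The patent only specifies "the Laplacian matrix according to spectral graph theory"; the normalized form and the joint subset chosen here are illustrative assumptions.

```python
import numpy as np

def laplacian(A, normalized=True):
    """Graph Laplacian of an adjacency matrix (spectral graph theory).
    The normalized variant is an assumption; the text just says 'Laplacian'."""
    d = A.sum(axis=1)
    if not normalized:
        return np.diag(d) - A
    d_inv_sqrt = np.zeros_like(d)
    nz = d > 0
    d_inv_sqrt[nz] = d[nz] ** -0.5
    D = np.diag(d_inv_sqrt)
    return np.eye(len(A)) - D @ A @ D

# Hypothetical part combination: joints 1, 3, 4 of the toy skeleton.
part_joints = [1, 3, 4]
A_part = skeleton_adjacency(5, TOY_EDGES)[np.ix_(part_joints, part_joints)]
L_part = laplacian(A_part)   # joint point adjacency relation matrix L_{t,i}
```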
Step 3, based on the multi-scale part combinations divided in step 2, further constructing a part combination interaction graph representing the overall structure of the human body, with each part combination as a node, comprising the following substeps:
(3.1) For the t-th frame, the h part combinations constructed in step 2 are taken as nodes, and each node randomly selects other nodes to connect with as edges, constructing the part combination interaction graph; together with the joint point interaction graphs, it forms the multi-dimensional relationship interaction graph. In the invention, each node randomly selects 5 other nodes to connect with as edges.
(3.2) According to spectral graph theory, the Laplacian matrix $L^{(p)}_t$ of the part combination interaction graph is computed from the connecting edges among the part combination nodes of (3.1); it represents the adjacency relations of the nodes of the multi-dimensional relationship interaction graph in the part combination interaction dimension.
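The random-neighbor construction of (3.1) can be sketched as follows; h = 11 and the 5 random connections per node follow the description, while the seed and the symmetrization are illustrative choices (the helper `laplacian` is reused from the earlier sketch).

```python
import numpy as np

def part_combination_adjacency(h=11, n_neighbors=5, seed=0):
    """Adjacency of the part combination interaction graph: each of the h
    combination nodes randomly connects to n_neighbors other nodes."""
    rng = np.random.default_rng(seed)
    A = np.zeros((h, h))
    for i in range(h):
        others = [j for j in range(h) if j != i]
        for j in rng.choice(others, size=n_neighbors, replace=False):
            A[i, j] = A[j, i] = 1.0  # undirected edge
    return A

L_parts = laplacian(part_combination_adjacency())  # L^{(p)}_t
```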
Step 4, performing a $K_1$-order graph convolution on the h part combinations of each frame in the joint point interaction dimension to obtain the corresponding part combination features:

$$Y_{t,i} = \sum_{k=0}^{K_1} W_{ik}\, T_k\big(\tilde{L}_{t,i}\big)\, X_{t,i}, \qquad \tilde{L}_{t,i} = \frac{2}{\lambda_{\max}}\, L_{t,i} - I$$

where $X_{t,i}$ denotes the features of part combination $S_{t,i}$, specifically the 3-dimensional coordinates of all joint points inside the combination, provided by the acquired skeleton video; $L_{t,i}$ is the joint point adjacency relation matrix and $\lambda_{\max}$ is its largest eigenvalue; $T_k(\tilde{L}_{t,i})$ is the Chebyshev polynomial expansion of the matrix $\tilde{L}_{t,i}$; $Y_{t,i}$ is the graph convolution response of part combination $S_{t,i}$ in the joint interaction dimension; $K_1$ means that nodes and edges within the $K_1$-neighborhood of a convolved node participate in the convolution operation; and $W_{ik}$ are the model parameters of the graph convolution.
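This Chebyshev-polynomial convolution can be sketched generically as below. The recurrence $T_k(\tilde{L}) = 2\tilde{L}\,T_{k-1}(\tilde{L}) - T_{k-2}(\tilde{L})$ avoids forming the polynomials explicitly; the feature and weight shapes are hypothetical stand-ins, and `L_part` is reused from the earlier sketch.

```python
import numpy as np

def cheb_graph_conv(X, L, K, W):
    """K-order Chebyshev graph convolution: sum_k W_k T_k(L~) X,
    where L~ = 2 L / lambda_max - I is the rescaled Laplacian."""
    lam_max = np.linalg.eigvalsh(L).max()
    L_t = (2.0 / lam_max) * L - np.eye(len(L))
    Z_prev, Z = X, L_t @ X                 # T_0(L~) X and T_1(L~) X
    out = Z_prev @ W[0] + (Z @ W[1] if K >= 1 else 0.0)
    for k in range(2, K + 1):
        Z_next = 2.0 * (L_t @ Z) - Z_prev  # Chebyshev recurrence
        out = out + Z_next @ W[k]
        Z_prev, Z = Z, Z_next
    return out

# Joint interaction dimension: X_{t,i} holds the 3-D coordinates of the
# 3 joints of the hypothetical part combination; weights are random stand-ins.
K1, c_out = 3, 16
X_ti = np.random.randn(3, 3)
W = [np.random.randn(3, c_out) for _ in range(K1 + 1)]
Y_ti = cheb_graph_conv(X_ti, L_part, K1, W)   # response, shape (3, 16)
```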
Step 5, performing a $K_2$-order graph convolution on the h part combinations of each frame in the part combination interaction dimension to obtain the corresponding spatial features:

$$Z_t = \sum_{k=0}^{K_2} W_k\, T_k\big(\tilde{L}^{(p)}_t\big)\, Y_t, \qquad \tilde{L}^{(p)}_t = \frac{2}{\lambda_{\max}}\, L^{(p)}_t - I$$

where $Y_t$ denotes the input features of the part combination interaction-dimension graph convolution, i.e., the part combination features computed by the joint point interaction-dimension graph convolution; $L^{(p)}_t$ is the Laplacian matrix of the multi-dimensional relationship interaction graph in the part combination interaction dimension, representing the adjacency relations of the part combination interaction graph, and $\lambda_{\max}$ is its largest eigenvalue; $T_k(\tilde{L}^{(p)}_t)$ is the Chebyshev polynomial expansion of $\tilde{L}^{(p)}_t$; $W_k$ are the model parameters of the k-order convolution; $Z_t$ is the spatial graph convolution response of the t-th frame of the skeleton video; and $K_2$ means that nodes and edges within the $K_2$-neighborhood of a convolved node participate in the convolution operation.
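The part combination dimension reuses the same operator over the h = 11 combination nodes; in this sketch `cheb_graph_conv` and `L_parts` come from the earlier sketches, and the input features `Y_t` are random stand-ins for the per-combination outputs of step 4.

```python
import numpy as np

# One 16-dim feature vector per part combination (hypothetical values).
h, K2, c_out = 11, 3, 16
Y_t = np.random.randn(h, 16)
W2 = [np.random.randn(16, c_out) for _ in range(K2 + 1)]
Z_t = cheb_graph_conv(Y_t, L_parts, K2, W2)   # spatial response of frame t
```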
Step 6, stacking the spatial features of all frames along the time axis and performing a time-dimension convolution to obtain the temporal dynamic features:

$$Y = L * f$$

where $L$ (here denoting the stacked feature tensor rather than a Laplacian) is the 3-D tensor feature matrix obtained by stacking the spatial features $Z_t$ of all frames, $f$ is a $1 \times 9$ convolution kernel with window size 9, and $Y$ is the result of local convolution filtering performed only in the time dimension.
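In a PyTorch implementation, the $1 \times 9$ temporal filtering can be expressed as a 2-D convolution whose kernel spans 9 frames and 1 node; the tensor layout (batch, channels, time, nodes) and the channel width are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Stacked spatial features: batch of 1, 16 channels, 300 frames, 11 combination
# nodes. The (9, 1) kernel convolves along time only; padding 4 keeps T fixed.
L_feat = torch.randn(1, 16, 300, 11)
temporal_conv = nn.Conv2d(16, 16, kernel_size=(9, 1), padding=(4, 0))
Y = temporal_conv(L_feat)   # local convolution filtering in the time dimension
```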
In step 7, the method of steps 2-6 forms a space-time graph convolution module comprising five operations: constructing the joint point interaction graph, constructing the part combination interaction graph, computing the corresponding part combination features, computing the spatial features, and computing the temporal dynamic features. The temporal dynamic features are fed into the space-time graph convolution module again to obtain higher-level temporal dynamic features representing the skeleton video. The network of the invention preferably uses 9 space-time graph convolution modules; that is, the updated temporal dynamic features are cyclically fed back into the space-time graph convolution module, and the temporal dynamic features computed in the 9th pass serve as the representation feature vector of the skeleton video.
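A simplified sketch of the stacked architecture: the spatial step is abbreviated here to a learned mixing over the combination nodes (a stand-in for the two graph convolutions of steps 4-5), followed by the temporal convolution of step 6, repeated 9 times as the description prefers.

```python
import torch
import torch.nn as nn

class SpaceTimeGraphConvModule(nn.Module):
    """One space-time module, sketched: a stand-in spatial mixing over the
    11 combination nodes, then the 1x9 temporal convolution of step 6."""
    def __init__(self, channels, num_nodes=11):
        super().__init__()
        self.spatial = nn.Linear(num_nodes, num_nodes)  # stand-in for graph conv
        self.temporal = nn.Conv2d(channels, channels, (9, 1), padding=(4, 0))
        self.relu = nn.ReLU()

    def forward(self, x):              # x: (N, C, T, V)
        x = self.spatial(x)            # mixes the last (node) dimension
        return self.relu(self.temporal(x))

# Nine modules in sequence; the final output is the representation feature
# tensor of the skeleton video.
backbone = nn.Sequential(*[SpaceTimeGraphConvModule(16) for _ in range(9)])
features = backbone(torch.randn(1, 16, 300, 11))
```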
Step 8, classifying the representation feature vector of the skeleton video with a softmax classifier to complete the action recognition, with the following concrete operations:
pooling and fully connected operations are applied to the representation feature vector of the skeleton video to reduce the feature dimensionality; the softmax classifier computes the classification probability of the skeleton video for each action category, and the category with the largest probability is selected as the skeleton action.
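A minimal classification head matching this description; the 60 action classes match NTU RGB+D, while the channel width is a hypothetical choice. In training one would typically feed the pre-softmax logits to the cross-entropy loss, which applies softmax internally.

```python
import torch
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Pooling + fully connected + softmax, as described in step 8."""
    def __init__(self, channels=16, num_classes=60):
        super().__init__()
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, x):                 # x: (N, C, T, V)
        x = x.mean(dim=(2, 3))            # pooling reduces the feature dims
        return torch.softmax(self.fc(x), dim=1)

probs = ClassificationHead()(features)    # reusing `features` from above
action = probs.argmax(dim=1)              # largest category = skeleton action
```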
Examples
To verify the effectiveness of the scheme, a simulation experiment was carried out on the public NTU RGB+D dataset using the PyTorch deep learning platform. Training and test data were determined according to the two evaluation protocols, cross-view and cross-subject, and the deep graph convolution network was then trained and tested. During training, the training data are fed into the network for forward propagation to obtain the classification probability of each sample for each action class; backpropagation is then performed based on the cross-entropy loss, and the network parameters are adjusted. After training, class prediction is performed on the test samples with this network: each test sample is fed into the trained deep graph convolution network, forward propagation yields its classification probability for each action class, and the class with the largest probability is selected as the predicted class. For each video sample, if the predicted class matches the video's label, the method classifies the video correctly; otherwise it classifies it incorrectly. The experimental results show that the method achieves accuracies of 89% (cross-view) and 84% (cross-subject) under the two evaluation protocols.
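A condensed sketch of the training and evaluation loop described here, with a stand-in model and dummy NTU-shaped data; batch size, learning rate, and optimizer are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

# Stand-in network and dummy NTU-style batch (8 clips, 3-D joints, 300 frames,
# 25 joints, 60 classes); the real model is the stacked graph network above.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 300 * 25, 60))
criterion = nn.CrossEntropyLoss()               # cross-entropy loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

clips = torch.randn(8, 3, 300, 25)
labels = torch.randint(0, 60, (8,))

optimizer.zero_grad()
loss = criterion(model(clips), labels)          # forward propagation
loss.backward()                                 # backward propagation
optimizer.step()                                # adjust network parameters

pred = model(clips).argmax(dim=1)               # predicted class per clip
accuracy = (pred == labels).float().mean()      # fraction classified correctly
```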

Claims (9)

1. A graph convolution-based skeleton action recognition method, characterized by comprising the following steps:
step 1, obtaining a skeleton video and constructing a graph sequence based on the skeleton sequence;
step 2, constructing a joint point interaction graph representing each human body part combination according to human body part combinations of different scales;
step 3, constructing a part combination interaction graph representing the overall structure of the human body, with each part combination as a node;
step 4, performing a K-order graph convolution on the part combinations of each frame in the joint point interaction dimension to obtain the corresponding part combination features;
step 5, performing a K-order graph convolution on the part combinations of each frame in the part combination interaction dimension to obtain the corresponding spatial features;
step 6, stacking the spatial features of all frames along the time axis and performing a time-dimension convolution to obtain temporal dynamic features;
step 7, constructing a joint point interaction graph and a part combination interaction graph for the temporal dynamic features by the method of steps 2-6, computing the corresponding part combination features and spatial features, and updating the temporal dynamic features to obtain the representation feature vector of the skeleton video;
step 8, classifying the representation feature vector of the skeleton video with a softmax classifier to complete the action recognition.
2. The graph convolution-based skeleton action recognition method according to claim 1, wherein step 1 comprises the following substeps:
(1.1) for the t-th frame of the skeleton video, constructing a skeleton graph $S_t = \{V_t, E_t\}$ based on the natural connections of the human joints, where $V_t$ denotes all nodes in the graph, consisting of the joint points of the human skeleton, and $E_t$ denotes all edges in the graph; the skeleton graph $S_t$ is undirected, and if a bone connection exists between two joint points, an edge exists between the two nodes; otherwise, no edge exists between the two nodes;
(1.2) for a skeleton sequence of length T frames, constructing the corresponding graph sequence $\{S_1, S_2, \dots, S_T\}$.
3. The graph convolution-based skeleton action recognition method according to claim 1, wherein step 2 comprises the following substeps:
(2.1) for each skeleton graph $S_t$, defining limb parts with salient human motion characteristics as different human body part combinations; in particular, taking the four limbs as the four basic part combinations;
(2.2) pairwise combining the four constructed basic combinations to construct six combinations of a higher first-order scale: right hand and right leg, left hand and left leg, right hand and left leg, left hand and right leg, the upper body, and the lower body, and assigning body parts with relatively small motion amplitude to the closest of these combinations;
(2.3) constructing the part combination of the highest scale from the whole-body skeleton: the whole body;
(2.4) for each part combination, taking each joint point as a node of the graph and establishing the edges of the graph according to the natural connections of the human body, obtaining the joint point interaction graph;
(2.5) obtaining, according to spectral graph theory, the Laplacian matrix of the joint point interaction graph, i.e., the joint point adjacency relation matrix $L_{t,i}$.
4. The graph convolution-based skeleton action recognition method according to claim 1, wherein step 3 comprises the following substeps:
(3.1) for the t-th frame, taking the h constructed human body part combinations as nodes, each node randomly selecting several other nodes to connect with as edges, and constructing the part combination interaction graph;
(3.2) obtaining, according to spectral graph theory, the Laplacian matrix of the part combination interaction graph, i.e., the part combination adjacency relation matrix $L^{(p)}_t$.
5. The graph convolution-based skeleton action recognition method according to claim 1, wherein in step 4 the graph convolution in the joint point interaction dimension is:

$$Y_{t,i} = \sum_{k=0}^{K_1} W_{ik}\, T_k\big(\tilde{L}_{t,i}\big)\, X_{t,i}, \qquad \tilde{L}_{t,i} = \frac{2}{\lambda_{\max}}\, L_{t,i} - I$$

where $X_{t,i}$ denotes the features of part combination $S_{t,i}$, specifically the 3-dimensional coordinates of all joint points inside the combination, provided by the acquired skeleton video; $L_{t,i}$ is the joint point adjacency relation matrix and $\lambda_{\max}$ is its largest eigenvalue; $T_k(\tilde{L}_{t,i})$ is the Chebyshev polynomial expansion of the matrix $\tilde{L}_{t,i}$; $Y_{t,i}$ is the graph convolution response of part combination $S_{t,i}$ in the joint interaction dimension; $K_1$ means that nodes and edges within the $K_1$-neighborhood of a convolved node participate in the convolution operation; and $W_{ik}$ are the model parameters of the graph convolution.
6. The graph convolution-based skeleton action recognition method according to claim 1, wherein in step 5 the graph convolution in the part combination interaction dimension is:

$$Z_t = \sum_{k=0}^{K_2} W_k\, T_k\big(\tilde{L}^{(p)}_t\big)\, Y_t, \qquad \tilde{L}^{(p)}_t = \frac{2}{\lambda_{\max}}\, L^{(p)}_t - I$$

where $Y_t$ denotes the input features of the part combination interaction-dimension graph convolution, i.e., the part combination features computed by the joint point interaction-dimension graph convolution; $L^{(p)}_t$ is the part combination adjacency relation matrix and $\lambda_{\max}$ is its largest eigenvalue; $T_k(\tilde{L}^{(p)}_t)$ is the Chebyshev polynomial expansion of $\tilde{L}^{(p)}_t$; $W_k$ are the model parameters of the k-order convolution; $Z_t$ is the spatial graph convolution response of the t-th frame of the skeleton video; and $K_2$ means that nodes and edges within the $K_2$-neighborhood of a convolved node participate in the convolution operation.
7. The graph convolution-based skeleton action recognition method according to claim 1, wherein in step 6 the time-dimension convolution over the stacked spatial features is:

$$Y = L * f$$

where $L$ (here denoting the stacked feature tensor rather than a Laplacian) is the 3-D tensor feature matrix obtained by stacking the spatial features of all frames, $f$ is a $1 \times 9$ convolution kernel with window size 9, and $Y$ is the result of local convolution filtering performed only in the time dimension.
8. The graph convolution-based skeleton action recognition method according to claim 1, wherein in step 7 the temporal dynamic features are repeatedly updated to obtain a higher-dimensional temporal dynamic feature representation of the skeleton video.
9. The graph convolution-based skeleton action recognition method according to claim 1, wherein in step 8 the representation feature vector of the skeleton video is classified as follows:
pooling and fully connected operations are applied to the representation feature vector of the skeleton video to reduce the feature dimensionality; the softmax classifier computes the classification probability of the skeleton video for each action category, and the category with the largest probability is selected as the skeleton action.
Application CN201911041763.XA, priority date 2019-10-30, filing date 2019-10-30: Skeleton action identification method based on graph convolution, published as CN112749585A (en), status Withdrawn.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201911041763.XA | 2019-10-30 | 2019-10-30 | Skeleton action identification method based on graph convolution


Publications (1)

Publication Number | Publication Date
CN112749585A | 2021-05-04

Family

ID=75640356

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201911041763.XA | Skeleton action identification method based on graph convolution | 2019-10-30 | 2019-10-30

Country Status (1)

Country Link
CN (1) CN112749585A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113283400A (en) * 2021-07-19 2021-08-20 成都考拉悠然科技有限公司 Skeleton action identification method based on selective hypergraph convolutional network
CN115294228A (en) * 2022-07-29 2022-11-04 北京邮电大学 Multi-graph human body posture generation method and device based on modal guidance
CN115294228B (en) * 2022-07-29 2023-07-11 北京邮电大学 Multi-figure human body posture generation method and device based on modal guidance


Legal Events

Code | Event
PB01 | Publication
SE01 | Entry into force of request for substantive examination
WW01 | Invention patent application withdrawn after publication (application publication date: 2021-05-04)