CN113780129A - Motion recognition method based on unsupervised graph sequence predictive coding and storage medium - Google Patents
- Publication number
- CN113780129A (application number CN202111009498.4A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- graph
- network
- data
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a motion recognition method based on unsupervised graph sequence predictive coding, and a storage medium. The method covers both the training and the use of a model, and is used for recognizing the various actions performed by a human body in a skeleton sequence. It aims to solve the problems that existing motion recognition methods depend heavily on large amounts of labeled data and achieve low accuracy when only a few labels are available, and that existing unsupervised methods overfit, fail to exploit the topological information of the graph, and generalize poorly. The method comprises: view-angle-invariant transformation, resampling and block-level skeleton-graph data augmentation of the skeleton sequence data; embedding extraction of skeleton sequence blocks by a space-time graph convolution; aggregation of context features by a graph convolution recurrent neural network; construction of positive and negative sample pairs for predictive coding; and feature extraction by the pre-trained model, with a classifier yielding the action category of the skeleton sequence to be recognized. Compared with the prior art, the method has the advantages of low training difficulty, high recognition accuracy and excellent performance.
Description
Technical Field
The present invention relates to the field of motion recognition technologies, and in particular, to a motion recognition method and a storage medium based on unsupervised graph sequence predictive coding.
Background
In computer vision, motion recognition is currently a hot problem attracting wide attention. Fields such as autonomous robots, smart cities and intelligent transportation all need to analyze and recognize human behaviors. In recent years, with graph convolution being recognized and adopted by more and more researchers, with the development of pose estimation algorithms and depth sensors, and with the robustness and appearance-independence of skeleton data, which concentrate the essential characteristics of an action, motion recognition using skeleton data has become a hot topic of current research.
Early motion recognition was mainly based on still pictures. In recent years, as research has progressed, more and more researchers have paid attention to the dynamic nature of motion and have therefore turned to video-based motion recognition. The most significant difference of video-based motion recognition from still-picture-based methods is the added time dimension: the data become a time sequence of 2D pictures. The time dimension provides rich features, but it also brings great challenges in computational power and storage space. Skeleton-based motion recognition alleviates the computational requirements of motion recognition algorithms, but most methods are supervised and therefore highly dependent on the number and quality of dataset samples. Because of the high inter-class similarity of actions, accurately labeling enough data to train a deep learning model is challenging and costly, so researchers are strongly motivated to find a robust, label-free method for learning motion recognition representations that better exploit temporal and spatial information. Existing unsupervised work resorts to pretext tasks such as drawing or reconstructing a skeleton sequence from the latent embedding of an encoder. However, these encoder-decoder models typically flatten the spatial channels into a single feature vector, ignoring the spatial relationships of the skeleton graph, and such pretext tasks are often prone to overfitting and are not always helpful to downstream tasks.
Disclosure of Invention
The present invention aims to overcome the above defects of the prior art by providing a motion recognition method based on unsupervised graph sequence predictive coding, with low training difficulty, high recognition accuracy and excellent performance, and a storage medium.
The purpose of the invention can be realized by the following technical scheme:
an action recognition method based on unsupervised graph sequence predictive coding, the action recognition method comprises the following steps:
step 1: acquiring a skeleton data sequence, and preprocessing the data sequence to obtain an input training data block;
step 2: inputting the input training data blocks into a space-time graph convolutional network f(·) to obtain embedded representations of the sequence of skeleton graph blocks, and inputting the embedded representations into a graph convolution recurrent neural network g(·) to aggregate context information;
step 3: predicting the embedded representation of the next skeleton graph block through a prediction network φ(·) according to the context information, inputting the predicted embedded representation into the recurrent neural network g(·) to obtain a new context representation, and repeating this several times to obtain a series of predicted graph embedded representations;
step 4: comparing the obtained predicted graph embedded representations with the real graph embedded representations, optimizing the space-time graph convolutional network f(·), the graph convolution recurrent neural network g(·) and the prediction network φ(·) by back-propagating a contrastive loss function, and obtaining a pre-trained model after a number of iterations;
step 5: removing the prediction network φ(·) from the obtained pre-trained model, using the space-time graph convolutional network f(·) and the recurrent neural network g(·) as a feature extractor, adding a classifier on top of the feature extractor, and training with labeled input data to obtain the final classification model;
step 6: acquiring a bone data sequence to be detected, and preprocessing the bone data sequence to obtain an input prediction data block;
step 7: inputting the input prediction data blocks into the classification model, predicting the probabilities of the various actions of the person to be recognized, and completing motion recognition.
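The rollout in steps 2 to 4 above can be sketched end to end. The snippet below uses tiny random linear maps as stand-ins for f(·), g(·) and φ(·); all names, shapes and counts are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8          # embedding dimension (illustrative)
n_blocks = 6   # number of skeleton-sequence blocks
n_pred = 2     # how many future blocks to predict

# Toy stand-ins for f(.), g(.), phi(.): small random linear maps.
Wf, Wg, Wphi = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))

def f(x):            # space-time graph conv encoder (stand-in)
    return np.tanh(x @ Wf)

def g(z, c):         # graph-conv recurrent aggregator (stand-in)
    return np.tanh(z @ Wg + c)

def phi(c):          # prediction network (stand-in)
    return c @ Wphi

blocks = rng.standard_normal((n_blocks, D))
z = f(blocks)                          # step 2: block embeddings

c = np.zeros(D)
for t in range(n_blocks - n_pred):     # aggregate context over observed blocks
    c = g(z[t], c)

preds = []
for t in range(n_blocks - n_pred, n_blocks):   # step 3: rollout
    z_hat = phi(c)                     # predict the next block embedding
    preds.append(z_hat)
    c = g(z_hat, c)                    # feed the prediction back into the context

preds = np.stack(preds)
print(preds.shape)                     # compared against z[-n_pred:] in step 4
```

In step 4 the rows of `preds` would be scored against the true embeddings `z[-n_pred:]` by the contrastive loss.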
Preferably, the step 1 specifically comprises:
step 1-1: for given skeleton sequence data X, apply the view-angle-invariant transformation F(·) to obtain the view-corrected skeleton sequence data X̂;
step 1-2: given the view-corrected skeleton sequence data X̂ and an input sample window size T_window, first upsample the skeleton sequence of T_sample frames by linear interpolation to a sequence of T_window × k frames, where k ∈ N+ and T_window·(k−1) < T_sample < T_window·k;
step 1-3: divide the interpolated data obtained in the preceding step into sequence blocks each containing T_patch frames, P = {p_1, p_2, ..., p_n}; apply random skeleton-graph data augmentation to each sequence block p_i, finally obtaining the augmented skeleton sequence blocks P̂.
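Steps 1-2 and 1-3 amount to linear-interpolation resampling followed by blocking. A minimal numpy sketch; the function name, the toy shapes and the assumption that T_window is a multiple of T_patch are ours:

```python
import numpy as np

def resample_and_block(seq, t_window, t_patch):
    """Upsample a (T, V, C) skeleton sequence to the nearest multiple of
    t_window by per-coordinate linear interpolation, then split it into
    blocks of t_patch frames (illustrative, not the patent's exact code).
    Assumes t_window is divisible by t_patch."""
    t_sample = seq.shape[0]
    k = int(np.ceil(t_sample / t_window))   # smallest k with t_window*k >= t_sample
    t_new = t_window * k
    old_t = np.linspace(0.0, 1.0, t_sample)
    new_t = np.linspace(0.0, 1.0, t_new)
    flat = seq.reshape(t_sample, -1)
    up = np.stack([np.interp(new_t, old_t, flat[:, j])
                   for j in range(flat.shape[1])], axis=1)
    up = up.reshape(t_new, *seq.shape[1:])
    return np.split(up, t_new // t_patch)   # list of (t_patch, V, C) blocks

# 37 frames, 25 joints, xyz coordinates (toy data)
seq = np.random.default_rng(0).standard_normal((37, 25, 3))
blocks = resample_and_block(seq, t_window=20, t_patch=10)
print(len(blocks), blocks[0].shape)  # 4 blocks of shape (10, 25, 3)
```

Here T_sample = 37 and T_window = 20 give k = 2, i.e. 20·1 < 37 < 20·2 as in step 1-2.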
Preferably, the step 2 specifically comprises:
step 2-1: input the skeleton sequence blocks P̂ obtained in step 1 into the space-time graph convolutional network f(·) to obtain the embedded representations z_i;
Step 2-2: according to step 2-1: the resulting embedded representationObtaining a context representation C in an input graph convolution recurrent neural network g (-) toi。
Preferably, the step 3 specifically comprises:
step 3-1: according to the context information C_i obtained in step 2, predict the embedded representation ẑ of the next skeleton graph block through the prediction network φ(·);
Step 3-2: graph-embedded representation obtained according to step 3-1Obtaining context information via a graph convolution recurrent neural network g (-) to
Step 3-3: context information obtained according to step 3-2Repeating the step 3-1 and the step 3-2 for several times by analogy to obtain a series of predicted graph embedding representations
Preferably, the space-time graph convolution network f (-) and the recurrent neural network g (-) are both constructed based on a graph convolution neural network, and the prediction network Φ (-) is constructed based on a neural network.
More preferably, the graph convolution rule of the space-time graph convolution network f (-) and the recurrent neural network g (-) is:
H_{out} = \tau\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H_{in} \Theta \right)

where H_{in} and H_{out} denote the input and the output feature map respectively; \tilde{A} = A + I is the adjacency matrix A defined by the graph with the identity matrix I added, i.e. every node is linked to itself; \tilde{D} denotes its degree matrix; \tau denotes the activation function; and \Theta denotes the learnable weight matrix of the graph convolution layer.
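The normalized graph convolution described above fits in a few lines of numpy; the tanh activation, the toy chain graph and the channel sizes below are illustrative choices:

```python
import numpy as np

def graph_conv(h, a, theta):
    """One symmetrically normalized graph convolution:
    H_out = tanh(D^{-1/2} (A + I) D^{-1/2} H Theta)."""
    a_tilde = a + np.eye(a.shape[0])               # add self-links
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_tilde.sum(axis=1)))
    return np.tanh(d_inv_sqrt @ a_tilde @ d_inv_sqrt @ h @ theta)

# 4-node chain graph, 3 input channels, 2 output channels (toy sizes)
a = np.zeros((4, 4))
for i in range(3):
    a[i, i + 1] = a[i + 1, i] = 1.0
rng = np.random.default_rng(0)
h = rng.standard_normal((4, 3))       # input feature map H_in
theta = rng.standard_normal((3, 2))   # learnable weights Theta
out = graph_conv(h, a, theta)
print(out.shape)  # (4, 2)
```

The self-link term A + I is what lets each joint keep its own features while mixing in its neighbors'.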
More preferably, the structure of the recurrent neural network g (-) is based on a gated recurrent unit GRU, and the calculation rule is as follows:
z_t = \sigma(\omega_{zz} \star x_t + \omega_{hz} \star h_{t-1})
r_t = \sigma(\omega_{zr} \star x_t + \omega_{hr} \star h_{t-1})
\tilde{h}_t = \psi(\omega_{xh} \star x_t + r_t \odot (\omega_{hh} \star h_{t-1}))
h_t = (1 - q_t) \odot h_{t-1} + q_t \odot \tilde{h}_t, \qquad q_t = z_t

where z_t denotes the update gate and r_t the reset gate; \tilde{h}_t denotes the candidate activation vector; \star is the graph convolution operator; \odot denotes the Hadamard product; \sigma denotes the Sigmoid activation function and \psi the Tanh activation function; \omega_{zz}, \omega_{hz}, \omega_{zr} and \omega_{hr} denote the parameters of the respective memory gates (\omega_{xh} and \omega_{hh} those of the candidate activation); and q_t is the memory/forgetting weight.
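A hedged reconstruction of one graph-convolutional GRU step: a pre-normalized adjacency multiply stands in for the graph convolution operator, and the exact gate wiring is our assumption rather than the patent's figures:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gconv(x, a_hat, w):
    # Stand-in for the graph convolution operator: a_hat is assumed
    # pre-normalized, w is the per-gate learnable weight matrix.
    return a_hat @ x @ w

def gc_gru_step(x, h, a_hat, p):
    """One GRU step with graph convolutions in place of dense multiplies
    (a sketch of the update/reset/candidate rules above)."""
    z = sigmoid(gconv(x, a_hat, p["wzz"]) + gconv(h, a_hat, p["whz"]))  # update gate
    r = sigmoid(gconv(x, a_hat, p["wzr"]) + gconv(h, a_hat, p["whr"]))  # reset gate
    h_cand = np.tanh(gconv(x, a_hat, p["wxh"]) + gconv(r * h, a_hat, p["whh"]))
    return (1.0 - z) * h + z * h_cand   # memory/forgetting blend (q_t = z_t)

rng = np.random.default_rng(0)
V, C = 5, 4                             # joints, channels (toy sizes)
a_hat = np.full((V, V), 1.0 / V)        # toy pre-normalized adjacency
p = {k: rng.standard_normal((C, C)) * 0.1
     for k in ["wzz", "whz", "wzr", "whr", "wxh", "whh"]}
h = np.zeros((V, C))
for _ in range(3):                      # aggregate three block embeddings
    h = gc_gru_step(rng.standard_normal((V, C)), h, a_hat, p)
print(h.shape)  # (5, 4)
```

Running the step over the block embeddings in order is exactly the context aggregation g(·) performs in step 2-2.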
Preferably, the contrast loss function in step 4 is specifically:
\mathcal{L} = -\sum_{i}\sum_{k} \log \frac{\exp(\mathrm{sim}(z_{i,k}, \hat{z}_{i,k}))}{\sum_{j} \exp(\mathrm{sim}(z_{j,k}, \hat{z}_{i,k}))}

where z_{i,k} and \hat{z}_{i,k} denote, respectively, the true embedding z_k taken from the i-th sample and its prediction, and \mathrm{sim}(\cdot,\cdot) denotes the similarity of an embedded-representation pair.
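An InfoNCE-style contrastive loss of this shape can be sketched as follows; cosine similarity and the temperature are assumed choices, not fixed by the text:

```python
import numpy as np

def info_nce(z_pred, z_true, temperature=0.1):
    """Each predicted embedding should be most similar to its own true
    embedding (positive pair) and dissimilar to the other samples in the
    batch (negatives). Rows are samples, columns are feature dims."""
    zp = z_pred / np.linalg.norm(z_pred, axis=1, keepdims=True)
    zt = z_true / np.linalg.norm(z_true, axis=1, keepdims=True)
    logits = zp @ zt.T / temperature                 # sim(z_hat_i, z_j) for all i, j
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))               # positives sit on the diagonal

rng = np.random.default_rng(0)
z_true = rng.standard_normal((8, 16))
loss_good = info_nce(z_true + 0.01 * rng.standard_normal((8, 16)), z_true)
loss_bad = info_nce(rng.standard_normal((8, 16)), z_true)
print(loss_good < loss_bad)  # near-perfect predictions give a lower loss
```

Minimizing this loss is what drives f(·), g(·) and φ(·) to produce predictions that match the true future embeddings in step 4.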
Preferably, the step 5 specifically comprises:
step 5-1: the pre-trained model obtained in step 4 comprises the space-time graph convolutional network f(·), the graph convolution recurrent neural network g(·) and the prediction network φ(·); keep only f(·) and g(·), replace φ(·) with a classifier network, and thereby construct the classification model;
step 5-2: input the labeled training data and train on it to obtain the final classification model.
A storage medium storing a program which, when executed, implements the motion recognition method based on unsupervised graph sequence predictive coding according to any one of the above.
Compared with the prior art, the invention has the following beneficial effects:
firstly, low training difficulty: the skeleton motion recognition framework based on unsupervised graph convolution can learn effective representations of human motion from unlabeled data through contrastive learning, which reduces the need for sample labeling and simplifies training.
Secondly, high recognition accuracy: by combining graph convolution with contrastive learning, the motion recognition method based on unsupervised graph sequence predictive coding fully exploits spatial and temporal dependencies at the same time, avoids the limitations of generative learning and instance-based contrastive learning in unsupervised skeleton-based motion recognition, and improves recognition accuracy.
Thirdly, excellent performance: compared with the latest SOTA methods on three benchmark datasets, the motion recognition method based on unsupervised graph sequence predictive coding surpasses the SOTA by up to 20 percent.
Drawings
FIG. 1 is a flow chart of a method of motion recognition in the present invention;
FIG. 2 is a schematic view of the overall framework of the present invention;
FIG. 3 is a schematic diagram of the contrastive-learning-based pre-training of the model in the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
As shown in fig. 1, the present embodiment provides a skeleton motion recognition method based on unsupervised graph convolution. Its main objective is to learn representations for motion recognition from unlabeled data by unsupervised contrastive learning, exploiting the temporal information of the skeleton sequence and the spatial information of the skeleton graph to the maximum, and then to train a classification model on the learned representations with a small amount of labeled data, so as to recognize human motions more accurately.
As shown in fig. 1 and fig. 2, the method for identifying an action based on unsupervised graph sequence predictive coding in this embodiment mainly includes the following steps:
step 1: obtain a skeleton data sequence and preprocess it through view-angle-invariant transformation, time-window resampling and block-level data augmentation to obtain input training data blocks segmented to a specific length with a fixed window size;
the method specifically comprises the following steps:
step 1-1: for given skeleton sequence data X, apply the view-angle-invariant transformation F(·) to obtain the view-corrected skeleton sequence data X̂;
step 1-2: given the view-corrected skeleton sequence data X̂ and an input sample window size T_window, first upsample the skeleton sequence of T_sample frames by linear interpolation to a sequence of T_window × k frames, where k ∈ N+ and T_window·(k−1) < T_sample < T_window·k;
Step 1-3: for interpolated data obtained in the preceding stepIs divided into fractions containing TpatchSequence block of frame, P ═ P1,p2,...,pnFor each sequence block piRandom skeleton map data enhancement is applied, the same enhancement is applied in blocks, and different enhancements are applied among blocks; the enhancement comprises displacement, inclination and rotation, and finally the enhanced bone sequence block is obtained
Step 2: inputting the input training data block into a null graph convolutional network f (-) to obtain an embedded representation of the sequence skeleton graph block, inputting the embedded representation into a cyclic neural network g (-) and aggregating context information;
the method specifically comprises the following steps:
step 2-1: input the skeleton sequence blocks P̂ obtained in step 1 into the space-time graph convolutional network f(·) to obtain the embedded representations z_i;
Step 2-2: according to step 2-1: the resulting embedded representationObtaining a context representation C in an input graph convolution recurrent neural network g (-) toi;
step 3: predicting the embedded representation of the next skeleton graph block through a prediction network φ(·) according to the context information, inputting the predicted embedded representation into the recurrent neural network g(·) to obtain a new context representation, and repeating this several times to obtain a series of predicted graph embedded representations;
the method specifically comprises the following steps:
step 3-1: according to the context information C_i obtained in step 2, predict the embedded representation ẑ of the next skeleton graph block through the prediction network φ(·);
step 3-2: input the predicted graph embedded representation ẑ obtained in step 3-1 into the graph convolution recurrent neural network g(·) to obtain new context information;
step 3-3: with the context information obtained in step 3-2, repeat steps 3-1 and 3-2 several times by analogy to obtain a series of predicted graph embedded representations.
step 4: comparing the obtained predicted graph embedded representations with the real graph embedded representations, optimizing the space-time graph convolutional network f(·), the graph convolution recurrent neural network g(·) and the prediction network φ(·) by back-propagating a contrastive loss function, and obtaining a pre-trained model after a number of iterations, as shown in FIG. 3;
in the embodiment, both the space-time graph convolution network f (-) and the recurrent neural network g (-) are constructed based on a graph convolution neural network, and the prediction network phi (-) is constructed based on a neural network;
the graph convolution rule of the space-time graph convolution network f (-) and the recurrent neural network g (-) is as follows:
H_{out} = \tau\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H_{in} \Theta \right)

where H_{in} and H_{out} denote the input and the output feature map respectively; \tilde{A} = A + I is the adjacency matrix A defined by the graph with the identity matrix I added, i.e. every node is linked to itself; \tilde{D} denotes its degree matrix; \tau denotes the activation function; and \Theta denotes the learnable weight matrix of the graph convolution layer;
the construction of the recurrent neural network g (-) is based on a gated recurrent unit GRU, and the calculation rule is as follows:
z_t = \sigma(\omega_{zz} \star x_t + \omega_{hz} \star h_{t-1})
r_t = \sigma(\omega_{zr} \star x_t + \omega_{hr} \star h_{t-1})
\tilde{h}_t = \psi(\omega_{xh} \star x_t + r_t \odot (\omega_{hh} \star h_{t-1}))
h_t = (1 - q_t) \odot h_{t-1} + q_t \odot \tilde{h}_t, \qquad q_t = z_t

where z_t denotes the update gate and r_t the reset gate; \tilde{h}_t denotes the candidate activation vector; \star is the graph convolution operator; \odot denotes the Hadamard product; \sigma denotes the Sigmoid activation function and \psi the Tanh activation function; \omega_{zz}, \omega_{hz}, \omega_{zr} and \omega_{hr} denote the parameters of the respective memory gates (\omega_{xh} and \omega_{hh} those of the candidate activation); and q_t is the memory/forgetting weight.
The contrast loss function is specifically:
\mathcal{L} = -\sum_{i}\sum_{k} \log \frac{\exp(\mathrm{sim}(z_{i,k}, \hat{z}_{i,k}))}{\sum_{j} \exp(\mathrm{sim}(z_{j,k}, \hat{z}_{i,k}))}

where z_{i,k} and \hat{z}_{i,k} denote, respectively, the true embedding z_k taken from the i-th sample and its prediction, and \mathrm{sim}(\cdot,\cdot) denotes the similarity of an embedded-representation pair.
and 5: removing the prediction network phi (-) according to the obtained pre-training model, taking the parts of the space-time graph convolution network f (-) and the cyclic neural network g (-) as feature extractors, adding a classifier on the upper layer of the feature extractors, and obtaining a final classification model through training of inputting labeled data;
the method specifically comprises the following steps:
step 5-1: the pre-trained model obtained in step 4 comprises the space-time graph convolutional network f(·), the graph convolution recurrent neural network g(·) and the prediction network φ(·); keep only f(·) and g(·), replace φ(·) with a classifier network, and thereby construct the classification model;
step 5-2: input the labeled training data and train on it to obtain the final classification model;
step 6: acquiring a bone data sequence to be detected, and preprocessing the bone data sequence to obtain an input prediction data block;
step 7: inputting the input prediction data blocks into the classification model, predicting the probabilities of the various actions of the person to be recognized, and completing motion recognition.
In this embodiment, the prediction network φ(·) is constructed as a single-layer fully-connected neural network, and the classifier network is a multi-class classifier obtained by training, for example, a multilayer perceptron.
To support and verify the performance of the motion recognition method proposed by the present invention, the method is compared with other state-of-the-art motion recognition methods on three widely used public benchmark datasets; the comparison results are shown in Table 1.
The experimental comparison uses three widely used public benchmark datasets: NTU RGB+D 60, Northwestern-UCLA (NW-UCLA) and UWA3D Multiview Activity II (UWA3D). The experiments adopt the linear-probe protocol widely used for evaluating unsupervised learning methods: the weights of the pre-trained model are frozen, a linear classifier taking the output features of the pre-trained model as input is trained, and the performance on the test set is reported to measure the effectiveness of the learned representation.
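The linear-probe protocol described above, reduced to a plain-numpy sketch on made-up "frozen" features; the data, sizes and gradient-descent optimizer are illustrative assumptions:

```python
import numpy as np

def linear_probe(feats, labels, n_classes, epochs=500, lr=0.05):
    """Train a linear softmax classifier on frozen features by plain
    gradient descent; the encoder itself is never updated."""
    w = np.zeros((feats.shape[1], n_classes))
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = feats @ w
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        w -= lr * feats.T @ (probs - onehot) / len(feats)
    return w

rng = np.random.default_rng(0)
n, d, k = 300, 16, 3
labels = rng.integers(0, k, size=n)
centers = rng.standard_normal((k, d)) * 2.0
feats = centers[labels] + rng.standard_normal((n, d))  # stand-in for frozen encoder output
w = linear_probe(feats, labels, k)
acc = float(np.mean((feats @ w).argmax(axis=1) == labels))
print(round(acc, 2))
```

If the pre-trained encoder has learned class-separable features, even this frozen linear classifier reaches high accuracy, which is exactly what the probe measures.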
TABLE 1 comparative results
The comparison results show that the motion recognition method proposed in this embodiment achieves excellent performance.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A motion recognition method based on unsupervised graph sequence predictive coding is characterized by comprising the following steps:
step 1: acquiring a skeleton data sequence, and preprocessing the data sequence to obtain an input training data block;
step 2: inputting the input training data blocks into a space-time graph convolutional network f(·) to obtain embedded representations of the sequence of skeleton graph blocks, and inputting the embedded representations into a graph convolution recurrent neural network g(·) to aggregate context information;
step 3: predicting the embedded representation of the next skeleton graph block through a prediction network φ(·) according to the context information, inputting the predicted embedded representation into the recurrent neural network g(·) to obtain a new context representation, and repeating this several times to obtain a series of predicted graph embedded representations;
step 4: comparing the obtained predicted graph embedded representations with the real graph embedded representations, optimizing the space-time graph convolutional network f(·), the graph convolution recurrent neural network g(·) and the prediction network φ(·) by back-propagating a contrastive loss function, and obtaining a pre-trained model after a number of iterations;
step 5: removing the prediction network φ(·) from the obtained pre-trained model, using the space-time graph convolutional network f(·) and the recurrent neural network g(·) as a feature extractor, adding a classifier on top of the feature extractor, and training with labeled input data to obtain the final classification model;
step 6: acquiring a bone data sequence to be detected, and preprocessing the bone data sequence to obtain an input prediction data block;
step 7: inputting the input prediction data blocks into the classification model, predicting the probabilities of the various actions of the person to be recognized, and completing motion recognition.
2. The method for motion recognition based on unsupervised graph sequence predictive coding according to claim 1, wherein the step 1 specifically comprises:
step 1-1: for given skeleton sequence data X, apply the view-angle-invariant transformation F(·) to obtain the view-corrected skeleton sequence data X̂;
step 1-2: given the view-corrected skeleton sequence data X̂ and an input sample window size T_window, first upsample the skeleton sequence of T_sample frames by linear interpolation to a sequence of T_window × k frames, where k ∈ N+ and T_window·(k−1) < T_sample < T_window·k;
3. The method for motion recognition based on unsupervised graph sequence predictive coding according to claim 1, wherein the step 2 specifically comprises:
step 2-1: input the skeleton sequence blocks P̂ obtained in step 1 into the space-time graph convolutional network f(·) to obtain the embedded representations z_i;
4. The method for motion recognition based on unsupervised graph sequence predictive coding according to claim 1, wherein the step 3 specifically comprises:
step 3-1: according to the context information C_i obtained in step 2, predict the embedded representation ẑ of the next skeleton graph block through the prediction network φ(·);
step 3-2: input the predicted graph embedded representation ẑ obtained in step 3-1 into the graph convolution recurrent neural network g(·) to obtain new context information.
5. The method of claim 1, wherein the space-time graph convolutional network f (-) and the recurrent neural network g (-) are constructed based on a graph convolution neural network, and the prediction network Φ (-) is constructed based on a neural network.
6. The method of claim 5, wherein the graph convolution rule between the spatio-temporal graph convolution network f (-) and the recurrent neural network g (-) is as follows:
H_{out} = \tau\left( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H_{in} \Theta \right)

where H_{in} and H_{out} denote the input and the output feature map respectively; \tilde{A} = A + I is the adjacency matrix A defined by the graph with the identity matrix I added, i.e. every node is linked to itself; \tilde{D} denotes its degree matrix; \tau denotes the activation function; and \Theta denotes the learnable weight matrix of the graph convolution layer.
7. The method of claim 5, wherein the recurrent neural network g (-) is constructed based on gated recurrent units GRU, and the calculation rule is:
z_t = \sigma(\omega_{zz} \star x_t + \omega_{hz} \star h_{t-1})
r_t = \sigma(\omega_{zr} \star x_t + \omega_{hr} \star h_{t-1})
\tilde{h}_t = \psi(\omega_{xh} \star x_t + r_t \odot (\omega_{hh} \star h_{t-1}))
h_t = (1 - q_t) \odot h_{t-1} + q_t \odot \tilde{h}_t, \qquad q_t = z_t

where z_t denotes the update gate and r_t the reset gate; \tilde{h}_t denotes the candidate activation vector; \star is the graph convolution operator; \odot denotes the Hadamard product; \sigma denotes the Sigmoid activation function and \psi the Tanh activation function; \omega_{zz}, \omega_{hz}, \omega_{zr} and \omega_{hr} denote the parameters of the respective memory gates (\omega_{xh} and \omega_{hh} those of the candidate activation); and q_t is the memory/forgetting weight.
9. The method for motion recognition based on unsupervised graph sequence predictive coding according to claim 1, wherein the step 5 specifically comprises:
step 5-1: the pre-trained model obtained in step 4 comprises the space-time graph convolutional network f(·), the graph convolution recurrent neural network g(·) and the prediction network φ(·); keep only f(·) and g(·), replace φ(·) with a classifier network, and thereby construct the classification model;
step 5-2: input the labeled training data and train on it to obtain the final classification model.
10. A storage medium storing a program which, when executed, implements the motion recognition method based on unsupervised graph sequence predictive coding according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111009498.4A CN113780129B (en) | 2021-08-31 | 2021-08-31 | Action recognition method based on unsupervised graph sequence predictive coding and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111009498.4A CN113780129B (en) | 2021-08-31 | 2021-08-31 | Action recognition method based on unsupervised graph sequence predictive coding and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113780129A true CN113780129A (en) | 2021-12-10 |
CN113780129B CN113780129B (en) | 2023-07-04 |
Family
ID=78840308
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111009498.4A Active CN113780129B (en) | 2021-08-31 | 2021-08-31 | Action recognition method based on unsupervised graph sequence predictive coding and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113780129B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115019397A (en) * | 2022-06-15 | 2022-09-06 | 北京大学深圳研究生院 | Comparison self-monitoring human behavior recognition method and system based on temporal-spatial information aggregation |
CN115035606A (en) * | 2022-08-11 | 2022-09-09 | 天津大学 | Bone action recognition method based on segment-driven contrast learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110059620A (en) * | 2019-04-17 | 2019-07-26 | 安徽艾睿思智能科技有限公司 | Bone Activity recognition method based on space-time attention |
CN111310707A (en) * | 2020-02-28 | 2020-06-19 | 山东大学 | Skeleton-based method and system for recognizing attention network actions |
CN111339942A (en) * | 2020-02-26 | 2020-06-26 | 山东大学 | Method and system for recognizing skeleton action of graph convolution circulation network based on viewpoint adjustment |
WO2021069945A1 (en) * | 2019-10-09 | 2021-04-15 | Toyota Motor Europe | Method for recognizing activities using separate spatial and temporal attention weights |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110059620A (en) * | 2019-04-17 | 2019-07-26 | 安徽艾睿思智能科技有限公司 | Skeleton action recognition method based on spatio-temporal attention |
WO2021069945A1 (en) * | 2019-10-09 | 2021-04-15 | Toyota Motor Europe | Method for recognizing activities using separate spatial and temporal attention weights |
CN111339942A (en) * | 2020-02-26 | 2020-06-26 | 山东大学 | Skeleton action recognition method and system based on a viewpoint-adjusted graph convolutional recurrent network |
CN111310707A (en) * | 2020-02-28 | 2020-06-19 | 山东大学 | Skeleton-based attention network action recognition method and system |
Non-Patent Citations (1)
Title |
---|
Guan Shanshan; Zhang Yinong: "3D Human Action Recognition Based on a Residual Spatio-Temporal Graph Convolutional Network", Computer Applications and Software, no. 03 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115019397A (en) * | 2022-06-15 | 2022-09-06 | 北京大学深圳研究生院 | Contrastive self-supervised human behavior recognition method and system based on spatio-temporal information aggregation |
CN115019397B (en) * | 2022-06-15 | 2024-04-19 | 北京大学深圳研究生院 | Contrastive self-supervised human behavior recognition method and system based on spatio-temporal information aggregation |
CN115035606A (en) * | 2022-08-11 | 2022-09-09 | 天津大学 | Skeleton action recognition method based on segment-driven contrastive learning |
CN115035606B (en) * | 2022-08-11 | 2022-10-21 | 天津大学 | Skeleton action recognition method based on segment-driven contrastive learning |
Also Published As
Publication number | Publication date |
---|---|
CN113780129B (en) | 2023-07-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110414432B (en) | Training method of object recognition model, object recognition method and corresponding device | |
Yang et al. | Fsa-net: Learning fine-grained structure aggregation for head pose estimation from a single image | |
CN107341452B (en) | Human behavior identification method based on quaternion space-time convolution neural network | |
CN111259786B (en) | Pedestrian re-identification method based on synchronous enhancement of appearance and motion information of video | |
CN108171209B (en) | Face age estimation method for metric learning based on convolutional neural network | |
CN106919903B (en) | Robust continuous emotion tracking method based on deep learning | |
Wang et al. | Deep learning algorithms with applications to video analytics for a smart city: A survey | |
Mukhopadhyay et al. | Facial emotion recognition based on textural pattern and convolutional neural network | |
CN113158815B (en) | Unsupervised pedestrian re-identification method, system and computer readable medium | |
CN111582210B (en) | Human body behavior recognition method based on quantum neural network | |
CN113780129B (en) | Action recognition method based on unsupervised graph sequence predictive coding and storage medium | |
CN112307995A (en) | Semi-supervised pedestrian re-identification method based on feature decoupling learning | |
CN115100709B (en) | Feature separation image face recognition and age estimation method | |
Cho et al. | Semantic segmentation with low light images by modified CycleGAN-based image enhancement | |
CN112070010B (en) | Pedestrian re-recognition method for enhancing local feature learning by combining multiple-loss dynamic training strategies | |
CN112651940A (en) | Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network | |
CN111723667A (en) | Human body joint point coordinate-based intelligent lamp pole crowd behavior identification method and device | |
Xu et al. | Task-aware meta-learning paradigm for universal structural damage segmentation using limited images | |
CN111242003B (en) | Video salient object detection method based on multi-scale constrained self-attention mechanism | |
CN111209886B (en) | Rapid pedestrian re-identification method based on deep neural network | |
CN110135253B (en) | Finger vein authentication method based on long-term recursive convolutional neural network | |
CN116758621A (en) | Self-attention-based deep convolutional facial expression recognition method under occlusion | |
Nimbarte et al. | Biased face patching approach for age invariant face recognition using convolutional neural network | |
CN112818887B (en) | Human skeleton sequence behavior identification method based on unsupervised learning | |
CN114821631A (en) | Pedestrian feature extraction method based on attention mechanism and multi-scale feature fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||