CN110728183B - Human body action recognition method of neural network based on attention mechanism - Google Patents

Human body action recognition method of neural network based on attention mechanism

Info

Publication number
CN110728183B
Authority
CN
China
Prior art keywords
network
sub
attention
deep
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910846654.9A
Other languages
Chinese (zh)
Other versions
CN110728183A (en)
Inventor
侯永宏
李岳阳
肖任意
李翔宇
郭子慧
刘艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201910846654.9A
Publication of CN110728183A
Application granted
Publication of CN110728183B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a human body action recognition method based on an attention-mechanism neural network, providing an end-to-end trainable network, comprising a deep convolution sub-network and an attention sub-network, that recognizes human actions from skeleton data. First, the skeleton sequence is encoded as a color space-time image and fed into the deep convolution sub-network, which extracts deep features that are then mapped into the label space through a fully connected layer. In the attention sub-network, hand-crafted features representing the importance of each joint are extracted and attention weights are learned through a simple but efficient linear mapping; the result is likewise mapped into the label space by a fully connected layer. The final recognition result is obtained by multiplicative fusion of the two outputs. The invention can automatically extract effective deep features from the data to the maximum extent. The network structure of the present invention comprises two sub-networks that are jointly trained in an end-to-end fashion without post-processing.

Description

Human body action recognition method of neural network based on attention mechanism
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a human body action recognition method of a neural network based on an attention mechanism.
Background
Human body action recognition has a very wide range of applications, such as human-computer interaction, video surveillance and video understanding. Current mainstream methods can be broadly divided into those based on RGB data, depth data and skeleton data. Skeleton data is a higher-level representation than RGB or depth data and is robust to changes in viewpoint, position and appearance, although it remains challenging because of the complex spatio-temporal variations of the skeleton joints. With the spread of affordable and efficient depth cameras such as the Microsoft Kinect and of real-time skeleton estimation algorithms, human action recognition based on 3D skeletons is attracting more and more attention.
Although traditional methods based on manually designed features can achieve good accuracy, designing such features demands rich experience and considerable skill, and hand-crafted features transfer poorly across datasets, so a better approach to recognizing human actions is needed. With the progress of deep learning, convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have achieved remarkable results in recent years in fields such as image classification, object detection and natural language processing. More recently, attention mechanisms have become popular because they can focus on the important regions of an input, thereby improving task performance.
Currently, deep-learning action recognition methods for skeleton data can be divided into two types according to how the skeleton sequence is represented and fed into the deep neural network: CNN-based methods and RNN-based methods.
The first approach encodes the skeleton sequence into texture images, which are then fed into CNNs for feature extraction and classification. For example, the joint coordinates of the skeleton sequence are encoded as a matrix and normalized with respect to the entire training dataset, with the three Cartesian components (x, y, z) of each skeleton joint mapped to the three channels (R, G, B) of a color image. However, this normalization does not guarantee scale invariance.
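To make the drawback concrete, the following NumPy sketch illustrates this prior-art encoding under stated assumptions (the function name, the use of per-axis dataset extrema and the 8-bit quantization are ours, not taken from any cited reference):

```python
# Illustrative sketch of the prior-art encoding: coordinates are normalized
# against extrema computed over the whole training set, so the three
# Cartesian components become the three channels of a color image.
import numpy as np

def encode_prior_art(seq, ds_min, ds_max):
    """seq: (T, N, 3) joint coordinates; ds_min, ds_max: per-axis extrema
    over the entire training dataset, each of shape (3,)."""
    norm = (seq - ds_min) / (ds_max - ds_min)     # dataset-level normalization
    return np.round(255 * norm).astype(np.uint8)  # (T, N, 3) RGB texture image
```

Because ds_min and ds_max come from the whole training set, two copies of the same motion performed at different scales map to different images, which is the invariance problem noted above.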
The second approach extracts features from each time step of the skeleton sequence and feeds the frame-based features into a recurrent neural network. Recent attention models have enhanced this approach by identifying the body parts or time steps that are most discriminative for the action classification task. However, the methods proposed so far tend to overemphasize temporal information and underestimate spatial information, and spatial attention is often ignored. Moreover, when recurrent networks such as LSTMs or GRUs are used to recognize skeleton sequences, they rely on a large amount of sequential computation, which limits the processing speed; in addition, introducing a recurrent network greatly increases the scale of the model, so training takes more time.
Beyond the above, deep-learning-based human action recognition depends heavily on the preprocessing of the skeleton sequence, and the spatio-temporal features produced by this step directly determine recognition quality, so how to extract good spatio-temporal features for efficiently recognizing complex actions remains an open problem.
Disclosure of Invention
The invention discloses a human body action recognition method based on an attention-mechanism neural network. The method adopts end-to-end supervised training and effectively improves the accuracy of human body action recognition by combining feature extraction with a convolutional neural network and an attention mechanism that captures the key skeleton joints.
The invention adopts the following technical scheme for solving the technical problems:
a human body action recognition method of a neural network based on an attention mechanism comprises the following steps:
1) Constructing a feature extraction and classification neural network, wherein the neural network comprises two sub-models, namely a deep convolution sub-network and an attention sub-network;
2) Constructing an end-to-end supervised training scheme, processing an original skeleton sequence, and encoding the skeleton sequence into a color space-time diagramThe three-dimensional matrix is input into a deep convolution sub-network to extract the characteristics of the three-dimensional matrix and output a vector P 1
3) In the attention sub-network, the hand-made characteristics representing the joint movement degree are extracted, the key nodes of the movement are captured, and a vector P is output 2
4) Finally P is arranged 1 And P 2 And (3) fusing, namely training the model by reducing the loss function through an optimization means until the network converges, and obtaining the final recognition accuracy.
Moreover, the deep convolution sub-network adopts a stacked convolutional neural network structure, and the attention sub-network adopts a combination of a custom layer and a fully connected layer.
Further, in step 2):

$$P_1 = W_1 \tilde{O} + b_1, \qquad \tilde{O} = \mathrm{GAP}\bigl(\mathrm{ReLU}(\mathrm{BN}(\mathrm{Conv}(O)))\bigr)$$

where $P_1$ is the deep spatio-temporal feature output by the deep convolution sub-network, representing the probability of the action belonging to each category in the label space; $W_1 \in \mathbb{R}^{M \times C}$ and $b_1 \in \mathbb{R}^{M \times 1}$ denote respectively the weight matrix and bias vector of the fully connected layer; $M$ is the number of label categories and $C$ is the output dimension of the deep convolution sub-network; $\tilde{O}$ is the spatio-temporal feature extracted by the deep convolution sub-network, i.e. the output of GAP in DenseNet-161; $O$ is the color image encoded from the skeleton sequence; GAP is the global average pooling layer, Conv a convolution layer, ReLU the activation function and BN the batch normalization layer.
Further, in step 3):

$$P_2 = W_2 V + b_2$$

where $P_2$ is the attention vector, and $W_2 \in \mathbb{R}^{M \times N}$ and $b_2 \in \mathbb{R}^{M \times 1}$ are respectively the weight matrix and bias vector of the fully connected layer;

$$V = V_X \odot V_Y \odot V_Z$$

where $\odot$ denotes element-wise multiplication and $V_X = [\sigma_{x_1}^2, \ldots, \sigma_{x_N}^2]^{\top}$ collects the per-joint variances $\sigma_{x_k}^2 = \frac{1}{T}\sum_{t=1}^{T}(x_{t,k} - \bar{x}_k)^2$, with $V_Y$ and $V_Z$ defined analogously; $\bar{x}_k$, $\bar{y}_k$, $\bar{z}_k$ denote the mean values of $x_k$, $y_k$, $z_k$, which are respectively the X, Y, Z coordinates of the $k$-th joint in the skeleton sequence, $x_k = [x_{1,k}, \ldots, x_{t,k}, \ldots, x_{T,k}]$, $y_k = [y_{1,k}, \ldots, y_{t,k}, \ldots, y_{T,k}]$, $z_k = [z_{1,k}, \ldots, z_{t,k}, \ldots, z_{T,k}]$, and $T$ is the number of frames of the skeleton sequence.
Furthermore, step 4) is specifically: the deep spatio-temporal feature $P_1$ and the attention vector $P_2$ obtained above are multiplied element-wise to give the final action classification result:

$$\hat{y} = \mathrm{softmax}\bigl(P_1 \odot P_2\bigr)$$

where $\hat{y}$ is the predicted result; a cross-entropy loss measures the difference between the true class label $y$ and the prediction $\hat{y}$.
The invention has the following advantages and beneficial effects:
1. The invention provides a human body action recognition method based on an attention-mechanism neural network. It is built on end-to-end supervised deep learning, requires no manual feature extraction during training, and can automatically extract effective deep features from the data to the maximum extent. The network structure comprises two sub-networks that are jointly trained in an end-to-end fashion without post-processing.
2. In the attention model, the variance feature of each joint is extracted through an efficient linear mapping and the attention weights are learned, so that the key joints for motion recognition in the skeleton data can be effectively captured, significantly improving recognition accuracy on different datasets.
3. During data processing, each skeleton sequence is converted into a space-time image without any normalization, which preserves the translation and scale invariance of the skeleton data during network training.
4. Results better than current mainstream attention models are obtained without introducing a recurrent neural network into the attention sub-network, which avoids the weakness of recurrent networks at extracting spatial information, reduces the computational load of the network and speeds up training.
Drawings
FIG. 1 is a diagram showing a network structure of a human motion recognition method of an attention mechanism according to an embodiment of the present invention;
FIG. 2 is a diagram of a preprocessing procedure for a skeleton sequence;
FIG. 3 is a graph comparing performance of different neural networks on four data sets;
wherein (a) is an NTU-CS dataset; (b) is an NTU-CV dataset; (c) is a SYSU-3D dataset; (d) is a UTD-MHAD dataset.
Detailed Description
The invention will now be described in further detail by way of specific examples, which are given by way of illustration only and not by way of limitation, with reference to the accompanying drawings.
The invention discloses a human body action recognition method based on an attention-mechanism neural network. The method adopts end-to-end supervised training and effectively improves the accuracy of human body action recognition by combining feature extraction with a convolutional neural network and an attention mechanism that captures the key skeleton joints.
The deep convolution sub-network and the attention sub-network are constructed; based on a stacked convolutional neural network design, the model comprises convolution layers, normalization layers, fully connected layers and so on.
Fig. 1 shows the network structure of the attention-mechanism human action recognition method of the present invention.
The action recognition network comprises two sub-networks: a deep convolution sub-network and an attention sub-network.
The deep convolutional sub-network adopts DenseNet-161 as the main body, while the front-end coding network adopts a stacked convolutional network comprising 4 blocks, each consisting of a convolution layer, a batch normalization layer and a ReLU layer. Between blocks there is a 2 x 2 transition layer for pooling and downsampling the feature maps. Finally, a Global Average Pooling layer pools the feature maps globally, and the result is output through a softmax layer.
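For concreteness, the following Keras sketch outlines a sub-network of this shape. It is only an illustration: Keras ships no DenseNet-161, so a plain stacked CNN stands in for the DenseNet-161 main body, and the filter counts are assumptions rather than the patent's configuration.

```python
# A minimal Keras sketch of the deep convolutional sub-network: stacked
# front-end coding blocks (Conv + BN + ReLU) with 2x2 pooling transitions,
# a GAP layer, and a fully connected softmax output over the action classes.
from tensorflow import keras
from tensorflow.keras import layers

def build_deep_conv_subnetwork(T, N, num_classes):
    inp = keras.Input(shape=(T, N, 3))               # encoded color space-time image O
    x = inp
    for filters in (64, 128, 256, 512):              # 4 coding blocks: Conv + BN + ReLU
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
        x = layers.MaxPooling2D(2, padding="same")(x)  # 2 x 2 transition layer
    x = layers.GlobalAveragePooling2D()(x)           # GAP -> deep feature
    out = layers.Dense(num_classes, activation="softmax")(x)  # FC + softmax
    return keras.Model(inp, out, name="deep_conv_subnetwork")
```

For example, build_deep_conv_subnetwork(T=64, N=25, num_classes=60) would give a model matching the NTU-RGB+D joint count and class count.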
The attention sub-network consists of 3 variance-calculation layers, 1 fusion layer and 1 fully connected layer. For each skeleton joint in the input three-dimensional matrix, the key joints are captured by computing the variance of that joint during the motion. The three variance values of the x, y and z coordinates are fused by multiplication so that all three coordinates are taken into account, and the result is output through a fully connected layer whose number of units equals the number of action classes in the dataset.
The datasets used in the invention are the NTU-RGB+D, SYSU-3D and UTD-MHAD datasets. The NTU-RGB+D dataset, captured by Nanyang Technological University, is the most authoritative dataset in the field of human action recognition; it contains 60 common human actions, including 10 two-person interactions, and has two evaluation protocols: cross-subject and cross-view. The SYSU-3D dataset, captured by Sun Yat-sen University, contains 12 action classes; although relatively small, it is difficult because of the high similarity between actions. The UTD-MHAD dataset contains 861 sequences; it is a medium-scale dataset with actions similar to SYSU-3D. All evaluations follow the protocols in the papers that introduced these datasets.
The frame count T of every skeleton sequence is normalized, with different datasets normalized to different frame counts, so that every skeleton sequence within the same dataset has the same number of frames; the target is generally chosen as the average frame count of most sequences in the dataset.
The normalized skeleton sequence $A_1, \ldots, A_T$ is input and transformed into a T x N x 3 tensor, where T is the number of frames, N is the number of skeleton joints per frame and 3 is the number of channels. Each row holds the coordinates of the different skeleton joints of one frame, and each column holds the coordinates of the same skeleton joint across frames.
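A hedged sketch of this preprocessing follows, with illustrative names; the linear resampling strategy is one plausible way to normalize the frame count, which the patent does not specify:

```python
# Resample every sequence to the dataset's common frame count T and stack
# the frames into a T x N x 3 array (rows = frames, columns = joints,
# channels = x/y/z coordinates).
import numpy as np

def preprocess_sequence(frames, T):
    """frames: list of (N, 3) arrays of joint coordinates, one per frame."""
    seq = np.asarray(frames, dtype=np.float32)          # (T0, N, 3)
    idx = np.linspace(0, len(seq) - 1, T).round().astype(int)
    return seq[idx]                                     # (T, N, 3)
```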
The preprocessed T x N x 3 tensor is input into the deep convolution sub-network built around DenseNet-161 for feature extraction and mapping, outputting a vector $\tilde{O}$. The specific process is as follows:

$$\tilde{O} = \mathrm{GAP}\bigl(\mathrm{ReLU}(\mathrm{BN}(\mathrm{Conv}(O)))\bigr)$$

where $O$ is the color image encoded from the skeleton sequence, GAP is the global average pooling layer ($\tilde{O}$ being the output of GAP in DenseNet-161), Conv is a convolution layer, ReLU is the activation function and BN is the batch normalization layer.
The resulting $\tilde{O}$, i.e. the spatio-temporal feature extracted by the deep convolution sub-network, is input into the fully connected layer and mapped into the label space as follows:

$$P_1 = W_1 \tilde{O} + b_1$$

where $W_1 \in \mathbb{R}^{M \times C}$ and $b_1 \in \mathbb{R}^{M \times 1}$ denote respectively the weight matrix and bias vector of the fully connected layer, and $M$ is the number of label categories. $P_1$, the deep spatio-temporal feature output by the deep convolution sub-network, represents the probability of the action belonging to each category in the label space.
The movement of the joints is represented in the attention sub-network using hand-crafted variance features. The input $O \in \mathbb{R}^{T \times N \times 3}$ is split into three matrices: $X \in \mathbb{R}^{T \times N}$, $Y \in \mathbb{R}^{T \times N}$ and $Z \in \mathbb{R}^{T \times N}$. To describe the variance feature in detail, take X as the example; let $X \in \mathbb{R}^{T \times N}$ be:

$$X = [x_1, \ldots, x_k, \ldots, x_N]$$

where $x_k$, the X coordinate of the $k$-th joint over the skeleton sequence, can be expressed as:

$$x_k = [x_{1,k}, \ldots, x_{t,k}, \ldots, x_{T,k}]$$
The variance $\sigma_{x_k}^2$ of $x_k$ is calculated as:

$$\sigma_{x_k}^2 = \frac{1}{T} \sum_{t=1}^{T} \bigl(x_{t,k} - \bar{x}_k\bigr)^2$$

where $\bar{x}_k$ is the mean value of $x_k$. The output $V_X \in \mathbb{R}^{N \times 1}$ shown in Fig. 1 can then be expressed as:

$$V_X = \bigl[\sigma_{x_1}^2, \ldots, \sigma_{x_k}^2, \ldots, \sigma_{x_N}^2\bigr]^{\top}$$

$V_Y \in \mathbb{R}^{N \times 1}$ and $V_Z \in \mathbb{R}^{N \times 1}$ are calculated in the same way from $y_k = [y_{1,k}, \ldots, y_{t,k}, \ldots, y_{T,k}]$ and $z_k = [z_{1,k}, \ldots, z_{t,k}, \ldots, z_{T,k}]$, the Y and Z coordinates of the $k$-th joint in the skeleton sequence; $T$ is the number of frames of the skeleton sequence.
The final variance feature $V \in \mathbb{R}^{N \times 1}$ is obtained as:

$$V = V_X \odot V_Y \odot V_Z$$

where $\odot$ is element-wise multiplication. The variance feature measures the motion amplitude and importance of each joint, capturing the key joints for identifying the action and thereby improving recognition accuracy. The fully connected layer then learns the attention weights $P_2 \in \mathbb{R}^{M \times 1}$ from the variance feature $V$:

$$P_2 = W_2 V + b_2$$

where $W_2 \in \mathbb{R}^{M \times N}$ and $b_2 \in \mathbb{R}^{M \times 1}$ are respectively the weight matrix and bias vector of the fully connected layer; $W_2$ is updated automatically during training.
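The attention branch is simple enough to sketch end to end in NumPy. The sketch below computes the per-joint variances, fuses them element-wise and applies the fully connected mapping; W2 and b2 are the trainable parameters of the real network, passed in here purely for illustration:

```python
# Per-joint variances over time for each axis, fused by element-wise
# multiplication into V, then a learned linear map to the label space.
import numpy as np

def attention_branch(O, W2, b2):
    """O: (T, N, 3) skeleton tensor; W2: (M, N) weights; b2: (M, 1) bias."""
    X, Y, Z = O[..., 0], O[..., 1], O[..., 2]   # each (T, N)
    V_X = X.var(axis=0)                         # variance of each joint's x over T frames, (N,)
    V_Y = Y.var(axis=0)
    V_Z = Z.var(axis=0)
    V = V_X * V_Y * V_Z                         # element-wise fusion, (N,)
    return W2 @ V[:, None] + b2                 # attention vector P2, (M, 1)
```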
The deep spatio-temporal feature $P_1$ and the attention vector $P_2$ obtained above are multiplied element-wise to give the final action classification result:

$$\hat{y} = \mathrm{softmax}\bigl(P_1 \odot P_2\bigr)$$

where $\hat{y}$ is the final predicted result. A cross-entropy loss function

$$L = -\sum_{i=1}^{M} y_i \log \hat{y}_i$$

measures the difference between the true class label $y$ and the prediction $\hat{y}$.
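The fusion and loss can be sketched as follows, under the assumption that the fused product passes through a softmax before the cross-entropy is applied (consistent with the softmax output layer described earlier); all names are illustrative:

```python
# Element-wise fusion of the two sub-network outputs, softmax to obtain
# class probabilities, and cross-entropy against the one-hot label.
import numpy as np

def fuse_and_loss(P1, P2, y):
    """P1, P2: (M,) sub-network outputs; y: (M,) one-hot true label."""
    logits = P1 * P2                              # element-wise fusion P1 (.) P2
    e = np.exp(logits - logits.max())
    y_hat = e / e.sum()                           # softmax -> predicted distribution
    loss = -np.sum(y * np.log(y_hat + 1e-12))     # cross-entropy between y and y_hat
    return y_hat, loss
```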
The invention adopts the Keras deep learning framework for the experiments; the specific parameters are shown in the table below:
after model training to convergence, evaluations were made on the NTU-RGB+D dataset, SYSU-3D dataset, UTD-MHAD dataset. The evaluation index is shown in the following table. Among them, MANs, VA-LSTM, etc. belong to other methods, ours (only DCM) to our method, but without attention to the subnetwork, our (dcm+sam) belongs to the complete method described above.
The above describes only preferred embodiments of the present invention, but the scope of protection is not limited thereto; any equivalent substitution or modification of the technical solution and inventive concept made by a person skilled in the art within the scope of this disclosure falls within the scope of protection of the present invention.

Claims (1)

1. A human body action recognition method of a neural network based on an attention mechanism, characterized by comprising the following steps:
1) constructing a feature extraction and classification neural network comprising two sub-models: a deep convolution sub-network and an attention sub-network;
2) constructing an end-to-end supervised training scheme: processing the original skeleton sequence, encoding it into a three-dimensional matrix forming a color space-time image, inputting the three-dimensional matrix into the deep convolution sub-network for feature extraction, and outputting a vector $P_1$;
3) in the attention sub-network, extracting hand-crafted features representing the degree of joint movement to capture the key joints of the motion, and outputting a vector $P_2$;
4) finally fusing $P_1$ and $P_2$, and training the model by minimizing the loss function through an optimizer until the network converges, obtaining the final recognition result;
the deep convolution sub-network adopts a structure of a laminated convolution neural network, and the attention sub-network adopts a combination of a custom layer and a full connection layer;
in step 2):

$$P_1 = W_1 \tilde{O} + b_1, \qquad \tilde{O} = \mathrm{GAP}\bigl(\mathrm{ReLU}(\mathrm{BN}(\mathrm{Conv}(O)))\bigr)$$

where $P_1$ is the deep spatio-temporal feature output by the deep convolution sub-network, representing the probability of the action belonging to each category in the label space; $W_1 \in \mathbb{R}^{M \times C}$ and $b_1 \in \mathbb{R}^{M \times 1}$ denote respectively the weight matrix and bias vector of the fully connected layer; $M$ is the number of label categories and $C$ is the output dimension of the deep convolution sub-network; $\tilde{O}$ is the spatio-temporal feature extracted by the deep convolution sub-network, i.e. the output of GAP in DenseNet-161; $O$ is the color image encoded from the skeleton sequence; GAP is the global average pooling layer, Conv a convolution layer, ReLU the activation function and BN the batch normalization layer;
in step 3):

$$P_2 = W_2 V + b_2$$

where $P_2$ is the attention vector, and $W_2 \in \mathbb{R}^{M \times N}$ and $b_2 \in \mathbb{R}^{M \times 1}$ are respectively the weight matrix and bias vector of the fully connected layer;

$$V = V_X \odot V_Y \odot V_Z$$

where $\odot$ denotes element-wise multiplication and $V_X = [\sigma_{x_1}^2, \ldots, \sigma_{x_N}^2]^{\top}$ collects the per-joint variances $\sigma_{x_k}^2 = \frac{1}{T}\sum_{t=1}^{T}(x_{t,k} - \bar{x}_k)^2$, with $V_Y$ and $V_Z$ defined analogously; $\bar{x}_k$, $\bar{y}_k$, $\bar{z}_k$ denote the mean values of $x_k$, $y_k$, $z_k$, which are respectively the X, Y, Z coordinates of the $k$-th joint in the skeleton sequence, $x_k = [x_{1,k}, \ldots, x_{t,k}, \ldots, x_{T,k}]$, $y_k = [y_{1,k}, \ldots, y_{t,k}, \ldots, y_{T,k}]$, $z_k = [z_{1,k}, \ldots, z_{t,k}, \ldots, z_{T,k}]$, and $T$ is the number of frames of the skeleton sequence;
step 4) is specifically: the deep spatio-temporal feature $P_1$ and the attention vector $P_2$ obtained above are multiplied element-wise to give the final action classification result:

$$\hat{y} = \mathrm{softmax}\bigl(P_1 \odot P_2\bigr)$$

where $\hat{y}$ is the predicted result, and a cross-entropy loss measures the difference between the true class label $y$ and the prediction $\hat{y}$.
CN201910846654.9A 2019-09-09 2019-09-09 Human body action recognition method of neural network based on attention mechanism Active CN110728183B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910846654.9A | 2019-09-09 | 2019-09-09 | Human body action recognition method of neural network based on attention mechanism

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910846654.9A | 2019-09-09 | 2019-09-09 | Human body action recognition method of neural network based on attention mechanism

Publications (2)

Publication Number Publication Date
CN110728183A CN110728183A (en) 2020-01-24
CN110728183B (en) 2023-09-22

Family

ID=69217957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910846654.9A Active CN110728183B (en) 2019-09-09 2019-09-09 Human body action recognition method of neural network based on attention mechanism

Country Status (1)

Country Link
CN (1) CN110728183B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339942B (en) * 2020-02-26 2022-07-12 山东大学 Method and system for recognizing skeleton action of graph convolution circulation network based on viewpoint adjustment
CN111046980B (en) * 2020-03-16 2020-06-30 腾讯科技(深圳)有限公司 Image detection method, device, equipment and computer readable storage medium
CN111582382B (en) * 2020-05-09 2023-10-31 Oppo广东移动通信有限公司 State identification method and device and electronic equipment
CN111967379B (en) * 2020-08-14 2022-04-08 西北工业大学 Human behavior recognition method based on RGB video and skeleton sequence
CN112130200B (en) * 2020-09-23 2021-07-20 电子科技大学 Fault identification method based on grad-CAM attention guidance
CN112613405B (en) * 2020-12-23 2022-03-25 电子科技大学 Method for recognizing actions at any visual angle
CN112560778B (en) * 2020-12-25 2022-05-27 万里云医疗信息科技(北京)有限公司 DR image body part identification method, device, equipment and readable storage medium
CN112783327B (en) * 2021-01-29 2022-08-30 中国科学院计算技术研究所 Method and system for gesture recognition based on surface electromyogram signals
CN113516242B (en) * 2021-08-10 2024-05-14 中国科学院空天信息创新研究院 Self-attention mechanism-based through-wall radar human body action recognition method
CN114613011A (en) * 2022-03-17 2022-06-10 东华大学 Human body 3D (three-dimensional) bone behavior identification method based on graph attention convolutional neural network

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203503A (en) * 2016-07-08 2016-12-07 天津大学 A kind of action identification method based on skeleton sequence
CN106228109A (en) * 2016-07-08 2016-12-14 天津大学 A kind of action identification method based on skeleton motion track
WO2017133009A1 (en) * 2016-02-04 2017-08-10 广州新节奏智能科技有限公司 Method for positioning human joint using depth image of convolutional neural network
CN107924472A (en) * 2015-06-03 2018-04-17 英乐爱有限公司 Pass through the image classification of brain computer interface
US10089556B1 (en) * 2017-06-12 2018-10-02 Konica Minolta Laboratory U.S.A., Inc. Self-attention deep neural network for action recognition in surveillance videos
CN108830157A (en) * 2018-05-15 2018-11-16 华北电力大学(保定) Human bodys' response method based on attention mechanism and 3D convolutional neural networks
CN108875708A (en) * 2018-07-18 2018-11-23 广东工业大学 Behavior analysis method, device, equipment, system and storage medium based on video
CN109614874A (en) * 2018-11-16 2019-04-12 深圳市感动智能科技有限公司 A kind of Human bodys' response method and system based on attention perception and tree-like skeleton point structure
CN109858406A (en) * 2019-01-17 2019-06-07 西北大学 A kind of extraction method of key frame based on artis information
CN110084228A (en) * 2019-06-25 2019-08-02 江苏德劭信息科技有限公司 A kind of hazardous act automatic identifying method based on double-current convolutional neural networks

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9830709B2 (en) * 2016-03-11 2017-11-28 Qualcomm Incorporated Video analysis with convolutional attention recurrent neural networks
US10387776B2 (en) * 2017-03-10 2019-08-20 Adobe Inc. Recurrent neural network architectures which provide text describing images

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107924472A (en) * 2015-06-03 2018-04-17 英乐爱有限公司 Pass through the image classification of brain computer interface
WO2017133009A1 (en) * 2016-02-04 2017-08-10 广州新节奏智能科技有限公司 Method for positioning human joint using depth image of convolutional neural network
CN106203503A (en) * 2016-07-08 2016-12-07 天津大学 A kind of action identification method based on skeleton sequence
CN106228109A (en) * 2016-07-08 2016-12-14 天津大学 A kind of action identification method based on skeleton motion track
US10089556B1 (en) * 2017-06-12 2018-10-02 Konica Minolta Laboratory U.S.A., Inc. Self-attention deep neural network for action recognition in surveillance videos
CN108830157A (en) * 2018-05-15 2018-11-16 华北电力大学(保定) Human bodys' response method based on attention mechanism and 3D convolutional neural networks
CN108875708A (en) * 2018-07-18 2018-11-23 广东工业大学 Behavior analysis method, device, equipment, system and storage medium based on video
CN109614874A (en) * 2018-11-16 2019-04-12 深圳市感动智能科技有限公司 A kind of Human bodys' response method and system based on attention perception and tree-like skeleton point structure
CN109858406A (en) * 2019-01-17 2019-06-07 西北大学 A kind of extraction method of key frame based on artis information
CN110084228A (en) * 2019-06-25 2019-08-02 江苏德劭信息科技有限公司 A kind of hazardous act automatic identifying method based on double-current convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
丰艳 et al. "View-independent skeleton action recognition based on a spatio-temporal attention deep network." Journal of Computer-Aided Design & Computer Graphics (计算机辅助设计与图形学学报), Vol. 30, 2018, pp. 2271-2277. *

Also Published As

Publication number Publication date
CN110728183A (en) 2020-01-24

Similar Documents

Publication Publication Date Title
CN110728183B (en) Human body action recognition method of neural network based on attention mechanism
CN108520535B (en) Object classification method based on depth recovery information
CN110135319B (en) Abnormal behavior detection method and system
CN110135375B (en) Multi-person attitude estimation method based on global information integration
CN106897670B (en) Express violence sorting identification method based on computer vision
CN107808131B (en) Dynamic gesture recognition method based on dual-channel deep convolutional neural network
CN108460356B (en) Face image automatic processing system based on monitoring system
CN104008370B (en) A kind of video face identification method
CN110580472B (en) Video foreground detection method based on full convolution network and conditional countermeasure network
CN110516533B (en) Pedestrian re-identification method based on depth measurement
CN110390308B (en) Video behavior identification method based on space-time confrontation generation network
CN113963032A (en) Twin network structure target tracking method fusing target re-identification
CN111695523B (en) Double-flow convolutional neural network action recognition method based on skeleton space-time and dynamic information
CN112801015A (en) Multi-mode face recognition method based on attention mechanism
CN111723687A (en) Human body action recognition method and device based on neural network
CN114821764A (en) Gesture image recognition method and system based on KCF tracking detection
Wang et al. Video background/foreground separation model based on non-convex rank approximation RPCA and superpixel motion detection
Shariff et al. Artificial (or) fake human face generator using generative adversarial network (gan) machine learning model
CN103235943A (en) Principal component analysis-based (PCA-based) three-dimensional (3D) face recognition system
CN110348395B (en) Skeleton behavior identification method based on space-time relationship
CN112487926A (en) Scenic spot feeding behavior identification method based on space-time diagram convolutional network
CN116912804A (en) Efficient anchor-frame-free 3-D target detection and tracking method and model
CN116453025A (en) Volleyball match group behavior identification method integrating space-time information in frame-missing environment
CN113192186B (en) 3D human body posture estimation model establishing method based on single-frame image and application thereof
CN113255514B (en) Behavior identification method based on local scene perception graph convolutional network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant