CN111860188A - Human body posture recognition method based on time and channel double attention - Google Patents

Human body posture recognition method based on time and channel double attention

Info

Publication number
CN111860188A
Authority
CN
China
Prior art keywords
data
attention
channel
convolution
human body
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010588253.0A
Other languages
Chinese (zh)
Inventor
张雷
高文彬
刘悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Normal University
Original Assignee
Nanjing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Normal University
Priority to CN202010588253.0A
Publication of CN111860188A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Psychiatry (AREA)
  • Image Analysis (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a human body posture recognition method based on time and channel double attention, comprising the following steps: collecting raw data of various human body actions with the built-in sensors of a mobile device and attaching an action attribute label to the data; applying sliding-window segmentation and normalization, and dividing the data into a training sample set and a test sample set; establishing a deep convolutional neural network model based on time and channel double attention; and importing the training and test samples to train and tune the model, thereby obtaining the recognition result for the human body actions. Because channel attention and time-sequence attention are stacked, the method, once trained on a large amount of coarse-grained training data, can accurately locate both the type and the time of occurrence of the target action, which greatly reduces the effort of manually labeling training data. The method is of considerable value in sports, interactive games, medical care, general monitoring systems and similar applications.

Description

Human body posture recognition method based on time and channel double attention
Technical Field
The invention belongs to the field of intelligent monitoring with wearable devices, and in particular relates to a human body posture recognition method based on time and channel double attention.
Background
In recent years, with the development of computer technology and the spread of intelligent technology, a new round of global technological change has begun, and technologies such as large-scale cloud computing, the Internet of Things, big data and artificial intelligence have developed rapidly. Human body posture recognition is an important research direction in computer vision and related fields. Its range of application is very wide: it can be used in health monitoring, motion detection, human-computer interaction, film and television production, gaming and entertainment, and other fields. Sensors worn on the body can collect motion-trajectory data of the body's joint points for posture recognition, and the same data can drive 3D animation that simulates human motion for film and television production.
With progress in research on smart wearable devices, human body posture recognition based on wearable sensors has become an important research field. The technology determines the state of a person's motion by analyzing information that reflects that motion. It is applied to health monitoring, indoor positioning and navigation, analysis of users' social behavior, motion-sensing games and the like. However, most existing human body posture recognition systems suffer from low recognition accuracy and slow inference, so how to build a high-accuracy network model while maintaining inference speed is a problem that urgently needs solving.
At present the most widespread application of human body posture recognition is intelligent monitoring. Intelligent monitoring differs from ordinary monitoring mainly in that posture recognition is embedded in the video server: an algorithm recognizes and judges the behavior of moving objects (pedestrians and vehicles) in the monitored scene, extracts the key information, and sends an alarm to the user promptly when abnormal behavior occurs. Similarly, posture recognition in a fixed scene can be applied to home monitoring. For example, to guard against falls by elderly people living alone, intelligent monitoring equipment that recognizes falling postures can be installed at home so that a fall is detected and responded to promptly. With the continuous development of society and the continuous improvement in quality of life, video monitoring has come to be used very widely. People's living space keeps expanding, public and private places develop with it, and the probability of encountering emergencies keeps rising, especially in public places, which are densely populated and hard to monitor. Simple monitoring can no longer meet the needs of current social development: relying solely on an operator on duty makes it very difficult to anticipate human postures, and it also wastes social resources.
Therefore, an intelligent monitoring system that does not depend on individual operators is a necessary way to solve this problem. In social interaction, body movements other than speech also convey information; with scientific and reasonable prediction, the meaning of these movements can be interpreted, which helps people interact better.
Deep learning has good prospects in pattern recognition, where model architectures represented by shallow convolutional neural networks hold the mainstream position. Convolutional neural networks have received great attention in computer vision; they can process multidimensional data and, given a large amount of data, outperform traditional methods by a clear margin. Compared with traditional machine learning methods such as logistic regression, decision trees and Markov models, shallow deep learning methods improve accuracy significantly, but because they have few convolutional layers their feature maps carry limited information. Moreover, ordinary convolution cannot accurately locate the time of occurrence and the type of a target action within a long sequence of data. How to accurately locate the time and category of the target action while preserving recognition accuracy has therefore become a primary problem for current researchers.
Disclosure of Invention
Purpose of the invention: in view of the above problems, the object of the present invention is to provide a human body posture recognition method based on time and channel double attention which not only improves model accuracy in a deeper convolutional neural network but also accurately locates, on the channel axis, the type of sensor that is active and, on the time axis, the time at which the target action occurs.
Technical scheme: the invention provides a human body posture recognition method based on time and channel double attention, comprising the following steps:
Step1, acquiring human body posture action signal data of each activity type (such as lying down, standing up, walking, running and falling) through a motion sensor, and attaching the corresponding action attribute label to the action signal data;
Step2, preprocessing the collected action signal data and dividing the processed data into a training sample set and a test sample set; the processing comprises: applying a sliding window with a fixed step length to the data (reducing the step length yields more samples), then denoising the resulting data signals, removing null values and normalizing them, so that the signals are scaled proportionally into the (0, 1) interval;
Step3, feeding the processed data as input samples into a multi-attention deep convolutional neural network for training: setting the learning rate, the optimizer and a fixed batch size, then using gradient descent to reduce the loss value of the deep convolutional neural network model while updating each weight parameter, until the loss value is smaller than a preset value, to obtain the trained model;
Step4, classifying and recognizing the human body posture data to be recognized with the trained model.
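The preprocessing in Step2 can be illustrated with a minimal sketch. It assumes min-max scaling as the normalization (the text only says the signal is scaled proportionally into the 0 to 1 range; min-max scaling reaches the endpoints 0 and 1 themselves) and uses hypothetical names `min_max_normalize` and `sliding_windows` and a toy signal. It also shows why a smaller step length yields more samples.

```python
def min_max_normalize(values):
    """Scale a list of numbers into the range 0..1 (assumed min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant signal: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def sliding_windows(signal, window, step):
    """Cut a long 1-D signal into fixed-size windows using a fixed step length."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]


# Toy single-axis signal (illustrative values only).
signal = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]
normalized = min_max_normalize(signal)

# The same data cut with two different step lengths.
windows_step2 = sliding_windows(normalized, window=4, step=2)
windows_step1 = sliding_windows(normalized, window=4, step=1)
print(len(windows_step2), len(windows_step1))
```

Halving the step length increases the overlap between neighboring windows, so the same recording produces more (correlated) training samples.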
Further, in Step1, the sensor down-sampling frequency is set to 20 Hz to 40 Hz when the sensor data signals are collected.
Further, Step2 includes removing outliers and null values from the data and rebalancing the number of samples in each activity category so that the data set is uniformly distributed across categories; 70% of the data is used as training samples and 30% as test samples.
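One possible way to realize the class rebalancing and the 70%/30% split is sketched below. It assumes that rebalancing means down-sampling every category to the size of the smallest one, which the text does not specify; the function name `balance_and_split`, the seed and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def balance_and_split(samples, labels, train_fraction=0.7, seed=0):
    """Equalize per-class counts (assumed: down-sample to the smallest class),
    shuffle, and split into training and test sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    n_per_class = min(len(xs) for xs in by_class.values())
    pairs = []
    for y, xs in by_class.items():
        rng.shuffle(xs)
        pairs.extend((x, y) for x in xs[:n_per_class])

    rng.shuffle(pairs)
    cut = round(train_fraction * len(pairs))
    return pairs[:cut], pairs[cut:]


# Toy data: 42 window ids with unequal class sizes.
samples = list(range(42))
labels = ["walk"] * 20 + ["sit"] * 12 + ["jump"] * 10
train, test = balance_and_split(samples, labels)
print(len(train), len(test))
```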
Further, Step3 specifically includes the following:
3.1, establishing a 6-layer convolutional attention neural network model. The model as a whole comprises three convolution blocks, each containing two convolutional feature-extraction layers and one skip (jump) convolution layer;
A: constructing a convolutional neural network with 6 convolutional layers:
The convolutional neural network is constructed from 6 convolutional layers, with every two convolutional layers forming one convolution block. A skip convolution layer is added across the two convolutional layers of each block so that the input data, of dimension $\mathbb{R}^{C \times H \times W}$ (C is the number of input data channels, H is the data height, W is the data width), and the output data, of dimension $\mathbb{R}^{C' \times H \times W}$ (C' is the number of output data channels, H is the data height, W is the data width), can be linearly weighted together. The mapping C → C' is determined by the channel dimension of the convolutional layers. After the 6 convolutional layers are built, the output data is sent to the attention network to be linearly weighted by the attention weights, and is finally sent to a fully connected layer for human body action classification.
B: establishing the channel attention network:
A channel attention network and a time-sequence attention network are added in sequence after the two convolutional layers of each of the model's 3 blocks. To determine how important each sensor feature of the convolutional features is along the channel dimension, the channel attention network is applied to the channel dimension of the feature map;
The time-dimension information of the features is first compressed using average pooling and max pooling to generate an average timing feature (shown in Figure BDA0002554574220000031) and a maximum timing feature (shown in Figure BDA0002554574220000032). The two features are then each sent through a multilayer perceptron and the results are summed element-wise to obtain the final output feature, expressed as:
$M_C(F) = \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F)))$
where σ denotes the sigmoid activation function, MLP denotes a standard multilayer perceptron, and AvgPool and MaxPool denote the average pooling and max pooling operations, respectively;
The features generated by the attention network are applied to the channel dimension of the sensor data, which yields a corresponding weight for each channel, so that the sensor axis corresponding to the human body posture action can be accurately located.
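The channel attention formula M_C(F) above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the patented implementation: the feature map is taken to be a 2-D array of shape (C, T), the same two-layer perceptron with a ReLU hidden layer is shared by both pooled vectors, and the hidden size, random weights and names (`channel_attention`, `W1`, `W2`) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(F, W1, W2):
    """F: feature map of shape (C, T). Returns one weight in (0, 1) per channel."""
    avg = F.mean(axis=1)  # average pooling over time -> shape (C,)
    mx = F.max(axis=1)    # max pooling over time     -> shape (C,)

    def mlp(v):
        # Assumed shared two-layer perceptron with a ReLU hidden layer.
        return W2 @ np.maximum(W1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))


rng = np.random.default_rng(0)
C, T, hidden = 8, 32, 2
F = rng.standard_normal((C, T))
W1 = rng.standard_normal((hidden, C)) * 0.1
W2 = rng.standard_normal((C, hidden)) * 0.1

M_c = channel_attention(F, W1, W2)
F_weighted = F * M_c[:, None]  # re-weight every channel of the feature map
```

Each entry of `M_c` scales one channel, so channels that matter more for the current action keep more of their signal after weighting.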
C: establishing the time-sequence attention network:
After the channel attention network has been added to each convolution block, a time-sequence attention network is established in turn. To determine the exact position of the target action in the sensor data on the time axis, time-sequence attention is applied to the time dimension of the feature map. The channel-dimension information is first compressed using average pooling and max pooling to generate an average channel feature (shown in Figure BDA0002554574220000041) and a maximum channel feature (shown in Figure BDA0002554574220000042), where Figure BDA0002554574220000043 denotes the channel-wise average-pooled feature along the time axis and Figure BDA0002554574220000044 denotes the channel-wise max-pooled feature along the time axis; H and W are the height and width of these features, respectively, and each of these features has 1 channel. The two features are then concatenated and fed into a convolutional layer to obtain the final convolutional feature, expressed as:
$M_T(F) = \sigma(\mathrm{conv}^{7 \times 1}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)]))$
where σ denotes the sigmoid activation function, $\mathrm{conv}^{7 \times 1}$ denotes a convolutional layer with a 7 × 1 convolution kernel, and AvgPool and MaxPool denote the average pooling and max pooling operations, respectively;
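The time-sequence attention formula M_T(F) above can be sketched in the same way. The sketch assumes a feature map of shape (C, T), a single 7-tap kernel spanning the two pooled maps, and zero padding so that the output keeps length T; the padding choice, the random weights and the name `temporal_attention` are assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def temporal_attention(F, kernel):
    """F: feature map of shape (C, T); kernel: shape (2, 7).
    Returns one weight in (0, 1) per time step."""
    avg = F.mean(axis=0)           # average pooling over channels -> (T,)
    mx = F.max(axis=0)             # max pooling over channels     -> (T,)
    stacked = np.stack([avg, mx])  # [AvgPool(F); MaxPool(F)]      -> (2, T)

    k = kernel.shape[1]
    pad = k // 2                   # assumed zero padding keeps length T
    padded = np.pad(stacked, ((0, 0), (pad, pad)))
    T = F.shape[1]
    out = np.array([np.sum(padded[:, t:t + k] * kernel) for t in range(T)])
    return sigmoid(out)


rng = np.random.default_rng(0)
C, T = 8, 32
F = rng.standard_normal((C, T))
kernel = rng.standard_normal((2, 7)) * 0.1

M_t = temporal_attention(F, kernel)
F_weighted = F * M_t[None, :]  # re-weight every time step of the feature map
```

Time steps with larger weights are those the network treats as belonging to the target action, which is what makes localization along the time axis possible.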
3.2, importing the training samples to adjust the parameters of the convolutional neural network model and obtain a model with high accuracy:
In the convolutional neural network model, the first-layer convolution kernel size is (6, 1) with stride (2, 1); the second-layer convolution kernel size is (6, 1) with stride (2, 1); the third-layer convolution kernel size is (6, 1) with stride (2, 1). The convolutional layer padding is set to (1, 0), all activation functions are ReLU, and a BatchNorm layer is added after each layer to reduce the risk of overfitting. Finally, to obtain a more clearly separated classification result, the classes are output through a Softmax layer.
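To see how these settings shrink the time axis, the standard convolution output-length formula, floor((L + 2p - k) / s) + 1, can be applied with kernel k = 6, stride s = 2 and padding p = 1. The input window length of 128 samples below is a hypothetical value chosen for illustration; the text does not state the window length.

```python
def conv_output_length(length, kernel=6, stride=2, padding=1):
    """Output length of a 1-D convolution along the time axis."""
    return (length + 2 * padding - kernel) // stride + 1


# Hypothetical input window of 128 time steps, passed through three
# stride-2 layers with the kernel, stride and padding stated above.
lengths = [128]
for _ in range(3):
    lengths.append(conv_output_length(lengths[-1]))

print(lengths)  # [128, 63, 30, 14]
```

Because the kernel and stride in the second dimension are 1 with padding 0, the width of the data is left unchanged by these layers.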
Advantageous effects: compared with the prior art, the invention achieves the following notable improvements:
The raw data undergoes frequency resampling, so that the time dimension and the feature dimension can be fused and considered together, and high-accuracy discrimination is achieved after training the deep convolutional neural network. Channel attention and time-sequence attention are applied in sequence to the features extracted by convolution, and the resulting features are linearly weighted with the original features to obtain richer sensor data features; the convolutional neural network is trained on these, and the trained network model is used for human body posture recognition. Because an attention mechanism is added on both the channel axis and the time axis of the extracted convolutional features, the time and the category of the target action within the features can be located effectively; the method can therefore be used to recognize coarse-grained human body actions and reduces the effort of manual data labeling. The invention uses a sliding window to preprocess the data quickly without losing the action characteristics of the data, thereby avoiding the shortcomings of traditional data processing.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of the present invention;
FIG. 3 is a waveform plot of a small batch of the raw triaxial acceleration data of the present invention;
FIG. 4 is a graph of the error as a function of the number of training epochs in the present invention;
FIG. 5 is a visualization of the sensor training data channel dimensions in the present invention;
FIG. 6 is a visualization of the time dimension of sensor training data in the present invention;
FIG. 7 is a graph of a confusion matrix for a test data set of the present invention.
Detailed Description
The technical solution and effects of the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention provides a human body posture recognition method based on time and channel double attention, which comprises the following steps:
Step1, recruiting volunteers to wear motion sensors, recording three-axis acceleration data from different body parts (such as the wrist, chest and leg) during different movements (such as standing, sitting, going upstairs, going downstairs, jumping and walking), and attaching the corresponding action type label to the action signal data;
Step2, cleaning and denoising the acquired triaxial acceleration data, applying frequency resampling to the cleaned data, normalizing it, and then dividing it into a training set and a test set, wherein the frequency resampling and normalization are as follows: the time-series signal is down-sampled in frequency and arranged into a data signal map, and the resulting data signal map is normalized, that is, scaled into the (0, 1) interval;
Step3, the processed data is a four-dimensional tensor containing data, feature and channel information. The processed data is then fed as input samples into the convolutional neural network for training; the batch size and the learning rate are set, and the weight parameters are updated automatically by back-propagation to obtain the optimal convolutional neural network model;
Step4, classifying and recognizing the human body posture data to be recognized with the trained model.
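The frequency down-sampling in Step2 can be illustrated with a minimal sketch. It assumes a hypothetical raw sampling rate of 100 Hz and reduces it by an integer factor of 3 (to roughly 33 Hz) by averaging each group of consecutive samples; the text does not state the raw rate or the resampling method, so both are assumptions, as is the function name `downsample`.

```python
def downsample(signal, factor):
    """Average each consecutive group of `factor` samples into one sample.
    Trailing samples that do not fill a whole group are dropped."""
    usable = len(signal) - len(signal) % factor
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, usable, factor)]


# Ten samples recorded at an assumed 100 Hz become three samples at ~33 Hz.
raw = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
reduced = downsample(raw, factor=3)
print(reduced)  # [1.0, 4.0, 7.0]
```

Averaging within each group acts as a simple low-pass filter before the rate is reduced, which limits aliasing compared with keeping every third sample.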
The human body posture recognition method based on a deep convolutional neural network can recognize six action postures: jumping, walking, going upstairs, going downstairs, standing and sitting.
FIG. 1 is a flow chart of the present invention: raw sensor data is collected and preprocessed, the data is input to the convolutional neural network for model training, and the model obtained after training is applied to human body posture data to be recognized, thereby realizing human body posture recognition.
FIG. 2 is a structural diagram of the convolutional neural network model based on time and channel double attention, which contains six convolutional layers and a final classification layer. The figure also shows the internal structure of the attention module: after data is input, the channel attention module generates a channel attention feature, which is linearly weighted with the original convolutional feature; the time-sequence attention module then generates a time-sequence attention feature, which is in turn linearly weighted with the channel attention feature to produce the final sensor data feature.
Specifically, the human body posture action signal data of each type collected from the mobile sensor first undergoes time-series frequency resampling and normalization. The processed data is sent to the convolutional neural network, where convolution produces the corresponding convolutional features. These features are then sent through the channel attention module and the time-sequence attention module in turn, and the resulting attention features are linearly weighted with the original features to obtain the final attention features. The attention mechanism of each convolutional neural network block is implemented as shown in the attention module of FIG. 2. In the experiments the size of convolution kernel F was (6, 1), the convolution stride was (2, 1) and the convolution padding was (1, 0); there were 128 convolution kernels in total, the ReLU activation function was used and a BatchNorm layer was added.
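The sequential weighting described above can be written compactly. The following is a sketch that assumes the "linear weighting" is an element-wise product with broadcasting, as is common for attention modules of this kind; the symbols F', F'' and the operator ⊗ are introduced here for illustration and do not appear in the original text, while M_C and M_T are the channel and time-sequence attention maps defined in parts B and C.

```latex
\begin{aligned}
F'  &= M_C(F) \otimes F,   && \text{channel attention applied to the convolutional feature } F \\
F'' &= M_T(F') \otimes F', && \text{time-sequence attention applied to the channel-weighted feature } F'
\end{aligned}
```

Under this reading, F'' is the final sensor data feature that each block passes on.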
B: channel attention:
To determine how important each sensor feature of the convolutional features is along the channel dimension, channel attention is applied to the channel dimension of the feature map. The time-dimension information of the features is first compressed using average pooling and max pooling to generate an average timing feature (shown in Figure BDA0002554574220000061) and a maximum timing feature (shown in Figure BDA0002554574220000062). The two features are then each sent through a multilayer perceptron and the results are summed element-wise to obtain the final output feature, expressed as:
$M_C(F) = \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F)))$
where σ denotes the sigmoid activation function.
C: time-sequence attention:
To determine the exact position of the target action in the sensor data on the time axis, time-sequence attention is applied to the time dimension of the feature map. The channel-dimension information is first compressed using average pooling and max pooling to generate an average channel feature (shown in Figure BDA0002554574220000063) and a maximum channel feature (shown in Figure BDA0002554574220000064). The two features are then concatenated and fed into a convolutional layer to obtain the final convolutional feature, expressed as:
$M_T(F) = \sigma(\mathrm{conv}^{7 \times 1}([\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)]))$
where σ denotes the sigmoid activation function and $\mathrm{conv}^{7 \times 1}$ denotes a convolutional layer with a 7 × 1 convolution kernel.
Then, the training samples are imported to adjust the parameters of the convolutional neural network model and obtain a model with high accuracy.
During network training, a dynamic learning rate is used to keep oscillation of the training curve small: the initial learning rate is set to 0.001 and is reduced to 0.1 times its previous value every 50 epochs.
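The step-decay schedule described here can be written as a small function. This is a minimal sketch; the function name `learning_rate` and its parameter names are hypothetical, while the values 0.001, 0.1 and 50 come from the text.

```python
def learning_rate(epoch, initial=0.001, decay=0.1, step=50):
    """Learning rate at a given epoch: multiplied by `decay` every `step` epochs."""
    return initial * decay ** (epoch // step)


# Learning rate at the start of each 50-epoch stage of a 500-epoch run.
schedule = [learning_rate(e) for e in range(0, 500, 50)]
print(schedule[:3])
```

Epochs 0 to 49 use 0.001, epochs 50 to 99 use one tenth of that, and so on.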
Compared with a traditional convolutional neural network, the method uses a deep neural network to extract richer sensor data features, and applies channel attention and time-sequence attention to the sensor data in sequence to generate feature maps that carry attention information. Because channel attention and time-sequence attention are stacked, the method, once trained on a large amount of coarse-grained data, can accurately locate the type and the time of occurrence of the target action, which greatly reduces the effort of manually labeling training data. Experimental comparison shows that the method is clearly superior to a traditional convolutional neural network in accuracy and localizes the target action better.
FIG. 3 is a waveform plot of a small batch of raw triaxial acceleration sensor data. The down-sampling frequency of the motion sensor is preferably set to about 33 Hz.
FIG. 4 is a graph of the error of the neural network over 500 epochs of training.
As the error graph shows, the accuracy of the model rises steadily as the deep convolutional network and the attention modules are applied. When the deep convolutional network and the double attention module are used together, the model achieves its best accuracy and its generalization ability is greatly improved.
FIG. 5 is a visualization of the sensor training data channel dimensions in the present invention.
By visualizing the channel dimension of the training data, the channels that are active in the sensor channel dimension when the target action occurs reveal which sensor type is contributing, which is of significant value for guiding human action recognition, health monitoring and similar tasks.
FIG. 6 is a visualization of the time dimension of sensor training data in the present invention.
By visualizing the time dimension of the training data, the region in which the target action occurs, and its category, can be located accurately within a long sequence of sensor data that contains background actions. With this semi-supervised attention mechanism, fairly accurate human body action classification results can be obtained by training on a large amount of coarsely labeled sample data; this relaxes the need for strictly labeled training data, reduces the effort of manual labeling, and makes human body posture recognition more convenient and faster in practice.
FIG. 7 is a graph of a confusion matrix for a test data set of the present invention.
A confusion matrix is a technique for summarizing the performance of a classification algorithm. If the number of samples in each class is unequal, or the data set has more than two classes, classification accuracy alone can be misleading. Computing the confusion matrix gives a better understanding of how the classification model behaves and what types of errors it makes. In the figure, the horizontal axis is the predicted result, the vertical axis is the true label, and the main diagonal holds the number of samples for which the predicted result matches the true result.
By analyzing the confusion matrix, the recognition accuracy of the convolutional neural network model for each action can be obtained, so that the network parameters can be adjusted accordingly. The final classification accuracy of the model is 98.87%, which meets the requirements of practical application.
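A confusion matrix of the kind described above can be computed with a few lines of Python. The sketch follows the same layout as the figure (rows are true labels, columns are predicted labels); the labels and predictions are toy values invented for illustration and are not results from the invention.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows index the true class, columns index the predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the main diagonal (prediction equals truth)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


classes = ["walk", "sit", "jump"]
y_true = ["walk", "walk", "sit", "sit", "jump", "jump"]
y_pred = ["walk", "sit", "sit", "sit", "jump", "walk"]

cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(name, row)
```

Off-diagonal entries show which pairs of actions the model confuses, which is the information used to decide how the network parameters should be adjusted.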

Claims (4)

1. A human body posture recognition method based on time and channel double attention, characterized by comprising the following steps:
Step1, acquiring human body posture action signal data of each activity type through a mobile sensor, and attaching the corresponding action attribute label to the action signal data;
Step2, preprocessing the collected action signal data and dividing the processed data into a training sample set and a test sample set; the processing comprises: applying a sliding window with a fixed step length to the data, cutting the original long sensor recording into segments of fixed size, then denoising the resulting data signals, removing null values and normalizing them, so that the signals are scaled proportionally into the (0, 1) interval;
Step3, feeding the processed data as input samples into a multi-attention deep convolutional neural network for training: setting the learning rate, the optimizer and a fixed batch size, using gradient descent to reduce the loss value of the deep convolutional neural network model while updating each weight parameter, until the loss value is smaller than a preset value, to obtain the trained model;
Step4, classifying and recognizing the human body posture data to be recognized with the trained model.
2. The human body posture recognition method based on time and channel double attention as claimed in claim 1, characterized in that: in Step1, the sensor down-sampling frequency is set to 20 Hz to 40 Hz when the sensor data signals are collected.
3. The human body posture recognition method based on time and channel double attention as claimed in claim 1, characterized in that: in Step2, the data processing includes removing outliers and null values from the data and rebalancing the number of samples in each activity category so that the data set is uniformly distributed across categories, with 70% of the data used as training samples and 30% as test samples.
4. The human body posture recognition method based on time and channel double attention as claimed in claim 1, characterized in that Step3 specifically comprises the following steps:
3.1, establishing a 6-layer convolutional attention neural network model, the model as a whole comprising three convolution blocks, each containing two convolutional feature-extraction layers and one skip (jump) convolution layer;
a: constructing a convolutional neural network with 6 convolutional layers:
constructing the convolutional neural network from 6 convolutional layers, wherein every two convolutional layers form one convolution block, and a skip convolution layer is added across the two convolutional layers of each block so that the input data, of dimension $\mathbb{R}^{C \times H \times W}$, and the output data, of dimension $\mathbb{R}^{C' \times H \times W}$, can be linearly weighted together, wherein the mapping C → C' is determined by the channel dimension of the convolutional layers, C is the number of input data channels, H is the data height, W is the data width, and C' is the number of output data channels; after the 6 convolutional layers are built, the output data is sent to the attention network to be linearly weighted by the attention weights, and is then sent to a fully connected layer for human body action classification;
b: establishing a channel attention network:
sequentially adding a channel attention network and a time sequence attention network after the two convolutional layers of each of the 3 blocks; the channel attention network is applied to the channel dimension of the feature map in order to determine how important each sensor feature of the convolutional features is in the channel dimension; the time-dimension information of the features is compressed using average pooling and maximum pooling to generate the average timing feature $F^{c}_{avg}$ and the maximum timing feature $F^{c}_{max}$; the two features are then sent to a multilayer perceptron, and the final output feature is obtained by point-by-point summation, specifically expressed as:
$$M_c(F) = \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big) = \sigma\big(\mathrm{MLP}(F^{c}_{avg}) + \mathrm{MLP}(F^{c}_{max})\big)$$
wherein $\sigma$ represents the sigmoid activation function, MLP represents a standard multilayer perceptron, and AvgPool and MaxPool represent the average pooling and maximum pooling operations, respectively;
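The channel attention step described above can be sketched in NumPy as follows. This is a minimal illustration, assuming a feature map of shape (C, T), a shared two-layer perceptron with a hypothetical reduction ratio r = 4 and a ReLU hidden layer, and randomly initialised weights in place of trained ones; none of these specifics are stated in the claim.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(feat, w1, w2):
    """Return one attention weight per channel.

    feat: (C, T) feature map; w1: (C // r, C); w2: (C, C // r).
    """
    avg = feat.mean(axis=1)  # average pooling over the time axis -> (C,)
    mx = feat.max(axis=1)    # maximum pooling over the time axis -> (C,)

    def mlp(v):
        # The same perceptron weights are shared by both pooled descriptors.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))  # point-by-point summation, then sigmoid


rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 20))        # C = 8 channels, T = 20 time steps
w1 = rng.normal(size=(2, 8)) * 0.5     # reduction ratio r = 4 (assumption)
w2 = rng.normal(size=(8, 2)) * 0.5
weights = channel_attention(feat, w1, w2)
refined = feat * weights[:, None]      # reweight every channel of the feature map
print(weights.shape, refined.shape)    # (8,) (8, 20)
```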
c: establishing a time sequence attention network:
after the channel attention network is added to each convolution block, a time sequence attention network is established in sequence; time sequence attention is applied to the time dimension of the feature map in order to determine the precise position of the target action in the sensor data on the time axis; the channel-dimension information is first compressed using average pooling and maximum pooling to generate the average channel feature $F^{t}_{avg}$ and the maximum channel feature $F^{t}_{max}$, where $F^{t}_{avg}$ represents the channel average-pooled feature along the time axis and $F^{t}_{max}$ represents the channel maximum-pooled feature along the time axis, each having a single channel; the two features are linearly superposed and fed into a convolutional layer to obtain the final convolution feature, specifically expressed as:
$$M_t(F) = \sigma\big(\mathrm{conv}^{7\times 1}(\mathrm{AvgPool}(F) \oplus \mathrm{MaxPool}(F))\big) = \sigma\big(\mathrm{conv}^{7\times 1}(F^{t}_{avg} \oplus F^{t}_{max})\big)$$
where $\sigma$ denotes the sigmoid activation function, $\oplus$ denotes the linear superposition of the two features, $\mathrm{conv}^{7\times 1}$ denotes a convolutional layer with a convolution kernel size of 7×1, and AvgPool and MaxPool denote the average pooling and maximum pooling operations, respectively;
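The time sequence attention step can be sketched in NumPy as below. Two points are assumptions rather than statements of the claim: the linear superposition of the two single-channel features is read here as an element-wise sum (CBAM, the cited non-patent reference, concatenates them instead), and zero padding of 3 is used so that the 7×1 convolution returns one weight per time step. The kernel values are random placeholders for trained weights.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def temporal_attention(feat, kernel):
    """Return one attention weight per time step.

    feat: (C, T) feature map; kernel: 1-D convolution kernel of odd length.
    """
    avg = feat.mean(axis=0)  # average pooling over the channel axis -> (T,)
    mx = feat.max(axis=0)    # maximum pooling over the channel axis -> (T,)
    combined = avg + mx      # assumption: superposition as element-wise sum
    pad = len(kernel) // 2
    padded = np.pad(combined, pad)  # zero padding keeps the output length at T
    response = np.correlate(padded, kernel, mode="valid")
    return sigmoid(response)


rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 20))          # C = 8 channels, T = 20 time steps
kernel = rng.normal(size=7) * 0.3        # 7x1 kernel
weights = temporal_attention(feat, kernel)
refined = feat * weights[None, :]        # reweight every time step
print(weights.shape, refined.shape)      # (20,) (8, 20)
```

`np.correlate` is used because deep-learning "convolution" layers compute a cross-correlation, without flipping the kernel.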
3.2, importing the training samples to adjust the parameters of the convolutional neural network model, and training the model;
in the convolutional neural network model, the first-layer convolution kernel size is (6, 1) with a stride of (2, 1); the second-layer convolution kernel size is (6, 1) with a stride of (2, 1); the third-layer convolution kernel size is (6, 1) with a stride of (2, 1); the convolutional layer padding is set to (1, 0); all activation functions are ReLU, a BatchNorm layer is added after each layer to reduce overfitting, and the classification is finally output after a Softmax layer.
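To see how the stated kernel size (6, 1), stride (2, 1) and padding (1, 0) shrink the time axis, the standard convolution output-length formula, floor((H + 2p - k) / s) + 1, can be applied layer by layer. The sketch below assumes a hypothetical input window of 128 time steps, which the claim does not specify, and traces only the three layers whose parameters are listed.

```python
def conv_output_length(length, kernel=6, stride=2, padding=1):
    """Output length of a convolution along one axis (floor division)."""
    return (length + 2 * padding - kernel) // stride + 1


def trace_lengths(window_length, layers=3):
    """Time-axis length before and after each of the listed layers."""
    lengths = [window_length]
    for _ in range(layers):
        lengths.append(conv_output_length(lengths[-1]))
    return lengths


# 128 time steps per window is an assumption for illustration only.
print(trace_lengths(128))  # [128, 63, 30, 14]
```

The width axis is unchanged by these layers, since the kernel size and stride are both 1 and the padding is 0 in that dimension.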
CN202010588253.0A 2020-06-24 2020-06-24 Human body posture recognition method based on time and channel double attention Pending CN111860188A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010588253.0A CN111860188A (en) 2020-06-24 2020-06-24 Human body posture recognition method based on time and channel double attention

Publications (1)

Publication Number Publication Date
CN111860188A true CN111860188A (en) 2020-10-30

Family

ID=72989342

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010588253.0A Pending CN111860188A (en) 2020-06-24 2020-06-24 Human body posture recognition method based on time and channel double attention

Country Status (1)

Country Link
CN (1) CN111860188A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062170A (en) * 2017-12-15 2018-05-22 南京师范大学 Multi-class human posture recognition method based on convolutional neural networks and intelligent terminal
CN108345846A (en) * 2018-01-29 2018-07-31 华东师范大学 A kind of Human bodys' response method and identifying system based on convolutional neural networks
CN110991511A (en) * 2019-11-26 2020-04-10 中原工学院 Sunflower crop seed sorting method based on deep convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SANGHYUN WOO 等: "《CBAM: Convolutional Block Attention Module》", 《THE EUROPEAN CONFERENCE ON COMPUTER VISION (ECCV)2018》, 14 September 2018 (2018-09-14), pages 1 - 9 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112468509A (en) * 2020-12-09 2021-03-09 湖北松颢科技有限公司 Deep learning technology-based automatic flow data detection method and device
CN113298083A (en) * 2021-02-25 2021-08-24 阿里巴巴集团控股有限公司 Data processing method and device
CN113283298A (en) * 2021-04-26 2021-08-20 西安交通大学 Real-time behavior identification method based on time attention mechanism and double-current network
CN113283298B (en) * 2021-04-26 2023-01-03 西安交通大学 Real-time behavior identification method based on time attention mechanism and double-current network
CN114169374A (en) * 2021-12-10 2022-03-11 湖南工商大学 Cable-stayed bridge stay cable damage identification method and electronic equipment
CN114169374B (en) * 2021-12-10 2024-02-20 湖南工商大学 Cable-stayed bridge stay cable damage identification method and electronic equipment
CN115017961A (en) * 2022-08-05 2022-09-06 江苏江海润液设备有限公司 Intelligent control method of lubricating equipment based on neural network data set augmentation
CN116010816A (en) * 2022-12-28 2023-04-25 南京大学 LRF large-kernel attention convolution network activity identification method based on large receptive field
CN116010816B (en) * 2022-12-28 2023-09-08 南京大学 LRF large-kernel attention convolution network activity identification method based on large receptive field
US11989935B1 (en) 2022-12-28 2024-05-21 Nanjing University Activity recognition method of LRF large-kernel attention convolution network based on large receptive field
CN117033979A (en) * 2023-09-04 2023-11-10 中国人民解放军空军预警学院 Space target identification method with same shape and micro-motion form as inclusion relation
CN117033979B (en) * 2023-09-04 2024-06-04 中国人民解放军空军预警学院 Space target identification method with same shape and micro-motion form as inclusion relation

Similar Documents

Publication Publication Date Title
CN111860188A (en) Human body posture recognition method based on time and channel double attention
Lester et al. A hybrid discriminative/generative approach for modeling human activities
CN112560723B (en) Fall detection method and system based on morphological recognition and speed estimation
CN106909938B (en) Visual angle independence behavior identification method based on deep learning network
Hou A study on IMU-based human activity recognition using deep learning and traditional machine learning
Bu Human motion gesture recognition algorithm in video based on convolutional neural features of training images
CN110826453A (en) Behavior identification method by extracting coordinates of human body joint points
CN111753683A (en) Human body posture identification method based on multi-expert convolutional neural network
CN112464738B (en) Improved naive Bayes algorithm user behavior identification method based on mobile phone sensor
KR102637133B1 (en) On-device activity recognition
CN109389035A (en) Low latency video actions detection method based on multiple features and frame confidence score
CN111723662B (en) Human body posture recognition method based on convolutional neural network
US20230401466A1 (en) Method for temporal knowledge graph reasoning based on distributed attention
Parameswari et al. Human activity recognition using SVM and deep learning
CN115346272A (en) Real-time tumble detection method based on depth image sequence
CN114241270A (en) Intelligent monitoring method, system and device for home care
Li et al. [Retracted] Human Motion Representation and Motion Pattern Recognition Based on Complex Fuzzy Theory
CN115546491A (en) Fall alarm method, system, electronic equipment and storage medium
Xu et al. Comparative studies on activity recognition of elderly people living alone
Gui et al. An approach to extract state information from multivariate time series
Yuan et al. The Human Continuity Activity Semisupervised Recognizing Model for Multiview IoT Network
Xu et al. Fall detection based on person detection and multi-target tracking
CN111860191A (en) Human body posture identification method based on channel selection convolutional neural network
Deepan et al. An intelligent robust one dimensional har-cnn model for human activity recognition using wearable sensor data
Wang et al. Analysis of Digital Long Jump Take‐off Wearable Sensor Monitoring System

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination