CN108985223A - Human body action recognition method - Google Patents
Human body action recognition method
- Publication number
- CN108985223A CN108985223A CN201810766185.5A CN201810766185A CN108985223A CN 108985223 A CN108985223 A CN 108985223A CN 201810766185 A CN201810766185 A CN 201810766185A CN 108985223 A CN108985223 A CN 108985223A
- Authority
- CN
- China
- Prior art keywords
- network
- sequence
- deep learning
- optical flow
- recognition method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The present invention proposes a human body action recognition method based on deep learning technology. The method comprises a training stage and a recognition stage; the network used in both stages includes a sequence feature extraction module, which comprises a color-image deep learning network, an optical-flow deep learning network, and a CNN. The color-image deep learning network includes three LSTM layers and the optical-flow deep learning network includes two LSTM layers. With the LSTM layers added, the recognition method can learn the history of the image sequence, so the temporal information of the video sequence is better utilized and detection accuracy is effectively improved. The deep learning network also uses a convolutional network with a four-layer structure to change the receptive field of the feature code, so that subsequences of the image sequence also participate in determining the detection result.
Description
Technical field
The invention belongs to the field of machine learning, and in particular relates to a human body action recognition method.
Background technique
Traditional human action recognition attaches acquisition devices such as biosensors or mechanical sensors to the human body. This is a contact-based motion detection method and can cause discomfort or fatigue. With the development of technology, this mode of recognition has gradually been replaced by image-based recognition methods.
The emergence of deep learning has brought breakthrough progress to machine learning and opened a new direction for human action recognition. Unlike traditional recognition methods, deep learning can automatically learn high-level features from low-level ones, avoiding the excessive dependence of manual feature selection on the task itself and the time-consuming tuning process.
Summary of the invention
In the prior art, human action recognition feeds the entire feature directly into a fully connected layer and performs detection on the whole feature. This causes problems: for example, when the action is fast, the picture-sequence length containing the action can be much smaller than the complete sequence length set as the detection unit, and the action may then fail to be detected. The prior art also ignores the historical information of the image sequence, so detection accuracy needs improvement. To address this, a human body action recognition method is designed with the following technical scheme:
A human body action recognition method based on deep learning technology, comprising a training stage and a recognition stage. The network used in both stages includes a sequence feature extraction module, which comprises a color-image deep learning network, an optical-flow deep learning network, and a CNN. The color-image deep learning network includes three LSTM layers, and the optical-flow deep learning network includes two LSTM layers.
Further, the hidden layer in each LSTM layer has 200 neurons.
Further, the training stage comprises the steps of:
Step 1. Obtain an action video, split it into frame images, compute optical-flow images, and extract one frame every 16 frames as a center frame, marking the action position;
Step 2. From the video sequence images, generate sequence picture samples and labels, center-frame picture samples and position labels, and sequence optical-flow picture samples and labels, for training the corresponding feature extraction models;
Step 3. Feed the sequence picture samples and labels into the color-image deep learning network, the center-frame picture samples and position labels into the CNN, and the sequence optical-flow picture samples into the optical-flow deep learning network, and perform feature extraction;
Step 4. Fuse the features extracted by the above three network models and generate the feature code corresponding to the video sequence;
Step 5. Feed the feature code into the convolutional network, which varies the receptive field over the video-sequence features at different time scales;
Step 6. Feed the feature-code samples with different receptive fields into the video recognition network to generate a recognition model;
Step 7. Iterate the training until the recognition model converges.
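The frame-sampling rule of step 1 (one center frame every 16 frames, plus optical flow between consecutive frames) can be sketched as follows. The helper names and the index-based return values are illustrative assumptions; only the interval of 16 and the per-frame optical flow come from the text.

```python
# Sketch of training step 1: every 16th frame becomes a center frame, and
# optical flow is computed between consecutive frame pairs. The function
# names and index-based outputs are illustrative assumptions.

def center_frame_indices(num_frames: int, interval: int = 16) -> list:
    """Return the indices of the frames extracted as center frames."""
    return list(range(0, num_frames, interval))

def optical_flow_pairs(num_frames: int) -> list:
    """A video of N frames yields N-1 consecutive-frame pairs for optical
    flow (e.g. via cv2.calcOpticalFlowFarneback on each pair)."""
    return [(i, i + 1) for i in range(num_frames - 1)]

# A 100-frame clip gives center frames 0, 16, ..., 96 and 99 flow pairs.
print(center_frame_indices(100))
print(len(optical_flow_pairs(100)))
```

The action position would then be annotated on each extracted center frame to form the CNN's position labels.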
Further, in the recognition stage the feature code of the video sequence is generated by the sequence feature extraction module; after the convolutional network changes its receptive field, the feature code is then recognized and classified.
Further, the convolutional network uses a four-layer structure.
Compared with the prior art, the beneficial effects of the present invention are:
1. The redesigned deep learning network structure better extracts the features of the video sequence, giving high action recognition accuracy.
2. A four-layer convolutional network varies the receptive field over the video-sequence feature code, which, while preserving real-time recognition, effectively solves the problem that an action cannot be detected when the picture-sequence length containing the action is much smaller than the complete sequence length.
Brief description of the drawings
Fig. 1 is the model training flowchart of the present invention;
Fig. 2 is the workflow diagram of the color-image deep learning network;
Fig. 3 is the workflow diagram of the optical-flow deep learning network;
Fig. 4 is the workflow diagram of the CNN;
Fig. 5 is the action recognition flowchart of the present invention;
Fig. 6 is the workflow diagram of the convolutional network.
Specific embodiment
As shown in Fig. 1, the training stage of the human body action recognition method of the present invention comprises:
Step 1. Obtain an action video, split it into frame images, compute optical-flow images, and extract one frame every 16 frames as a center frame, marking the action position;
Step 2. Feed the video sequence images into the image sequence processing unit, the center-frame image processing unit, and the optical-flow sequence processing unit respectively, generating sequence picture samples and labels, center-frame picture samples and position labels, and sequence optical-flow picture samples and labels, for training the corresponding feature extraction models;
Step 3. Feed the sequence picture samples and labels into the color-image deep learning network, the center-frame picture samples and position labels into the CNN, and the sequence optical-flow picture samples into the optical-flow deep learning network, and perform feature extraction;
Step 4. Fuse the features extracted by the above three network models and generate the feature code corresponding to the video sequence;
Step 5. Feed the feature code into the convolutional network, which varies the receptive field over the video-sequence features at different time scales;
Step 6. Feed the feature-code samples with different receptive fields into the video recognition network to generate a recognition model;
Step 7. Iterate the training until the recognition model converges.
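The feature fusion of step 4 can be sketched as a channel-wise concatenation of the three branch outputs into one feature code per time step. The branch feature sizes (256/128/64) and the concatenation axis are illustrative assumptions; the text only states that the three extracted features are merged.

```python
import numpy as np

# Sketch of step 4: fuse the per-branch features into the feature code.
# Branch widths below are assumed for illustration only.

def fuse_features(color_feat, flow_feat, cnn_feat):
    """Concatenate the three branch features along the channel axis,
    yielding the feature code for the video sequence: shape (T, C1+C2+C3)."""
    return np.concatenate([color_feat, flow_feat, cnn_feat], axis=-1)

T = 32                                   # time steps in the sequence
color_feat = np.zeros((T, 256))          # color-image deep learning network
flow_feat = np.zeros((T, 128))           # optical-flow deep learning network
cnn_feat = np.zeros((T, 64))             # CNN (center-frame position features)
code = fuse_features(color_feat, flow_feat, cnn_feat)
print(code.shape)                        # one 448-wide code per time step
```

The resulting feature code is what the four-layer convolutional network later consumes.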
The image sequence processing unit, center-frame image processing unit, optical-flow sequence processing unit, color-image deep learning network, CNN, optical-flow deep learning network, and feature fusion unit together constitute the sequence feature extraction module.
Because human actions are continuous while the acquired image frames are discrete, the historical information of previous frames is correlated with the current frame. The main framework of the deep learning networks is a CNN, on which the present invention builds the color-image deep learning network and the optical-flow deep learning network. The CNN uses SSD network layers and extracts the precise position of the action in the key frame. As shown in Figs. 2 and 3, the color-image deep learning network adds three LSTM layers and the optical-flow deep learning network adds two LSTM layers; the hidden layer of each LSTM layer has 200 neurons. With the LSTM layers added, the recognition method gains the ability to learn the history of the image sequence. Compared with algorithms that recognize from a single frame only, the recognition method of the present invention, using the restructured deep learning networks, makes better use of the temporal information of the video sequence and effectively improves detection accuracy.
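How an LSTM layer carries frame-to-frame history can be sketched with a minimal single-layer forward pass in NumPy. The hidden size of 200 is from the text; the input width, random initialization, and the absence of the stacked second and third layers are simplifications for illustration.

```python
import numpy as np

# Minimal single-layer LSTM forward pass, illustrating how the added LSTM
# layers thread state (h, c) across the frame sequence. Hidden size 200 is
# from the patent; everything else here is an illustrative assumption.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x_seq, W, U, b, hidden=200):
    """x_seq: (T, D). W: (4H, D), U: (4H, H), b: (4H,). Returns all hidden
    states (T, H); the recurrent state is what lets the network use the
    timing information of the video sequence."""
    T, _ = x_seq.shape
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    out = np.zeros((T, hidden))
    for t in range(T):
        z = W @ x_seq[t] + U @ h + b          # all four gates at once
        i = sigmoid(z[:hidden])               # input gate
        f = sigmoid(z[hidden:2 * hidden])     # forget gate
        o = sigmoid(z[2 * hidden:3 * hidden]) # output gate
        g = np.tanh(z[3 * hidden:])           # candidate cell state
        c = f * c + i * g                     # history flows through c
        h = o * np.tanh(c)
        out[t] = h
    return out

rng = np.random.default_rng(0)
D, H, T = 64, 200, 16
W = rng.normal(0, 0.1, (4 * H, D))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
states = lstm_forward(rng.normal(size=(T, D)), W, U, b, hidden=H)
print(states.shape)   # 16 time steps, 200 hidden units each
```

In the patent's networks, three (color branch) or two (optical-flow branch) such layers would be stacked, each feeding its hidden-state sequence to the next.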
As shown in Fig. 5, the recognition stage of the human body action recognition method of the present invention comprises:
Step 1. Obtain an action video, split it into frame images, compute optical-flow images, and extract one frame every 16 frames as a center frame, marking the action position;
Step 2. Generate the feature code corresponding to the video sequence using the sequence feature extraction module;
Step 3. Feed the feature code into the convolutional network, which varies the receptive field over the video-sequence features at different time scales;
Step 4. Classify the feature codes with different receptive fields;
Step 5. Obtain the human action recognition result.
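Steps 4 and 5 can be sketched as classifying each receptive-field variant of the feature code and combining the per-variant scores into one action label. Averaging the scores, the pooled code shapes, and the linear classifier are all illustrative assumptions; the patent only states that the result is determined jointly by the entire feature code and its parts.

```python
import numpy as np

# Sketch of recognition steps 4-5: one pooled feature code per receptive
# field, a shared linear classifier, and a joint decision across scales.
# The averaging rule and all shapes are illustrative assumptions.

def classify(feature_codes, W):
    """feature_codes: (4, C), one code per receptive field.
    W: (C, num_actions). Returns the index of the recognized action."""
    scores = feature_codes @ W            # (4, num_actions)
    combined = scores.mean(axis=0)        # joint decision across scales
    return int(np.argmax(combined))

rng = np.random.default_rng(1)
C, num_actions = 448, 10
codes = rng.normal(size=(4, C))           # four receptive-field variants
W = rng.normal(size=(C, num_actions))
label = classify(codes, W)
print(label)
```

The video recognition network trained in the training stage would play the role of this classifier.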
As shown in Fig. 6, the convolutional network used in training and recognition has a four-layer structure and changes the receptive field of the feature code: after the four convolutional layers the feature code has taken on four different receptive fields. The purpose of changing the receptive field is to let parts of a fixed-length sequence also participate in determining the detection result, i.e., the result is jointly determined by the entire feature code and by partial feature-code data. The convolutional network is composed of temporal convolutions; each layer uses a one-dimensional convolution with kernel size 9 and stride 1, and each convolutional layer is followed by a unified pooling layer.
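The growth of the receptive field through the four conv+pool blocks can be worked out with the standard recurrence. Kernel size 9 and stride 1 for each convolution come from the text; the pooling parameters (size 2, stride 2) are an assumption, since the patent does not specify the "unified pooling layer".

```python
# Receptive-field growth of the four-layer temporal convolutional network.
# Conv kernel 9 / stride 1 are from the text; pool size 2 / stride 2 are
# assumed, as the patent does not give the pooling parameters.

def receptive_fields(layers=4, k_conv=9, k_pool=2, s_pool=2):
    """Standard recurrence: rf += (k - 1) * jump; jump *= stride.
    Returns the receptive field (in input time steps) after each conv+pool
    block, i.e. the four receptive fields the feature code takes on."""
    rf, jump, out = 1, 1, []
    for _ in range(layers):
        rf += (k_conv - 1) * jump       # conv, stride 1: jump unchanged
        rf += (k_pool - 1) * jump       # pooling widens the field...
        jump *= s_pool                  # ...and dilates subsequent layers
        out.append(rf)
    return out

print(receptive_fields())   # [10, 28, 64, 136]
```

Under these assumptions each successive block roughly doubles the temporal span a feature sees, which is what lets short action-bearing subsequences contribute alongside the whole sequence.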
The foregoing are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included within its scope of protection.
Claims (6)
1. A human body action recognition method based on deep learning technology, characterized in that the method comprises a training stage and a recognition stage; the network used in both stages includes a sequence feature extraction module, which comprises a color-image deep learning network, an optical-flow deep learning network, and a CNN; the color-image deep learning network includes three LSTM layers, and the optical-flow deep learning network includes two LSTM layers.
2. The human body action recognition method of claim 1, characterized in that the hidden layer in each LSTM layer has 200 neurons.
3. The human body action recognition method of claim 1, characterized in that the training stage comprises the steps of:
Step 1. Obtain an action video, split it into frame images, compute optical-flow images, and extract one frame every 16 frames as a center frame, marking the action position;
Step 2. From the video sequence images, generate sequence picture samples and labels, center-frame picture samples and position labels, and sequence optical-flow picture samples and labels, for training the corresponding feature extraction models;
Step 3. Feed the sequence picture samples and labels into the color-image deep learning network, the center-frame picture samples and position labels into the CNN, and the sequence optical-flow picture samples into the optical-flow deep learning network, and perform feature extraction;
Step 4. Fuse the features extracted by the above three network models and generate the feature code corresponding to the video sequence;
Step 5. Feed the feature code into the convolutional network, which varies the receptive field over the video-sequence features at different time scales;
Step 6. Feed the feature-code samples with different receptive fields into the video recognition network to generate a recognition model;
Step 7. Iterate the training until the recognition model converges.
4. The human body action recognition method of claim 1, characterized in that in the recognition stage the feature code of the video sequence is generated by the sequence feature extraction module, and after the convolutional network changes its receptive field the feature code is then recognized.
5. The human body action recognition method of claim 3 or 4, characterized in that the convolutional network uses a four-layer structure.
6. The human body action recognition method of claim 5, characterized in that each convolutional layer in the convolutional network uses a one-dimensional convolution with stride 1, and each convolutional layer is followed by a unified pooling layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810766185.5A CN108985223B (en) | 2018-07-12 | 2018-07-12 | Human body action recognition method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810766185.5A CN108985223B (en) | 2018-07-12 | 2018-07-12 | Human body action recognition method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108985223A true CN108985223A (en) | 2018-12-11 |
CN108985223B CN108985223B (en) | 2024-05-07 |
Family
ID=64537893
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810766185.5A Active CN108985223B (en) | 2018-07-12 | 2018-07-12 | Human body action recognition method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108985223B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104933417A (en) * | 2015-06-26 | 2015-09-23 | 苏州大学 | Behavior recognition method based on sparse spatio-temporal features |
CN106845351A (en) * | 2016-05-13 | 2017-06-13 | 苏州大学 | Video behavior recognition method based on bidirectional long short-term memory units |
CN107273800A (en) * | 2017-05-17 | 2017-10-20 | 大连理工大学 | Action recognition method using an attention-based convolutional recurrent neural network |
CN107292247A (en) * | 2017-06-05 | 2017-10-24 | 浙江理工大学 | Human behavior recognition method and device based on residual networks |
CN107463949A (en) * | 2017-07-14 | 2017-12-12 | 北京协同创新研究院 | Processing method and device for video action classification |
CN108229338A (en) * | 2017-12-14 | 2018-06-29 | 华南理工大学 | Video behavior recognition method based on deep convolutional features |
CN108108699A (en) * | 2017-12-25 | 2018-06-01 | 重庆邮电大学 | Human action recognition method fusing deep neural network models and binary hashing |
Non-Patent Citations (4)
Title |
---|
JEFF DONAHUE et al.: "Long-term Recurrent Convolutional Networks for Visual Recognition and Description", 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 15 October 2015 |
SHREYANK JYOTI et al.: "Expression Empowered ResiDen Network for Facial Action Unit Detection", arXiv, 14 June 2018 |
王昕培: "Research on abnormal behavior classification algorithms based on two-stream CNN", China Master's Theses Full-text Database, Information Science and Technology, vol. 2018, no. 2 |
阳平 et al.: "A sign language gesture recognition method based on fusing multi-sensor information", Space Medicine & Medical Engineering, vol. 25, no. 4, 31 August 2012 |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109685213A (en) * | 2018-12-29 | 2019-04-26 | 百度在线网络技术(北京)有限公司 | Method, apparatus, and terminal device for acquiring training sample data |
CN110084259A (en) * | 2019-01-10 | 2019-08-02 | 谢飞 | Facial paralysis grading comprehensive evaluation system combining facial texture and optical-flow features |
CN110084259B (en) * | 2019-01-10 | 2022-09-20 | 谢飞 | Facial paralysis grading comprehensive evaluation system combining facial texture and optical flow characteristics |
CN109902565A (en) * | 2019-01-21 | 2019-06-18 | 深圳市烨嘉为技术有限公司 | Human behavior recognition method with multi-feature fusion |
CN109919031A (en) * | 2019-01-31 | 2019-06-21 | 厦门大学 | Human behavior recognition method based on a deep neural network |
CN109919031B (en) * | 2019-01-31 | 2021-04-09 | 厦门大学 | Human behavior recognition method based on deep neural network |
CN110544301A (en) * | 2019-09-06 | 2019-12-06 | 广东工业大学 | Three-dimensional human body action reconstruction system, method and action training system |
CN112257568A (en) * | 2020-10-21 | 2021-01-22 | 中国人民解放军国防科技大学 | Intelligent real-time supervision and error correction system and method for individual soldier queue actions |
CN112257568B (en) * | 2020-10-21 | 2022-09-20 | 中国人民解放军国防科技大学 | Intelligent real-time supervision and error correction system and method for individual soldier queue actions |
Also Published As
Publication number | Publication date |
---|---|
CN108985223B (en) | 2024-05-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108985223A (en) | Human body action recognition method | |
Zhao et al. | Single image action recognition using semantic body part actions | |
CN106960206A (en) | Character recognition method and character recognition system | |
CN107967695B (en) | Moving object detection method based on deep optical flow and morphology | |
CN110263833A (en) | Image semantic segmentation method based on an encoder-decoder structure | |
CN108984530A (en) | Detection method and detection system for sensitive network content | |
CN106023220A (en) | Vehicle exterior part image segmentation method based on deep learning | |
CN107122375A (en) | Image subject recognition method based on image features | |
CN108427942A (en) | Palm detection and key point localization method based on deep learning | |
CN106778796A (en) | Human action recognition method and system based on hybrid collaborative model training | |
CN113723312B (en) | Rice disease identification method based on vision Transformer | |
CN110073369A (en) | Unsupervised learning techniques for temporal-difference models | |
CN110503077A (en) | Vision-based real-time human action analysis method | |
CN109977791A (en) | Hand physiological information detection method | |
CN107909034 (en) | Face detection method, device, and computer-readable storage medium | |
CN109522961 (en) | Semi-supervised image classification method based on dictionary deep learning | |
CN108960171B (en) | Method for converting gesture recognition into identity recognition based on feature transfer learning | |
CN105404865 (en) | Face detection method based on cascaded probabilistic-state restricted Boltzmann machines | |
Li et al. | Dating ancient paintings of Mogao Grottoes using deeply learnt visual codes | |
Narayanan et al. | Yoga pose detection using deep learning techniques | |
CN108595014 (en) | Vision-based real-time dynamic hand gesture recognition system and method | |
CN108717548 (en) | Activity recognition model update method and system for dynamically added sensing devices | |
CN113705507B (en) | Mixed-reality open-set human gesture recognition method based on deep learning | |
CN103544468B (en) | 3D facial expression recognition method and device | |
CN110008847 (en) | Stroke recognition method based on convolutional neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||