CN108985223A - Human motion recognition method - Google Patents

Human motion recognition method

Info

Publication number
CN108985223A
CN108985223A (application CN201810766185.5A)
Authority
CN
China
Prior art keywords
network
sequence
deep learning
optical flow
recognition method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810766185.5A
Other languages
Chinese (zh)
Other versions
CN108985223B (en)
Inventor
张德馨
史玉坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Isecure Technology Co ltd
Original Assignee
Tianjin Isecure Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Isecure Technology Co ltd
Priority to CN201810766185.5A
Publication of CN108985223A
Application granted
Publication of CN108985223B
Active legal status (current)
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a human motion recognition method based on deep learning technology. The human motion recognition method comprises a training stage and a recognition stage. The network used in the training and recognition stages includes a sequence feature extraction module, which comprises a color-image deep learning network, an optical-flow deep learning network and a CNN network; the color-image deep learning network includes three LSTM layers and the optical-flow deep learning network includes two LSTM layers. With the LSTM layers added, the recognition method is able to learn from image sequences, so the temporal information of the video sequence can be better exploited and detection accuracy is effectively improved. At the same time, a convolutional network with a four-layer structure is used in the deep learning network to change the receptive field of the feature code, so that a part of the images in the image sequence also participates in determining the detection result.

Description

Human motion recognition method
Technical field
The invention belongs to the field of machine learning, and in particular relates to a human motion recognition method.
Background art
Traditional human action recognition attaches acquisition devices such as biosensors or mechanical sensors to the human body. It is a contact-based motion detection method and can cause discomfort or a feeling of fatigue. With the development of technology, this recognition approach is gradually being replaced by image-based recognition methods.
The emergence of deep learning has brought breakthrough progress to machine learning and has also opened a new direction for human action recognition. Unlike traditional recognition methods, deep learning can automatically learn high-level features from low-level ones, which avoids the excessive dependence of feature selection on the task itself and the time-consuming tuning process.
Summary of the invention
In the prior art, human action recognition directly uses fully connected layers and performs detection on the entire feature. This causes problems: for example, when the action is fast, the picture-sequence length that contains the action is much smaller than the complete sequence length of the detection unit, and the action may then fail to be detected. Moreover, the prior art does not take the historical information of the sequence images into account, so detection accuracy still needs to be improved. On this basis, a human motion recognition method is designed, with the following technical solution:
A human motion recognition method based on deep learning technology, comprising a training stage and a recognition stage. The network used in the training and recognition stages includes a sequence feature extraction module, which comprises a color-image deep learning network, an optical-flow deep learning network and a CNN network; the color-image deep learning network includes three LSTM layers and the optical-flow deep learning network includes two LSTM layers.
Further, the number of neurons in the hidden layer of the LSTM layers is 200.
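For illustration only, the following is a minimal sketch of how the two LSTM branches described above could be realized, assuming PyTorch, per-frame CNN features of an arbitrary dimension (512 here) and a batch-first layout; none of these choices are fixed by the invention.

    import torch
    import torch.nn as nn

    class SequenceLSTMStack(nn.Module):
        """Stack of LSTM layers (hidden size 200) on top of per-frame CNN features."""
        def __init__(self, feat_dim=512, hidden_size=200, num_layers=3):
            super().__init__()
            self.lstm = nn.LSTM(input_size=feat_dim, hidden_size=hidden_size,
                                num_layers=num_layers, batch_first=True)

        def forward(self, frame_feats):           # (batch, time, feat_dim)
            out, _ = self.lstm(frame_feats)       # (batch, time, 200)
            return out[:, -1, :]                  # last time step as the sequence feature

    # Illustrative instantiation of the two branches:
    color_branch = SequenceLSTMStack(num_layers=3)   # three LSTM layers (color-image stream)
    flow_branch = SequenceLSTMStack(num_layers=2)    # two LSTM layers (optical-flow stream)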
Further, the training stage comprises the steps of:
Step 1: acquiring an action video, splitting it into frames, computing optical-flow images, and extracting one frame every 16 frames as a center frame, with the action position marked;
Step 2: processing the video sequence images to form sequence picture samples and labels, center-frame picture samples and position labels, and sequence optical-flow picture samples and labels, for training the corresponding feature extraction models;
Step 3: feeding the sequence picture samples and labels into the color-image deep learning network, feeding the center-frame picture samples and position labels into the CNN network, and feeding the sequence optical-flow picture samples into the optical-flow deep learning network for feature extraction;
Step 4: fusing the features extracted by the above three network models to generate a feature code corresponding to the video sequence;
Step 5: feeding the feature code into a convolutional network, which varies the receptive field of the video sequence features over different time scales;
Step 6: feeding the feature code samples with different receptive fields into a video recognition network to generate a recognition model;
Step 7: iterating the training until the recognition model converges.
Further, in the recognition stage the feature code of the video sequence is generated by the sequence feature extraction module; after the convolutional network changes the receptive field of the feature code, recognition and classification are carried out.
Further, the convolutional network uses a four-layer structure.
Compared with the prior art, the beneficial effects of the present invention are:
1. The redesigned deep learning network structure can better extract the features of the video sequence, so action recognition accuracy is high.
2. A four-layer convolutional network is used to vary the receptive field of the video sequence feature code; while recognition remains real-time, this effectively solves the problem that an action cannot be detected when the picture-sequence length containing the action is much smaller than the complete sequence length (an illustrative receptive-field calculation follows).
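To make the receptive-field argument concrete, the following small Python sketch computes how the temporal receptive field grows through four one-dimensional convolutions with kernel size 9 and stride 1, assuming purely for illustration that each convolution is followed by pooling with kernel size and stride 2; the pooling size is not specified by the invention.

    # Receptive-field bookkeeping for the four-layer temporal convolution stack.
    def receptive_fields(layers):
        rf, jump, per_conv = 1, 1, []
        for kind, k, s in layers:
            rf += (k - 1) * jump                 # enlarge by the kernel at the current effective stride
            jump *= s                            # accumulate the effective stride
            if kind == "conv":
                per_conv.append(rf)
        return per_conv

    stack = []
    for _ in range(4):                           # four conv layers, each followed by pooling
        stack.append(("conv", 9, 1))             # conv9, stride 1 (as stated in the description)
        stack.append(("pool", 2, 2))             # assumed pooling kernel/stride of 2
    print(receptive_fields(stack))               # -> [9, 26, 60, 128] time steps

Under this assumption, the four convolutional layers see roughly 9, 26, 60 and 128 time steps of the original feature code respectively, so both short and long action segments contribute to the result.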
Brief description of the drawings
Fig. 1 is the model training flow chart of the present invention;
Fig. 2 is the workflow chart of the color-image deep learning network;
Fig. 3 is the workflow chart of the optical-flow deep learning network;
Fig. 4 is the workflow chart of the CNN network;
Fig. 5 is the action recognition flow chart of the present invention;
Fig. 6 is the workflow chart of the convolutional network.
Specific embodiment
As shown in Fig. 1, the training stage of the human motion recognition method of the present invention includes:
Step 1: acquiring an action video, splitting it into frames, computing optical-flow images, and extracting one frame every 16 frames as a center frame, with the action position marked (a data-preparation sketch follows this list);
Step 2: feeding the video sequence images into the image sequence processing unit, the center-frame image processing unit and the optical-flow sequence processing unit respectively, to form sequence picture samples and labels, center-frame picture samples and position labels, and sequence optical-flow picture samples and labels, for training the corresponding feature extraction models;
Step 3: feeding the sequence picture samples and labels into the color-image deep learning network, feeding the center-frame picture samples and position labels into the CNN network, and feeding the sequence optical-flow picture samples into the optical-flow deep learning network for feature extraction;
Step 4: fusing the features extracted by the above three network models to generate a feature code corresponding to the video sequence;
Step 5: feeding the feature code into a convolutional network, which varies the receptive field of the video sequence features over different time scales;
Step 6: feeding the feature code samples with different receptive fields into a video recognition network to generate a recognition model;
Step 7: iterating the training until the recognition model converges.
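The following is a minimal data-preparation sketch for step 1, assuming OpenCV; the Farneback optical-flow parameters are the library's standard example values and are only an assumption, and the manual marking of action positions is not shown.

    import cv2

    def prepare_video(path, center_stride=16):
        cap = cv2.VideoCapture(path)
        frames, flows = [], []
        ok, prev = cap.read()
        while ok:
            ok, cur = cap.read()
            if not ok:
                break
            prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
            cur_gray = cv2.cvtColor(cur, cv2.COLOR_BGR2GRAY)
            # dense optical flow between consecutive frames
            flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            frames.append(cur)
            flows.append(flow)
            prev = cur
        cap.release()
        centers = frames[::center_stride]         # one center frame every 16 frames
        return frames, flows, centers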
The image sequence processing unit, the center-frame image processing unit, the optical-flow sequence processing unit, the color-image deep learning network, the CNN network, the optical-flow deep learning network and the feature fusion unit together constitute the sequence feature extraction module.
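A minimal sketch of the feature fusion unit follows; the invention only states that the three branch features are fused into a feature code, so the concatenation strategy and the feature dimensions used here are assumptions.

    import torch

    def fuse_features(color_feat, flow_feat, position_feat):
        # color_feat, flow_feat: (batch, 200) LSTM outputs; position_feat: (batch, d) CNN output.
        # Concatenation yields a (batch, 400 + d) feature code for the video sequence.
        return torch.cat([color_feat, flow_feat, position_feat], dim=1)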
Because human actions are continuous while the acquired image frames are discrete, the historical information of previous frames is related to the current frame. The main framework of the deep learning networks is a CNN; on this basis the present invention constructs the color-image deep learning network and the optical-flow deep learning network. The CNN network uses an SSD network layer to extract the precise position information of the action in the key frame. As shown in Figs. 2 and 3, the color-image deep learning network adds three LSTM layers and the optical-flow deep learning network adds two LSTM layers, where each LSTM hidden layer has 200 neurons. With the LSTM layers added, the recognition method is able to learn from image sequences. Compared with algorithms that recognize using only single frames, the recognition method using the restructured deep learning networks can better exploit the temporal information of the video sequence and effectively improves detection accuracy.
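The patent does not specify an SSD implementation for the CNN branch; purely as an illustrative stand-in, the sketch below uses torchvision's ssd300_vgg16 detector to obtain candidate action positions from a center frame. The two-class setup (background plus one action class) is an assumption.

    import torch
    from torchvision.models.detection import ssd300_vgg16

    detector = ssd300_vgg16(num_classes=2)        # background + "action" (assumed classes)
    detector.eval()

    center_frame = torch.rand(3, 300, 300)        # one RGB center frame, values in [0, 1]
    with torch.no_grad():
        detections = detector([center_frame])     # list with one dict per input image

    boxes = detections[0]["boxes"]                # candidate action positions (x1, y1, x2, y2)
    scores = detections[0]["scores"]              # confidence score for each box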
As shown in Fig. 5, the recognition stage of the human motion recognition method of the present invention includes:
Step 1: acquiring an action video, splitting it into frames, computing optical-flow images, and extracting one frame every 16 frames as a center frame, with the action position marked;
Step 2: generating the feature code corresponding to the video sequence with the sequence feature extraction module;
Step 3: feeding the feature code into the convolutional network, which varies the receptive field of the video sequence features over different time scales;
Step 4: classifying the feature codes with different receptive fields;
Step 5: obtaining the human action recognition result.
As shown in Fig. 6, the convolutional network used in the training and recognition processes has a four-layer structure and is used to change the receptive field of the feature code; after the four convolutional layers the feature code has taken on four different receptive fields. The purpose of changing the receptive field is to let a part of the images within a sequence of a given length also participate in determining the detection result, i.e. the result is jointly determined by the whole feature code data and partial feature code data. The convolutional network is composed of temporal convolutions; each layer uses a one-dimensional convolution with a kernel size of 9 (conv9) and a stride of 1, and each convolutional layer is followed by a pooling layer.
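A sketch of this four-layer temporal convolutional network under the stated constraints (one-dimensional convolution, kernel size 9, stride 1, a pooling layer after each convolution) is given below; the channel width, the ReLU nonlinearity and the pooling size of 2 are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TemporalConvNet(nn.Module):
        def __init__(self, in_channels=1, channels=64):
            super().__init__()
            blocks, c_in = [], in_channels
            for _ in range(4):                                  # four convolutional layers
                blocks.append(nn.Sequential(
                    nn.Conv1d(c_in, channels, kernel_size=9, stride=1, padding=4),
                    nn.ReLU(),
                    nn.MaxPool1d(kernel_size=2)))               # each conv followed by pooling
                c_in = channels
            self.blocks = nn.ModuleList(blocks)

        def forward(self, feature_code):                        # (batch, 1, code_length)
            outputs, x = [], feature_code
            for block in self.blocks:
                x = block(x)
                outputs.append(x)                               # one receptive-field scale per layer
            return outputs                                      # four feature maps, four receptive fields

    # Usage: treat the fused feature code as a one-dimensional signal over time.
    code = torch.rand(1, 1, 512)                                # illustrative feature-code length
    multi_scale = TemporalConvNet()(code)                       # lengths 256, 128, 64, 32

Each of the four outputs can then be passed to the video recognition network, so that the classification is determined jointly by the whole feature code and partial feature code data.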
The above are only preferred embodiments of the present invention and are not intended to limit the invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.

Claims (6)

1. A human motion recognition method based on deep learning technology, characterized in that the human motion recognition method comprises a training stage and a recognition stage; the network used in the training and recognition stages includes a sequence feature extraction module; the sequence feature extraction module comprises a color-image deep learning network, an optical-flow deep learning network and a CNN network; the color-image deep learning network includes three LSTM layers and the optical-flow deep learning network includes two LSTM layers.
2. The human motion recognition method according to claim 1, characterized in that the number of neurons in the hidden layer of the LSTM layers is 200.
3. The human motion recognition method according to claim 1, characterized in that the training stage comprises the steps of:
Step 1: acquiring an action video, splitting it into frames, computing optical-flow images, and extracting one frame every 16 frames as a center frame, with the action position marked;
Step 2: processing the video sequence images to form sequence picture samples and labels, center-frame picture samples and position labels, and sequence optical-flow picture samples and labels, for training the corresponding feature extraction models;
Step 3: feeding the sequence picture samples and labels into the color-image deep learning network, feeding the center-frame picture samples and position labels into the CNN network, and feeding the sequence optical-flow picture samples into the optical-flow deep learning network for feature extraction;
Step 4: fusing the features extracted by the above three network models to generate a feature code corresponding to the video sequence;
Step 5: feeding the feature code into a convolutional network, which varies the receptive field of the video sequence features over different time scales;
Step 6: feeding the feature code samples with different receptive fields into a video recognition network to generate a recognition model;
Step 7: iterating the training until the recognition model converges.
4. The human motion recognition method according to claim 1, characterized in that in the recognition stage the feature code of the video sequence is generated by the sequence feature extraction module, and recognition is carried out after the convolutional network changes the receptive field of the feature code.
5. The human motion recognition method according to claim 3 or 4, characterized in that the convolutional network uses a four-layer structure.
6. The human motion recognition method according to claim 5, characterized in that each convolutional layer in the convolutional network uses a one-dimensional convolution with a stride of 1, and each convolutional layer is followed by a pooling layer.
CN201810766185.5A 2018-07-12 2018-07-12 Human body action recognition method Active CN108985223B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810766185.5A CN108985223B (en) 2018-07-12 2018-07-12 Human body action recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810766185.5A CN108985223B (en) 2018-07-12 2018-07-12 Human body action recognition method

Publications (2)

Publication Number Publication Date
CN108985223A 2018-12-11
CN108985223B 2024-05-07

Family

ID=64537893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810766185.5A Active CN108985223B (en) 2018-07-12 2018-07-12 Human body action recognition method

Country Status (1)

Country Link
CN (1) CN108985223B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104933417A (en) * 2015-06-26 2015-09-23 苏州大学 Behavior recognition method based on sparse spatial-temporal characteristics
CN106845351A (en) * 2016-05-13 2017-06-13 苏州大学 It is a kind of for Activity recognition method of the video based on two-way length mnemon in short-term
CN107273800A (en) * 2017-05-17 2017-10-20 大连理工大学 A kind of action identification method of the convolution recurrent neural network based on attention mechanism
CN107292247A (en) * 2017-06-05 2017-10-24 浙江理工大学 A kind of Human bodys' response method and device based on residual error network
CN107463949A (en) * 2017-07-14 2017-12-12 北京协同创新研究院 A kind of processing method and processing device of video actions classification
CN108229338A (en) * 2017-12-14 2018-06-29 华南理工大学 A kind of video behavior recognition methods based on depth convolution feature
CN108108699A (en) * 2017-12-25 2018-06-01 重庆邮电大学 Merge deep neural network model and the human motion recognition method of binary system Hash

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JEFF DONAHUE et al.: "Long-term Recurrent Convolutional Networks for Visual Recognition and Description", 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 15 October 2015 (2015-10-15), page 1 *
SHREYANK JYOTI et al.: "Expression Empowered ResiDen Network for Facial Action Unit Detection", arXiv, 14 June 2018 (2018-06-14), page 1 *
王昕培: "Research on abnormal behavior classification algorithms based on two-stream CNN" (基于双流CNN的异常行为分类算法研究), China Master's Theses Full-text Database, Information Science and Technology, vol. 2018, no. 2, pages 138-2191 *
阳平 et al.: "A sign language gesture recognition method based on fused multi-sensor information" (一种基于融合多传感器信息的手语手势识别方法), Space Medicine & Medical Engineering, vol. 25, no. 4, 31 August 2012 (2012-08-31) *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685213A (en) * 2018-12-29 2019-04-26 百度在线网络技术(北京)有限公司 A kind of acquisition methods, device and the terminal device of training sample data
CN110084259A (en) * 2019-01-10 2019-08-02 谢飞 A kind of facial paralysis hierarchical synthesis assessment system of combination face texture and Optical-flow Feature
CN110084259B (en) * 2019-01-10 2022-09-20 谢飞 Facial paralysis grading comprehensive evaluation system combining facial texture and optical flow characteristics
CN109902565A (en) * 2019-01-21 2019-06-18 深圳市烨嘉为技术有限公司 The Human bodys' response method of multiple features fusion
CN109919031A (en) * 2019-01-31 2019-06-21 厦门大学 A kind of Human bodys' response method based on deep neural network
CN109919031B (en) * 2019-01-31 2021-04-09 厦门大学 Human behavior recognition method based on deep neural network
CN110544301A (en) * 2019-09-06 2019-12-06 广东工业大学 Three-dimensional human body action reconstruction system, method and action training system
CN112257568A (en) * 2020-10-21 2021-01-22 中国人民解放军国防科技大学 Intelligent real-time supervision and error correction system and method for individual soldier queue actions
CN112257568B (en) * 2020-10-21 2022-09-20 中国人民解放军国防科技大学 Intelligent real-time supervision and error correction system and method for individual soldier queue actions

Also Published As

Publication number Publication date
CN108985223B (en) 2024-05-07

Similar Documents

Publication Publication Date Title
CN108985223A (en) A kind of human motion recognition method
Zhao et al. Single image action recognition using semantic body part actions
CN106960206A (en) Character identifying method and character recognition system
CN107967695B (en) A kind of moving target detecting method based on depth light stream and morphological method
CN110263833A (en) Based on coding-decoding structure image, semantic dividing method
CN108984530A (en) A kind of detection method and detection system of network sensitive content
CN106023220A (en) Vehicle exterior part image segmentation method based on deep learning
CN107122375A (en) The recognition methods of image subject based on characteristics of image
CN108427942A (en) A kind of palm detection based on deep learning and crucial independent positioning method
CN106778796A (en) Human motion recognition method and system based on hybrid cooperative model training
CN113723312B (en) Rice disease identification method based on visual transducer
CN110073369A (en) The unsupervised learning technology of time difference model
CN110503077A (en) A kind of real-time body's action-analysing method of view-based access control model
CN109977791A (en) A kind of hand physiologic information detection method
CN107909034A (en) A kind of method for detecting human face, device and computer-readable recording medium
CN109522961A (en) A kind of semi-supervision image classification method based on dictionary deep learning
CN108960171B (en) Method for converting gesture recognition into identity recognition based on feature transfer learning
CN105404865A (en) Probability state restricted Boltzmann machine cascade based face detection method
Li et al. Dating ancient paintings of Mogao Grottoes using deeply learnt visual codes
Narayanan et al. Yoga pose detection using deep learning techniques
CN108595014A (en) A kind of real-time dynamic hand gesture recognition system and method for view-based access control model
CN108717548A (en) A kind of increased Activity recognition model update method of facing sensing device dynamic and system
CN113705507B (en) Mixed reality open set human body gesture recognition method based on deep learning
CN103544468B (en) 3D facial expression recognizing method and device
CN110008847A (en) A kind of stroke recognition methods based on convolutional neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant