CN108985223A - Human motion recognition method - Google Patents

Human motion recognition method

Info

Publication number
CN108985223A
CN108985223A (application CN201810766185.5A)
Authority
CN
China
Prior art keywords
network
sequence
deep learning
optical flow
recognition method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810766185.5A
Other languages
Chinese (zh)
Other versions
CN108985223B (en)
Inventor
张德馨
史玉坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Isecure Technology Co ltd
Original Assignee
Tianjin Isecure Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Isecure Technology Co ltd
Priority to CN201810766185.5A
Publication of CN108985223A
Application granted
Publication of CN108985223B
Active legal status (current)
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a human motion recognition method based on deep learning technology. The human motion recognition method comprises a training stage and a recognition stage. The network used in the training and recognition stages includes a sequence feature extraction module, which comprises a color-image deep learning network, an optical-flow deep learning network and a CNN network; the color-image deep learning network includes three LSTM layers and the optical-flow deep learning network includes two LSTM layers. With the LSTM layers added, the recognition method is able to learn from image sequences, so the temporal information of the video sequence can be better exploited and detection accuracy is effectively improved. At the same time, a convolutional network with a four-layer structure is used in the deep learning network to change the receptive field of the feature code, so that a part of the images in the image sequence also participates in determining the detection result.

Description

Human motion recognition method
Technical field
The invention belongs to the field of machine learning, and in particular relates to a human motion recognition method.
Background art
Traditional human action recognition attaches acquisition devices such as biosensors or mechanical sensors to the human body. It is a contact-based motion detection method and can cause discomfort or a feeling of fatigue. With the development of technology, this recognition approach is gradually being replaced by image-based recognition methods.
The emergence of deep learning has brought breakthrough progress to machine learning and has also opened a new direction for human action recognition. Unlike traditional recognition methods, deep learning can automatically learn high-level features from low-level ones, which avoids the excessive dependence of feature selection on the task itself and the time-consuming tuning process.
Summary of the invention
In the prior art, human action recognition directly uses fully connected layers and performs detection on the entire feature. This causes problems: for example, when the action is fast, the picture-sequence length that contains the action is much smaller than the complete sequence length of the detection unit, and the action may then fail to be detected. Moreover, the prior art does not take the historical information of the sequence images into account, so detection accuracy still needs to be improved. On this basis, a human motion recognition method is designed, with the following technical solution:
A human motion recognition method based on deep learning technology, comprising a training stage and a recognition stage. The network used in the training and recognition stages includes a sequence feature extraction module, which comprises a color-image deep learning network, an optical-flow deep learning network and a CNN network; the color-image deep learning network includes three LSTM layers and the optical-flow deep learning network includes two LSTM layers.
Further, the number of neurons in the hidden layer of the LSTM layers is 200.
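For illustration only, the following is a minimal sketch of how the two LSTM branches described above could be realized, assuming PyTorch, per-frame CNN features of an arbitrary dimension (512 here) and a batch-first layout; none of these choices are fixed by the invention.

    import torch
    import torch.nn as nn

    class SequenceLSTMStack(nn.Module):
        """Stack of LSTM layers (hidden size 200) on top of per-frame CNN features."""
        def __init__(self, feat_dim=512, hidden_size=200, num_layers=3):
            super().__init__()
            self.lstm = nn.LSTM(input_size=feat_dim, hidden_size=hidden_size,
                                num_layers=num_layers, batch_first=True)

        def forward(self, frame_feats):           # (batch, time, feat_dim)
            out, _ = self.lstm(frame_feats)       # (batch, time, 200)
            return out[:, -1, :]                  # last time step as the sequence feature

    # Illustrative instantiation of the two branches:
    color_branch = SequenceLSTMStack(num_layers=3)   # three LSTM layers (color-image stream)
    flow_branch = SequenceLSTMStack(num_layers=2)    # two LSTM layers (optical-flow stream)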
Further, the training stage comprises the steps of:
Step 1: acquiring an action video, splitting it into frames, computing optical-flow images, and extracting one frame every 16 frames as a center frame, with the action position marked;
Step 2: processing the video sequence images to form sequence picture samples and labels, center-frame picture samples and position labels, and sequence optical-flow picture samples and labels, for training the corresponding feature extraction models;
Step 3: feeding the sequence picture samples and labels into the color-image deep learning network, feeding the center-frame picture samples and position labels into the CNN network, and feeding the sequence optical-flow picture samples into the optical-flow deep learning network for feature extraction;
Step 4: fusing the features extracted by the above three network models to generate a feature code corresponding to the video sequence;
Step 5: feeding the feature code into a convolutional network, which varies the receptive field of the video sequence features over different time scales;
Step 6: feeding the feature code samples with different receptive fields into a video recognition network to generate a recognition model;
Step 7: iterating the training until the recognition model converges.
Further, in the recognition stage the feature code of the video sequence is generated by the sequence feature extraction module; after the convolutional network changes the receptive field of the feature code, recognition and classification are carried out.
Further, the convolutional network uses a four-layer structure.
Compared with the prior art, the beneficial effects of the present invention are:
1. The redesigned deep learning network structure can better extract the features of the video sequence, so action recognition accuracy is high.
2. A four-layer convolutional network is used to vary the receptive field of the video sequence feature code; while recognition remains real-time, this effectively solves the problem that an action cannot be detected when the picture-sequence length containing the action is much smaller than the complete sequence length (an illustrative receptive-field calculation follows).
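To make the receptive-field argument concrete, the following small Python sketch computes how the temporal receptive field grows through four one-dimensional convolutions with kernel size 9 and stride 1, assuming purely for illustration that each convolution is followed by pooling with kernel size and stride 2; the pooling size is not specified by the invention.

    # Receptive-field bookkeeping for the four-layer temporal convolution stack.
    def receptive_fields(layers):
        rf, jump, per_conv = 1, 1, []
        for kind, k, s in layers:
            rf += (k - 1) * jump                 # enlarge by the kernel at the current effective stride
            jump *= s                            # accumulate the effective stride
            if kind == "conv":
                per_conv.append(rf)
        return per_conv

    stack = []
    for _ in range(4):                           # four conv layers, each followed by pooling
        stack.append(("conv", 9, 1))             # conv9, stride 1 (as stated in the description)
        stack.append(("pool", 2, 2))             # assumed pooling kernel/stride of 2
    print(receptive_fields(stack))               # -> [9, 26, 60, 128] time steps

Under this assumption, the four convolutional layers see roughly 9, 26, 60 and 128 time steps of the original feature code respectively, so both short and long action segments contribute to the result.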
Brief description of the drawings
Fig. 1 is the model training flow chart of the present invention;
Fig. 2 is the workflow chart of the color-image deep learning network;
Fig. 3 is the workflow chart of the optical-flow deep learning network;
Fig. 4 is the workflow chart of the CNN network;
Fig. 5 is the action recognition flow chart of the present invention;
Fig. 6 is the workflow chart of the convolutional network.
Specific embodiment
As shown in Fig. 1, the training stage of the human motion recognition method of the present invention includes:
Step 1: acquiring an action video, splitting it into frames, computing optical-flow images, and extracting one frame every 16 frames as a center frame, with the action position marked (a data-preparation sketch follows this list);
Step 2: feeding the video sequence images into the image sequence processing unit, the center-frame image processing unit and the optical-flow sequence processing unit respectively, to form sequence picture samples and labels, center-frame picture samples and position labels, and sequence optical-flow picture samples and labels, for training the corresponding feature extraction models;
Step 3: feeding the sequence picture samples and labels into the color-image deep learning network, feeding the center-frame picture samples and position labels into the CNN network, and feeding the sequence optical-flow picture samples into the optical-flow deep learning network for feature extraction;
Step 4: fusing the features extracted by the above three network models to generate a feature code corresponding to the video sequence;
Step 5: feeding the feature code into a convolutional network, which varies the receptive field of the video sequence features over different time scales;
Step 6: feeding the feature code samples with different receptive fields into a video recognition network to generate a recognition model;
Step 7: iterating the training until the recognition model converges.
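The following is a minimal data-preparation sketch for step 1, assuming OpenCV; the Farneback optical-flow parameters are the library's standard example values and are only an assumption, and the manual marking of action positions is not shown.

    import cv2

    def prepare_video(path, center_stride=16):
        cap = cv2.VideoCapture(path)
        frames, flows = [], []
        ok, prev = cap.read()
        while ok:
            ok, cur = cap.read()
            if not ok:
                break
            prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
            cur_gray = cv2.cvtColor(cur, cv2.COLOR_BGR2GRAY)
            # dense optical flow between consecutive frames
            flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            frames.append(cur)
            flows.append(flow)
            prev = cur
        cap.release()
        centers = frames[::center_stride]         # one center frame every 16 frames
        return frames, flows, centers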
The image sequence processing unit, the center-frame image processing unit, the optical-flow sequence processing unit, the color-image deep learning network, the CNN network, the optical-flow deep learning network and the feature fusion unit together constitute the sequence feature extraction module.
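A minimal sketch of the feature fusion unit follows; the invention only states that the three branch features are fused into a feature code, so the concatenation strategy and the feature dimensions used here are assumptions.

    import torch

    def fuse_features(color_feat, flow_feat, position_feat):
        # color_feat, flow_feat: (batch, 200) LSTM outputs; position_feat: (batch, d) CNN output.
        # Concatenation yields a (batch, 400 + d) feature code for the video sequence.
        return torch.cat([color_feat, flow_feat, position_feat], dim=1)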
Because human actions are continuous while the acquired image frames are discrete, the historical information of previous frames is related to the current frame. The main framework of the deep learning networks is a CNN; on this basis the present invention constructs the color-image deep learning network and the optical-flow deep learning network. The CNN network uses an SSD network layer to extract the precise position information of the action in the key frame. As shown in Figs. 2 and 3, the color-image deep learning network adds three LSTM layers and the optical-flow deep learning network adds two LSTM layers, where each LSTM hidden layer has 200 neurons. With the LSTM layers added, the recognition method is able to learn from image sequences. Compared with algorithms that recognize using only single frames, the recognition method using the restructured deep learning networks can better exploit the temporal information of the video sequence and effectively improves detection accuracy.
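The patent does not specify an SSD implementation for the CNN branch; purely as an illustrative stand-in, the sketch below uses torchvision's ssd300_vgg16 detector to obtain candidate action positions from a center frame. The two-class setup (background plus one action class) is an assumption.

    import torch
    from torchvision.models.detection import ssd300_vgg16

    detector = ssd300_vgg16(num_classes=2)        # background + "action" (assumed classes)
    detector.eval()

    center_frame = torch.rand(3, 300, 300)        # one RGB center frame, values in [0, 1]
    with torch.no_grad():
        detections = detector([center_frame])     # list with one dict per input image

    boxes = detections[0]["boxes"]                # candidate action positions (x1, y1, x2, y2)
    scores = detections[0]["scores"]              # confidence score for each box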
As shown in Fig. 5, the recognition stage of the human motion recognition method of the present invention includes:
Step 1: acquiring an action video, splitting it into frames, computing optical-flow images, and extracting one frame every 16 frames as a center frame, with the action position marked;
Step 2: generating the feature code corresponding to the video sequence with the sequence feature extraction module;
Step 3: feeding the feature code into the convolutional network, which varies the receptive field of the video sequence features over different time scales;
Step 4: classifying the feature codes with different receptive fields;
Step 5: obtaining the human action recognition result.
As shown in Fig. 6, the convolutional network used in the training and recognition processes has a four-layer structure and is used to change the receptive field of the feature code; after the four convolutional layers the feature code has taken on four different receptive fields. The purpose of changing the receptive field is to let a part of the images within a sequence of a given length also participate in determining the detection result, i.e. the result is jointly determined by the whole feature code data and partial feature code data. The convolutional network is composed of temporal convolutions; each layer uses a one-dimensional convolution with a kernel size of 9 (conv9) and a stride of 1, and each convolutional layer is followed by a pooling layer.
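A sketch of this four-layer temporal convolutional network under the stated constraints (one-dimensional convolution, kernel size 9, stride 1, a pooling layer after each convolution) is given below; the channel width, the ReLU nonlinearity and the pooling size of 2 are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TemporalConvNet(nn.Module):
        def __init__(self, in_channels=1, channels=64):
            super().__init__()
            blocks, c_in = [], in_channels
            for _ in range(4):                                  # four convolutional layers
                blocks.append(nn.Sequential(
                    nn.Conv1d(c_in, channels, kernel_size=9, stride=1, padding=4),
                    nn.ReLU(),
                    nn.MaxPool1d(kernel_size=2)))               # each conv followed by pooling
                c_in = channels
            self.blocks = nn.ModuleList(blocks)

        def forward(self, feature_code):                        # (batch, 1, code_length)
            outputs, x = [], feature_code
            for block in self.blocks:
                x = block(x)
                outputs.append(x)                               # one receptive-field scale per layer
            return outputs                                      # four feature maps, four receptive fields

    # Usage: treat the fused feature code as a one-dimensional signal over time.
    code = torch.rand(1, 1, 512)                                # illustrative feature-code length
    multi_scale = TemporalConvNet()(code)                       # lengths 256, 128, 64, 32

Each of the four outputs can then be passed to the video recognition network, so that the classification is determined jointly by the whole feature code and partial feature code data.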
The above are only preferred embodiments of the present invention and are not intended to limit the invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.

Claims (6)

1. A human motion recognition method based on deep learning technology, characterized in that the human motion recognition method comprises a training stage and a recognition stage; the network used in the training and recognition stages includes a sequence feature extraction module; the sequence feature extraction module comprises a color-image deep learning network, an optical-flow deep learning network and a CNN network; the color-image deep learning network includes three LSTM layers and the optical-flow deep learning network includes two LSTM layers.
2. The human motion recognition method according to claim 1, characterized in that the number of neurons in the hidden layer of the LSTM layers is 200.
3. The human motion recognition method according to claim 1, characterized in that the training stage comprises the steps of:
Step 1: acquiring an action video, splitting it into frames, computing optical-flow images, and extracting one frame every 16 frames as a center frame, with the action position marked;
Step 2: processing the video sequence images to form sequence picture samples and labels, center-frame picture samples and position labels, and sequence optical-flow picture samples and labels, for training the corresponding feature extraction models;
Step 3: feeding the sequence picture samples and labels into the color-image deep learning network, feeding the center-frame picture samples and position labels into the CNN network, and feeding the sequence optical-flow picture samples into the optical-flow deep learning network for feature extraction;
Step 4: fusing the features extracted by the above three network models to generate a feature code corresponding to the video sequence;
Step 5: feeding the feature code into a convolutional network, which varies the receptive field of the video sequence features over different time scales;
Step 6: feeding the feature code samples with different receptive fields into a video recognition network to generate a recognition model;
Step 7: iterating the training until the recognition model converges.
4. The human motion recognition method according to claim 1, characterized in that in the recognition stage the feature code of the video sequence is generated by the sequence feature extraction module, and recognition is carried out after the convolutional network changes the receptive field of the feature code.
5. The human motion recognition method according to claim 3 or 4, characterized in that the convolutional network uses a four-layer structure.
6. The human motion recognition method according to claim 5, characterized in that each convolutional layer in the convolutional network uses a one-dimensional convolution with a stride of 1, and each convolutional layer is followed by a pooling layer.
CN201810766185.5A 2018-07-12 2018-07-12 Human body action recognition method Active CN108985223B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810766185.5A CN108985223B (en) 2018-07-12 2018-07-12 Human body action recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810766185.5A CN108985223B (en) 2018-07-12 2018-07-12 Human body action recognition method

Publications (2)

Publication Number Publication Date
CN108985223A 2018-12-11
CN108985223B 2024-05-07

Family

ID=64537893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810766185.5A Active CN108985223B (en) 2018-07-12 2018-07-12 Human body action recognition method

Country Status (1)

Country Link
CN (1) CN108985223B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104933417A (en) * 2015-06-26 2015-09-23 苏州大学 Behavior recognition method based on sparse spatial-temporal characteristics
CN106845351A (en) * 2016-05-13 2017-06-13 苏州大学 It is a kind of for Activity recognition method of the video based on two-way length mnemon in short-term
CN107273800A (en) * 2017-05-17 2017-10-20 大连理工大学 A kind of action identification method of the convolution recurrent neural network based on attention mechanism
CN107292247A (en) * 2017-06-05 2017-10-24 浙江理工大学 A kind of Human bodys' response method and device based on residual error network
CN107463949A (en) * 2017-07-14 2017-12-12 北京协同创新研究院 A kind of processing method and processing device of video actions classification
CN108229338A (en) * 2017-12-14 2018-06-29 华南理工大学 A kind of video behavior recognition methods based on depth convolution feature
CN108108699A (en) * 2017-12-25 2018-06-01 重庆邮电大学 Merge deep neural network model and the human motion recognition method of binary system Hash

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JEFF DONAHUE et al.: "Long-term Recurrent Convolutional Networks for Visual Recognition and Description", 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 15 October 2015 (2015-10-15), page 1 *
SHREYANK JYOTI et al.: "Expression Empowered ResiDen Network for Facial Action Unit Detection", arXiv, 14 June 2018 (2018-06-14), page 1 *
王昕培: "Research on abnormal behavior classification algorithms based on two-stream CNN" (基于双流CNN的异常行为分类算法研究), China Master's Theses Full-text Database, Information Science and Technology, vol. 2018, no. 2, pages 138-2191 *
阳平 et al.: "A sign language gesture recognition method based on fused multi-sensor information" (一种基于融合多传感器信息的手语手势识别方法), Space Medicine & Medical Engineering, vol. 25, no. 4, 31 August 2012 (2012-08-31) *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685213A (en) * 2018-12-29 2019-04-26 百度在线网络技术(北京)有限公司 A kind of acquisition methods, device and the terminal device of training sample data
CN110084259A (en) * 2019-01-10 2019-08-02 谢飞 A kind of facial paralysis hierarchical synthesis assessment system of combination face texture and Optical-flow Feature
CN110084259B (en) * 2019-01-10 2022-09-20 谢飞 Facial paralysis grading comprehensive evaluation system combining facial texture and optical flow characteristics
CN109902565A (en) * 2019-01-21 2019-06-18 深圳市烨嘉为技术有限公司 The Human bodys' response method of multiple features fusion
CN109919031A (en) * 2019-01-31 2019-06-21 厦门大学 A kind of Human bodys' response method based on deep neural network
CN109919031B (en) * 2019-01-31 2021-04-09 厦门大学 Human behavior recognition method based on deep neural network
CN110544301A (en) * 2019-09-06 2019-12-06 广东工业大学 Three-dimensional human body action reconstruction system, method and action training system
CN112257568A (en) * 2020-10-21 2021-01-22 中国人民解放军国防科技大学 Intelligent real-time supervision and error correction system and method for individual soldier queue actions
CN112257568B (en) * 2020-10-21 2022-09-20 中国人民解放军国防科技大学 Intelligent real-time supervision and error correction system and method for individual soldier queue actions

Also Published As

Publication number Publication date
CN108985223B (en) 2024-05-07

Similar Documents

Publication Publication Date Title
CN108985223A (en) A kind of human motion recognition method
Zhao et al. Single image action recognition using semantic body part actions
CN106960206A (en) Character identifying method and character recognition system
CN107967695B (en) A kind of moving target detecting method based on depth light stream and morphological method
CN110263833A (en) Based on coding-decoding structure image, semantic dividing method
CN108984530A (en) A kind of detection method and detection system of network sensitive content
CN106023220A (en) Vehicle exterior part image segmentation method based on deep learning
CN107122375A (en) The recognition methods of image subject based on characteristics of image
CN108427942A (en) A kind of palm detection based on deep learning and crucial independent positioning method
CN106778796A (en) Human motion recognition method and system based on hybrid cooperative model training
CN113723312B (en) Rice disease identification method based on visual transducer
CN110073369A (en) The unsupervised learning technology of time difference model
CN110503077A (en) A kind of real-time body's action-analysing method of view-based access control model
CN109977791A (en) A kind of hand physiologic information detection method
CN107909034A (en) A kind of method for detecting human face, device and computer-readable recording medium
CN109522961A (en) A kind of semi-supervision image classification method based on dictionary deep learning
CN108960171B (en) Method for converting gesture recognition into identity recognition based on feature transfer learning
CN105404865A (en) Probability state restricted Boltzmann machine cascade based face detection method
Li et al. Dating ancient paintings of Mogao Grottoes using deeply learnt visual codes
Narayanan et al. Yoga pose detection using deep learning techniques
CN108595014A (en) A kind of real-time dynamic hand gesture recognition system and method for view-based access control model
CN108717548A (en) A kind of increased Activity recognition model update method of facing sensing device dynamic and system
CN113705507B (en) Mixed reality open set human body gesture recognition method based on deep learning
CN103544468B (en) 3D facial expression recognizing method and device
CN110008847A (en) A kind of stroke recognition methods based on convolutional neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant