Summary of the Invention
The purpose of the present invention is to address the problem that prior-art gait recognition performs poorly in real scenes, by proposing an integrated gait segmentation and gait recognition method that adapts to complex backgrounds and a variety of dressing conditions and is capable of directly recognizing identity from gait.
The present invention is achieved as follows: an integrated gait segmentation and gait recognition method based on deep learning, the method comprising:
normalizing the images used for human-shape segmentation training in a human-shape segmentation database, together with the corresponding human-shape segmentation label images, to the same pixel size, thereby obtaining paired samples of training images and segmentation label images;
feeding N of the images, together with the corresponding human-shape segmentation label images, into an N-channel fully convolutional neural network each time, to obtain N image representations one, each of the same size as the label images and representing a predicted human-contour segmentation; using the back-propagation algorithm and stochastic gradient descent to reduce the prediction error obtained by comparing image representation one with the corresponding segmentation label images, thereby training the N-channel fully convolutional neural network; after multiple training iterations, obtaining an N-channel segmentation convolutional neural network model for gait segmentation, and saving a copy of this model as a fixed segmentation-label generator;
randomly selecting N gait images from each selected gait video segment each time and feeding them into the N-channel segmentation convolutional neural network model to obtain N image representations two, each representing a predicted human-contour segmentation; the identity serial number corresponding to each video segment is used as its label;
taking the obtained N image representations two as input, and the identity serial number of the selected gait video segment as output, using the back-propagation algorithm and stochastic gradient descent to reduce the error between the predicted and the actual gait identity, thereby iteratively training a classification convolutional neural network model for gait recognition until the model converges;
connecting the output of the trained N-channel segmentation convolutional neural network model to the input of the classification convolutional neural network model, thereby forming an integrated gait segmentation and gait recognition model whose output is the gait identity prediction result;
randomly selecting N gait images from each selected gait video segment each time and feeding them into the N-channel segmentation convolutional neural network model to obtain generated label information, i.e. the corresponding predicted human-contour segmentation images; then, taking the N gait images as input and the corresponding predicted segmentation images together with the identity serial number as supervision, jointly training the integrated gait segmentation and gait recognition model with the back-propagation algorithm and stochastic gradient descent until the integrated model converges;
during testing, randomly selecting N images from a gait video segment and feeding them into the trained integrated gait segmentation and gait recognition model; the serial number of the node with the maximum response in the soft-max classifier of the integrated model is taken as the predicted identity serial number.
Wherein, each channel of the N-channel fully convolutional neural network model comprises multiple identically configured convolutional layers and one deconvolution layer connected to the last of those convolutional layers.
Wherein, the classification convolutional neural network model comprises multiple convolutional layers, at least one fully connected layer connected to the last convolutional layer, and an output layer connected to the last fully connected layer, namely a soft-max classifier.
The present invention first trains an N-channel segmentation convolutional neural network model, based on multi-layer convolutional neural networks, using human-shape images with human-shape segmentation label images; it then uses this model to segment multiple frames randomly taken from a gait video segment, and trains a classification convolutional neural network model on the resulting human-contour segmentations to perform identity recognition; finally, the N-channel segmentation model and the classification model are jointly trained, yielding a more accurate integrated gait segmentation and gait recognition model, so that recognition from gait to identity can be carried out directly with the integrated model.
The proposed integrated gait segmentation and gait recognition model can be learned jointly, updating the N-channel segmentation convolutional neural network model and the classification convolutional neural network model simultaneously and thereby producing more accurate gait recognition results.
By training on human-shape segmentation label images captured against a large number of complex backgrounds, the N-channel segmentation convolutional neural network model of the present invention achieves accurate human-contour segmentation under a variety of backgrounds, solving the gait segmentation problem of real environments with complex dynamic backgrounds; these accurate segmentation results can further be recognized directly as gait identities by the classifier formed from the classification convolutional neural network model, and the integration of segmentation and recognition learning significantly speeds up gait recognition.
Embodiment
Below, the technical scheme of the present invention is described in further detail with reference to the drawings and embodiments.
The integrated gait segmentation and gait recognition method based on deep learning provided by the invention uses deep-learning techniques to jointly train an N-channel segmentation convolutional neural network model (the gait segmentation model) and a classification convolutional neural network model (the gait recognition model): the multi-channel gait segmentation model is trained first, the gait recognition model is trained next, and joint training is carried out last, so that high accuracy and speed are achieved on gait recognition tasks in real scenes.
The following illustration takes a certain large-scale gait recognition database as an example. The database contains gait video sequences of 138 people, about 36 video segments per person, covering different viewing angles, backgrounds and clothing; the human-shape segmentation database used to initialize the gait segmentation model contains about 5,000 images and the corresponding human-shape segmentation label images.
As shown in Figure 1, the integrated gait segmentation and gait recognition method based on deep learning of the present invention comprises integrated-model training steps and a testing step performed with the trained integrated model (steps S1-S10 are the integrated-model training steps, and S11 is the testing step performed with the trained integrated model). The specific steps are as follows:
Step S1: normalize the 5,000 training images in the human-shape segmentation database to the same pixel size (e.g. 48*48 pixels); the corresponding human-shape segmentation label images (also called foreground/background segmentation images, i.e. images in which the human contour is labeled) undergo the same operation and are normalized to 48*48 pixels. This yields the paired samples of training images and human-shape segmentation label images, 5,000 pairs in total;
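The normalization of step S1 can be sketched as follows. This is a minimal numpy sketch; nearest-neighbour sampling and the array-based image format are assumptions, since the step only specifies the 48*48 target size.

```python
import numpy as np

def normalize_pair(image, mask, size=48):
    """Resize a training image and its human-shape segmentation label
    image to size x size with nearest-neighbour sampling, forming one
    paired sample as required by step S1."""
    def resize(a):
        h, w = a.shape[:2]
        rows = np.arange(size) * h // size   # source row for each target row
        cols = np.arange(size) * w // size   # source column for each target column
        return a[rows][:, cols]
    return resize(image), resize(mask)

image = np.random.rand(120, 80)              # a training image of arbitrary size
mask = (image > 0.5).astype(np.float32)      # its segmentation label image
img48, mask48 = normalize_pair(image, mask)
print(img48.shape, mask48.shape)             # (48, 48) (48, 48)
```

Applying this to all 5,000 image/label pairs produces the normalized paired training set.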
Step S2: randomly select 3 pairs of image samples each time, i.e. 3 training images and the 3 corresponding human-shape segmentation label images, and feed them in order into the 3-channel fully convolutional segmentation neural network model; after several convolutional layers and a deconvolution layer, the last layer produces image representation one (the segmentation prediction image), of the same size as the segmentation label image, which is compared with the corresponding label image to obtain the prediction error;
For example, the parameter configuration of one channel of a typical 3-channel, 4-layer fully convolutional neural network is: the first 3 layers are convolutional layers, of which the first layer has 64 5×5 convolution kernels with stride 1, followed by local spatial pooling with a 3×3 window and stride 2; the second layer has 64 5×5 convolution kernels with stride 1, followed by local spatial pooling with a 3×3 window and stride 2; the third layer has 64 3×3 convolution kernels with stride 1; the 4th layer is a deconvolution layer containing one 48×48 deconvolution kernel with stride 1, and this final deconvolution layer yields one segmentation prediction image (of size 48*48). The other 2 channels have the same configuration as this channel, so the network can take 3 images as input simultaneously and produce 3 segmentation prediction images, i.e. image representation one.
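Under these parameters, one channel of the fully convolutional network can be sketched in PyTorch roughly as follows. The padding values are assumptions (the text does not specify them), and a 4×4 deconvolution kernel with stride 4 is substituted for the stated 48×48 stride-1 kernel so that, with the assumed padding, the output recovers the 48×48 resolution; this is a sketch, not a definitive reconstruction.

```python
import torch
import torch.nn as nn

class SegChannel(nn.Module):
    """One channel of the 3-channel fully convolutional segmentation
    network: three convolutional layers with pooling, then one
    deconvolution layer restoring the 48x48 resolution."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, 5, stride=1, padding=2),   # 64 5x5 kernels
            nn.MaxPool2d(3, stride=2, padding=1),       # 3x3 pooling, stride 2 -> 24x24
            nn.Conv2d(64, 64, 5, stride=1, padding=2),  # 64 5x5 kernels
            nn.MaxPool2d(3, stride=2, padding=1),       # -> 12x12
            nn.Conv2d(64, 64, 3, stride=1, padding=1),  # 64 3x3 kernels
        )
        # Deconvolution back to one 48x48 segmentation prediction image.
        self.deconv = nn.ConvTranspose2d(64, 1, kernel_size=4, stride=4)

    def forward(self, x):
        return self.deconv(self.features(x))

channel = SegChannel()
pred = channel(torch.randn(1, 1, 48, 48))    # one 48x48 gait image in
print(pred.shape)                            # torch.Size([1, 1, 48, 48])
```

Three such channels, run on 3 input images, yield the 3 segmentation prediction images of image representation one.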
It should be noted that the fully convolutional segmentation neural network model may have 3 channels, 4 channels, or any other number of channels; correspondingly, when the model has some other number of channels, the number of image-sample pairs randomly selected each time matches that channel count;
Step S3: use the back-propagation algorithm and stochastic gradient descent to reduce the prediction error obtained by comparing image representation one with the corresponding human-shape segmentation label images, thereby training the fully convolutional segmentation network model; train over multiple iterations until the prediction error no longer decreases, yielding the 3-channel segmentation convolutional neural network model (i.e. the 3-channel gait segmentation model);
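The training loop of step S3 might look like the following sketch. The tiny stand-in network and the choice of binary cross-entropy as the pixel-wise error are assumptions; the step itself only prescribes back-propagation with stochastic gradient descent on the segmentation prediction error.

```python
import torch
import torch.nn as nn

# Tiny stand-in for one segmentation channel (assumption for brevity;
# the real layer configuration is given in the embodiment above).
net = nn.Sequential(
    nn.Conv2d(1, 8, 5, padding=2), nn.ReLU(),
    nn.Conv2d(8, 1, 3, padding=1),
)
criterion = nn.BCEWithLogitsLoss()                    # pixel-wise prediction error
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

images = torch.randn(3, 1, 48, 48)                    # 3 training images
labels = torch.randint(0, 2, (3, 1, 48, 48)).float()  # 3 segmentation label images

for step in range(100):               # iterate until the error stops decreasing
    optimizer.zero_grad()
    loss = criterion(net(images), labels)   # compare prediction with labels
    loss.backward()                         # back-propagation
    optimizer.step()                        # stochastic gradient descent update
```

In practice the loop would draw a fresh batch of 3 pairs per iteration and stop when the prediction error plateaus.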
Step S4: save a copy of the 3-channel segmentation convolutional neural network model from S3 as a fixed segmentation-label generator;
Step S5: randomly select one segment from all the gait videos each time, and take the identity serial number corresponding to that video as the class label; for example, if a video of the 26th person is chosen, the identity serial number is 26 (for the gait videos of 138 people, there are 138 serial numbers in total). Randomly select 3 gait images from the chosen video of the 26th person and feed them into the 3-channel segmentation convolutional neural network model formed in S3 to obtain 3 image representations two, i.e. the human-contour segmentation results (also called segmentation prediction images);
Step S6: take the 3 human-contour segmentation results obtained in S5 as input, and the gait identity serial number (26) of the video selected in S5 as the classification output, and iteratively train a classification convolutional neural network model for gait recognition that outputs the gait identity prediction; the output layer of the classification convolutional neural network model is a soft-max classifier, and the serial number of the node with the maximum response corresponds to the identity serial number;
In a specific implementation, the classification convolutional neural network model may have 5 layers, e.g. 3 convolutional layers for feature extraction followed by 2 fully connected layers, with the last layer connected to a soft-max classifier that produces the gait identity prediction; the serial number of the node with the maximum response corresponds to the identity serial number;
The structure of the classification convolutional neural network may, for example, be: the input is a 3-channel image of size 48*48; the first layer has 64 5×5 convolution kernels with stride 1, followed by local spatial pooling with a 3×3 window and stride 2; the second layer has 64 5×5 convolution kernels with stride 1, followed by local spatial pooling with a 3×3 window and stride 2; the third layer has 64 3×3 convolution kernels with stride 1; the 4th and 5th layers are fully connected layers containing 1000 and 138 nodes respectively; the 5th layer is followed by a soft-max classifier that produces the corresponding 138 responses, and the node number with the maximum response is taken as the identity prediction. For example, if the 26th node's response is the largest, the gait is predicted to belong to the 26th person.
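The described 5-layer classification network can be sketched in PyTorch as follows. The padding values are assumptions; the layer sizes follow the text (64 kernels per convolutional layer, fully connected layers of 1000 and 138 nodes).

```python
import torch
import torch.nn as nn

class GaitClassifier(nn.Module):
    """5-layer classification network: 3 convolutional layers with
    pooling, 2 fully connected layers, soft-max over 138 identities."""
    def __init__(self, num_ids=138):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, 5, padding=2), nn.ReLU(),
            nn.MaxPool2d(3, stride=2, padding=1),    # 48 -> 24
            nn.Conv2d(64, 64, 5, padding=2), nn.ReLU(),
            nn.MaxPool2d(3, stride=2, padding=1),    # 24 -> 12
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 12 * 12, 1000), nn.ReLU(),
            nn.Linear(1000, num_ids),                # 138 response nodes
        )

    def forward(self, x):
        return self.fc(self.conv(x))

clf = GaitClassifier()
logits = clf(torch.randn(1, 3, 48, 48))   # 3 segmentation results as channels
probs = torch.softmax(logits, dim=1)      # soft-max classifier responses
identity = probs.argmax(dim=1)            # node with the maximum response
```

The predicted identity serial number is the arg-max node index (plus one, when serial numbers are counted from 1).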
Step S7: use the back-propagation algorithm and stochastic gradient descent to reduce the error between the predicted and the actual gait identity, thereby training the classification convolutional neural network; train over multiple iterations until the error no longer decreases, yielding the classification convolutional neural network model (i.e. the gait recognition model);
Step S8: connect the output of the 3-channel segmentation convolutional neural network model trained for gait segmentation in S3 to the input of the classification convolutional neural network model for gait recognition in S6, forming an integrated gait segmentation and gait recognition model; the model has 3 channels and 9 layers in total, its input is 3 gait images of size 48*48, and its output is the gait identity prediction result.
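The connection of step S8 can be sketched as follows. The tiny stand-in sub-networks (one shared segmentation channel and a linear classifier) are assumptions for brevity, standing in for the layer configurations given above; the point of the sketch is only the wiring of segmentation output to classifier input.

```python
import torch
import torch.nn as nn

class IntegratedModel(nn.Module):
    """Sketch of the integrated gait segmentation and gait recognition
    model: segmentation outputs feed the classifier's input."""
    def __init__(self, num_ids=138):
        super().__init__()
        self.seg = nn.Conv2d(1, 1, 3, padding=1)         # stand-in segmenter
        self.cls = nn.Sequential(                        # stand-in classifier
            nn.Flatten(), nn.Linear(3 * 48 * 48, num_ids))

    def forward(self, frames):           # frames: (batch, 3, 48, 48)
        # Each of the 3 gait images passes through the segmentation
        # channel; the 3 predicted contours become the classifier input.
        contours = torch.cat(
            [self.seg(frames[:, i:i + 1]) for i in range(3)], dim=1)
        return self.cls(contours)

model = IntegratedModel()
out = model(torch.randn(2, 3, 48, 48))   # 2 samples of 3 gait images each
print(out.shape)                         # torch.Size([2, 138])
```

In the patent's model each of the 3 channels is its own identically configured network rather than a shared module.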
Step S9: randomly select one segment from all the gait videos each time, and take the identity serial number corresponding to that video as the class label; for example, if a video of the 26th person is chosen, the identity serial number is 26 (for the gait videos of 138 people, there are 138 serial numbers in total). Randomly select 3 gait images from the chosen video of the 26th person and feed them into the segmentation convolutional neural network model saved in S4 (the segmentation-label generator) to obtain the generated label information of the corresponding human contours.
Step S10: take the 3 gait images from S9 as input, and the corresponding predicted human-contour segmentation images from S9 (i.e. image representation two) together with the identity serial number as supervision, and jointly train the integrated gait segmentation and gait recognition model of S8 using the back-propagation algorithm and stochastic gradient descent until the model converges;
Specifically, there are 2 errors between the gait identity label (expressed as the gait identity serial number) and the gait identity prediction, which are used to correct the classification convolutional neural network model and the segmentation convolutional neural network model respectively; meanwhile, there is 1 error between the label information generated by the segmentation convolutional neural network model of S9 (the segmentation-label generator) and the predicted segmentation images, which is used to correct the segmentation convolutional neural network. In total, 3 back-propagated errors jointly correct the integrated gait segmentation and gait recognition model.
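The three error signals can be sketched as one joint loss. The stand-in modules, the mean-squared error against the generated labels, and cross-entropy for the identity are assumptions; the text only specifies which errors correct which sub-network.

```python
import torch
import torch.nn as nn

# Stand-in segmentation and classification modules (assumptions for brevity).
seg = nn.Conv2d(1, 1, 3, padding=1)
cls = nn.Sequential(nn.Flatten(), nn.Linear(3 * 48 * 48, 138))

frames = torch.randn(1, 3, 48, 48)       # 3 gait images from one video (S9)
gen_labels = torch.rand(1, 3, 48, 48)    # generated label info from the S4 generator
identity = torch.tensor([25])            # serial number 26, counted from 0

contours = torch.cat([seg(frames[:, i:i + 1]) for i in range(3)], dim=1)
logits = cls(contours)

# Error between generated labels and predicted segmentation: corrects seg.
seg_loss = nn.functional.mse_loss(contours, gen_labels)
# Error between identity label and identity prediction: its gradient
# corrects cls and, flowing back through the contours, corrects seg too.
cls_loss = nn.functional.cross_entropy(logits, identity)

(seg_loss + cls_loss).backward()         # one joint training step
```

After `backward()`, both sub-networks hold gradients, reflecting the three error paths described above.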
Step S11: as shown in Figure 2, during testing one gait video segment is randomly selected from all the videos of the 138 people (e.g. a video of the 10th person), 3 images are randomly selected from it, and the images are fed into the trained integrated model; the soft-max classifier of the classification convolutional neural network model produces a 138-dimensional output, and if the node with the maximum response falls at the 10th dimension, serial number 10 is taken as the prediction of the identity serial number. This completes the integrated process from gait video to identity recognition.
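Reading the identity off the 138 soft-max responses amounts to an arg-max, as this minimal numpy sketch shows:

```python
import numpy as np

def predict_identity(responses):
    """Take the 138-dimensional output of the soft-max classifier and
    return the 1-based identity serial number with maximum response."""
    exp = np.exp(responses - responses.max())   # numerically stable soft-max
    probs = exp / exp.sum()
    return int(np.argmax(probs)) + 1            # node index -> serial number

scores = np.zeros(138)
scores[9] = 5.0                                 # 10th node responds most strongly
print(predict_identity(scores))                 # 10
```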
The specific process of step S11 is: first, the multi-channel neural network segmentation model performs human-contour segmentation on several gait images from the input gait video segment, yielding the human-contour segmentations of multiple gait images in that segment; then the obtained human contours are passed to the classification convolutional neural network model for identity recognition, and its soft-max classifier outputs the identity recognition result.
This method is highly robust to scene changes, clothing changes, image/video viewing angles and walking states, and is particularly suited to gait recognition under dynamic backgrounds; it can reach very high recognition accuracy in real-world gait recognition. Because segmentation and recognition are integrated in one framework, the method is also fast, making it suitable for real-time gait recognition under practical surveillance.
The present invention uses a multi-channel segmentation convolutional neural network model to obtain the human-contour segmentation results of multiple gait images in a gait video segment simultaneously, and then performs identity recognition on the obtained human contours with a classification convolutional neural network model. The multi-channel segmentation convolutional neural network model and the classification convolutional neural network model used for recognition can be learned jointly within one framework, constituting an integrated framework whose input is several gait images and whose output is the identity recognition result.
The method of the invention is highly robust to scene changes, clothing changes, image/video viewing angles and walking states, is particularly suited to solving gait recognition under dynamic backgrounds, and can therefore reach very high recognition accuracy in real-world gait recognition; because a framework integrating segmentation and recognition is adopted, the method is also fast and is suitable for real-time gait recognition under practical surveillance. It can be widely applied in video-surveillance scenarios, such as airport and customs security monitoring, identity verification, company attendance, and criminal detection.