Summary of the invention
The object of the invention is to address the problems that prior-art gait recognition encounters when run in real scenes, by providing an integrated gait segmentation and gait recognition method that adapts to complex backgrounds and varied clothing conditions and recognizes gait identity directly.
The invention is realized as follows: an integrated gait segmentation and gait recognition method based on deep learning, the method comprising:
Normalizing the images used for human segmentation training in a human segmentation database, together with their corresponding human segmentation annotation images, to the same pixel size, to obtain paired samples of training images and human segmentation annotation images;
Feeding the images and their corresponding human segmentation annotation images, N at a time, into an N-channel fully convolutional neural network, to obtain image representation 1, which consists of N predicted human contour segmentation results of the same size as the human segmentation annotation images; using the back-propagation algorithm and stochastic gradient descent to reduce the prediction error obtained by comparing image representation 1 with the corresponding human segmentation annotation images, thereby training the N-channel fully convolutional neural network; obtaining, through multiple training iterations, an N-channel segmentation convolutional neural network model for gait segmentation; and saving a copy of this N-channel segmentation convolutional neural network model as a fixed segmentation annotation generator;
Randomly selecting N gait images at a time from each selected gait video and feeding them into the N-channel segmentation convolutional neural network model, to obtain N image representations 2 representing the predicted human contour segmentation results, each gait video corresponding to one identity number used for identification;
Using the N image representations 2 thus obtained as input and the identity number of each selected gait video as output, and using the back-propagation algorithm and stochastic gradient descent to reduce the error between the predicted gait identity and the actual gait identity, iteratively training a classification convolutional neural network model for gait recognition until the model converges;
Connecting the output of the trained N-channel segmentation convolutional neural network model to the input of the classification convolutional neural network model, to form an integrated gait segmentation and gait recognition model whose output is the gait identity prediction;
Randomly selecting N gait images at a time from each selected gait video and feeding them into the N-channel segmentation convolutional neural network model, to obtain generated annotation information in the form of the corresponding predicted human contour segmentation images; then, with these N gait images as input and the corresponding predicted human contour segmentation images and the identity number as supervision information, jointly training the integrated gait segmentation and gait recognition model using the back-propagation algorithm and stochastic gradient descent until the integrated model converges;
At test time, randomly selecting N images from a gait video and feeding them into the trained integrated gait segmentation and gait recognition model, and taking the index of the node with the maximum response at the softmax classifier of the integrated model as the predicted identity number.
Wherein each channel of the N-channel fully convolutional neural network model comprises identically configured multiple convolutional layers and one deconvolution layer connected to the last of those convolutional layers.
Wherein the classification convolutional neural network model comprises multiple convolutional layers and at least one fully connected layer connected to the last convolutional layer, and the last fully connected layer is connected to the output layer, which is a softmax classifier.
The invention first trains an N-channel segmentation convolutional neural network model, based on a multi-layer convolutional neural network, using human images paired with human segmentation annotation images. It then uses this N-channel segmentation convolutional neural network model to perform gait segmentation on multiple frames taken at random from a gait video, and uses the resulting human contour segmentation results to train a classification convolutional neural network model for identity recognition. Finally, the N-channel segmentation convolutional neural network model and the classification convolutional neural network model are learned jointly, yielding a more accurate integrated model of gait segmentation and gait recognition, so that recognition from gait to identity can be performed directly with this integrated model.
Through joint learning, the integrated gait segmentation and gait recognition model proposed by the invention updates the N-channel segmentation convolutional neural network model and the classification convolutional neural network model simultaneously, obtaining more accurate gait recognition results.
By training the N-channel segmentation convolutional neural network model on a large number of human segmentation annotation image samples under complex backgrounds, the invention achieves accurate human contour segmentation against a variety of backgrounds and solves the problem of gait segmentation under complex dynamic backgrounds in real environments. These accurate segmentation results can further be used by the classifier formed by the classification convolutional neural network model to recognize gait identity directly, and integrating segmentation and recognition significantly speeds up gait recognition.
Detailed description of the invention
The technical solution of the invention is described in further detail below with reference to the drawings and embodiments.
The integrated gait segmentation and gait recognition method based on deep learning provided by the invention uses deep learning to jointly train an N-channel segmentation convolutional neural network model (the gait segmentation model) and a classification convolutional neural network model (the gait recognition model). The multi-channel gait segmentation model is trained first, then the gait recognition model, and finally the two are trained jointly, so that the gait recognition task in real scenes achieves very high accuracy and speed.
The following description takes a large-scale gait recognition database as an example. This database contains gait video sequences of 138 people, with about 36 videos per person covering different viewing angles, backgrounds, and clothing. The human segmentation database used to initialize the gait segmentation model contains about 5000 images and their corresponding human segmentation annotation images.
As shown in Figure 1, the integrated gait segmentation and gait recognition method based on deep learning of the invention comprises integrated-model training steps and a testing step that uses the trained integrated model (steps S1 to S10 are the integrated-model training steps, and step S11 is the testing step that uses the trained integrated model). The specific steps are as follows:
Step S1: normalize the 5000 training images in the human segmentation database to the same pixel size (for example 48*48 pixels), and apply the same operation to the corresponding human segmentation annotation images (also called foreground-background segmentation images, that is, images in which the human contour is annotated), normalizing them to 48*48 pixels as well, thereby obtaining 5000 paired samples of training images and human segmentation annotation images;
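The normalization in step S1 can be illustrated with a minimal sketch. The source does not say which interpolation is used, so nearest-neighbour sampling is an assumption here, and the function names `resize_nearest` and `make_pair` are hypothetical. Images are represented as plain nested lists of pixel values.

```python
def resize_nearest(image, out_h=48, out_w=48):
    """Resize a 2-D image (list of rows) by nearest-neighbour sampling.

    Assumption: the source only requires a common pixel size such as
    48*48; the interpolation method is not specified there.
    """
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def make_pair(image, annotation, size=48):
    """Apply the same normalization to an image and its annotation image."""
    return resize_nearest(image, size, size), resize_nearest(annotation, size, size)
```

Applying the same function to both members of a pair keeps the annotation aligned with the image, which is what the paired samples require.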
Step S2: randomly select 3 pairs of image samples at a time, that is, 3 training images and their 3 corresponding human segmentation annotation images, and feed them in turn into the 3-channel segmentation fully convolutional neural network model; after several convolutional layers and a deconvolution layer, the last layer yields image representation 1 (the predicted segmentation images), which has the same size as the human segmentation annotation images and is compared with the corresponding annotation images to obtain the prediction error;
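The comparison between a predicted segmentation image and its annotation image can be sketched as a per-pixel error. The source does not name the error measure, so the mean squared error used below is an assumption, and `pixel_error` is a hypothetical helper name.

```python
def pixel_error(predicted, annotation):
    """Mean squared per-pixel difference between two same-sized images.

    Assumption: the source only says the two images are compared to
    obtain a prediction error; mean squared error is one common choice.
    """
    total = 0.0
    count = 0
    for pred_row, ann_row in zip(predicted, annotation):
        for p, a in zip(pred_row, ann_row):
            total += (p - a) ** 2
            count += 1
    return total / count
```

With three channels, one such error would be computed for each of the 3 image pairs.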
For example, a typical parameter configuration for one channel of the 4-layer, 3-channel fully convolutional neural network is as follows. The first 3 layers are convolutional layers: the first layer has 64 convolution kernels of 5 × 5 with stride 1, followed by a 3 × 3 spatial pooling layer with stride 2; the second layer has 64 convolution kernels of 5 × 5 with stride 1, followed by a 3 × 3 spatial pooling layer with stride 2; the third layer has 64 convolution kernels of 3 × 3 with stride 1. The fourth layer is a deconvolution layer containing 1 deconvolution kernel of 48 × 48 with stride 1, and this last deconvolution layer yields the predicted segmentation image (of size 48*48). The other 2 channels are configured identically to this channel, so the network can take 3 images as input simultaneously and produce 3 predicted segmentation images, that is, image representation 1.
It should be noted that the segmentation fully convolutional neural network model may have 3 channels, 4 channels, or any other number of channels; no specific limit is imposed. Correspondingly, when the model has another number of channels, the number of image sample pairs randomly selected at a time equals the number of channels of the segmentation fully convolutional neural network model;
Step S3: use the back-propagation algorithm and stochastic gradient descent to reduce the prediction error obtained by comparing image representation 1 with the corresponding human segmentation annotation images, thereby training the segmentation fully convolutional neural network model; after multiple training iterations, once this prediction error no longer declines, the 3-channel segmentation convolutional neural network model (that is, the 3-channel gait segmentation model) is obtained;
Step S4: save a copy of the 3-channel segmentation convolutional neural network model from S3 as a fixed segmentation annotation generator;
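The stopping rule in step S3, training until the prediction error no longer declines, can be sketched as a plateau check around an arbitrary training step. The `patience` and `min_delta` parameters and the function name are assumptions introduced for illustration; the source does not specify how a plateau is detected.

```python
def train_until_plateau(train_step, patience=3, min_delta=1e-4, max_iters=10000):
    """Call train_step() repeatedly until its returned error stops improving.

    train_step is any callable that performs one update (for example one
    back-propagation and stochastic gradient descent step) and returns
    the current prediction error.
    """
    best = float("inf")
    stale = 0          # consecutive iterations without sufficient improvement
    history = []
    for _ in range(max_iters):
        error = train_step()
        history.append(error)
        if best - error > min_delta:
            best = error
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best, history
```

For example, a step function that replays the errors 1.0, 0.5, 0.4, 0.4, 0.4, 0.4 stops after the sixth call with a best error of 0.4.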
Step S5: randomly select one video at a time from all gait videos, and use the identity number corresponding to this video as its class number; for example, if a video of the 26th person is chosen, the identity number is 26. The gait videos of the 138 people correspond to 138 numbers in total. Randomly select 3 gait images from the chosen video of the 26th person and feed them into the 3-channel segmentation convolutional neural network model formed in S3 to obtain 3 image representations 2, that is, the human contour segmentation results (which may also be called predicted segmentation images);
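The sampling in step S5 can be sketched as follows. The data layout (a dictionary mapping each identity number to a list of videos, each video being a list of frames) and the function name `sample_frames` are assumptions made for illustration.

```python
import random


def sample_frames(videos_by_identity, n_frames=3, rng=None):
    """Pick one video at random from all videos, then n_frames distinct frames.

    Returns the identity number of the chosen video and the sampled frames.
    Assumption: frames are drawn without replacement; the source only says
    the images are randomly selected.
    """
    rng = rng or random.Random()
    all_videos = [
        (identity, video)
        for identity, videos in videos_by_identity.items()
        for video in videos
    ]
    identity, video = rng.choice(all_videos)
    return identity, rng.sample(video, n_frames)
```

The returned identity number serves as the class label for the 3 sampled frames.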
Step S6: use the 3 human contour segmentation results obtained in S5 as input and the gait identity number (26) of the video selected in S5 as the classification output, and iteratively train a classification convolutional neural network model for gait recognition that outputs the gait identity prediction; the output layer of this classification convolutional neural network model is a softmax classifier, and the index of the output node with the maximum response corresponds to the identity number;
In a specific implementation, this classification convolutional neural network model may have 5 layers, for example 3 convolutional layers for feature extraction followed by 2 fully connected layers forming the classifier; the last layer is connected to a softmax classifier that gives the gait identity prediction, and the index of the output node with the maximum response corresponds to the identity number;
The structure of this classification convolutional neural network may be, for example, as follows. The input is a 3-channel image of size 48*48. The first layer has 64 convolution kernels of 5 × 5 with stride 1, followed by a 3 × 3 spatial pooling layer with stride 2; the second layer has 64 convolution kernels of 5 × 5 with stride 1, followed by a 3 × 3 spatial pooling layer with stride 2; the third layer has 64 convolution kernels of 3 × 3 with stride 1. The fourth and fifth layers are fully connected layers containing 1000 and 138 nodes respectively. The fifth layer is followed by a softmax classifier that produces the 138 corresponding responses, and the index of the node with the maximum response is taken as the identity prediction. For example, if the response of the 26th node is the largest, the gait is predicted to belong to the 26th person.
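The final prediction rule, taking the node with the maximum softmax response as the identity, can be sketched in a few lines. The function names are hypothetical, and identity numbers are assumed to start at 1 so that the 26th node maps to person 26 as in the example above.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw node scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_identity(scores):
    """Return the 1-based index of the node with the maximum response."""
    responses = softmax(scores)
    best = max(range(len(responses)), key=responses.__getitem__)
    return best + 1
```

Since softmax preserves the ordering of the scores, the predicted identity equals the position of the largest raw score; the softmax responses are still needed during training to compute the classification error.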
Step S7: use the back-propagation algorithm and stochastic gradient descent to reduce the error between the predicted gait identity and the actual gait identity, thereby training the classification convolutional neural network; after multiple training iterations, once the error no longer declines, the classification convolutional neural network model (that is, the gait recognition model) is obtained;
Step S8: connect the output of the trained 3-channel segmentation convolutional neural network model for gait segmentation from S3 to the input of the classification convolutional neural network model for gait recognition from S6, forming an integrated gait segmentation and gait recognition model; this model has 3 channels and 9 layers in total (4 segmentation layers plus 5 classification layers), its input is 3 gait images of size 48*48, and its output is the gait identity prediction.
Step S9: randomly select one video at a time from all gait videos, and use the identity number corresponding to this video as its class number; for example, if a video of the 26th person is chosen, the identity number is 26. The gait videos of the 138 people correspond to 138 numbers in total. Randomly select 3 gait images from the chosen video of the 26th person and feed them into the segmentation convolutional neural network model from S4 (the segmentation annotation generator) to obtain the generated annotation information for the corresponding human contours.
Step S10: with the 3 gait images from S9 as input, and the corresponding predicted human contour segmentation images from S9 (that is, image representation 2) and the identity number as supervision information, jointly train the integrated gait segmentation and gait recognition model from S8 using the back-propagation algorithm and stochastic gradient descent until the model converges;
Specifically, the error between the gait identity annotation (expressed as the gait identity number) and the gait identity prediction is propagated to 2 places, where it is used to correct the classification convolutional neural network model and the segmentation convolutional neural network model respectively. In addition, there is 1 error between the generated annotation information produced in S9 by the segmentation convolutional neural network model (the segmentation annotation generator) and the predicted segmentation images, which is used to correct the segmentation convolutional neural network. In total, therefore, 3 back-propagated errors jointly correct the integrated gait segmentation and gait recognition model.
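One way to read the three back-propagated errors is as the gradients of a single joint objective. The formulation below is an interpretive sketch, not the source's own notation: $f_{\theta_s}$ denotes the segmentation model, $g_{\theta_c}$ the classification model, $x$ the input gait images, $y$ the identity number, $\tilde{m}$ the annotation produced by the fixed generator of S4, and the weight $\lambda$ is an assumed balancing factor that the source does not mention.

```latex
\begin{aligned}
\mathcal{L}(\theta_s,\theta_c)
  &= \mathcal{L}_{\mathrm{id}}\bigl(g_{\theta_c}(f_{\theta_s}(x)),\, y\bigr)
   + \lambda\, \mathcal{L}_{\mathrm{seg}}\bigl(f_{\theta_s}(x),\, \tilde{m}\bigr),\\[4pt]
\frac{\partial \mathcal{L}}{\partial \theta_c}
  &= \frac{\partial \mathcal{L}_{\mathrm{id}}}{\partial \theta_c},\\[4pt]
\frac{\partial \mathcal{L}}{\partial \theta_s}
  &= \frac{\partial \mathcal{L}_{\mathrm{id}}}{\partial \theta_s}
   + \lambda\, \frac{\partial \mathcal{L}_{\mathrm{seg}}}{\partial \theta_s}.
\end{aligned}
```

The three gradient terms on the right-hand sides correspond to the three corrections described above: the identity error acting on the classification model, the identity error acting on the segmentation model, and the segmentation error acting on the segmentation model.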
Step S11: referring to Figure 2, at test time randomly select one gait video from all videos of the 138 people (for example a video of the 10th person), randomly select 3 images from it, and feed the images into the trained integrated model. The softmax classifier of the classification convolutional neural network model gives a 138-dimensional output; if the node with the maximum response lies in the 10th dimension, number 10 is taken as the predicted identity number. This completes the integrated process from gait video to identity recognition.
The specific process of step S11 is as follows: first, the multi-channel segmentation neural network model performs human contour segmentation on several gait images from the input gait video, yielding the human contour segmentation of multiple gait images in that video; then the classification convolutional neural network model performs identity recognition on the human contours obtained, and the softmax classifier of the classification convolutional neural network model outputs the identity recognition result.
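The two-stage test process can be sketched as a composition of two callables. Here `segment` and `classify` are hypothetical stand-ins for the trained segmentation and classification models; their names and signatures are assumptions made for illustration.

```python
def integrated_predict(frames, segment, classify):
    """Run segmentation, then classification, and return the identity number.

    segment:  maps one gait image to its predicted human contour image.
    classify: maps the list of contour images to a list of node responses,
              one response per identity (138 in the example database).
    """
    contours = [segment(frame) for frame in frames]
    responses = classify(contours)
    best = max(range(len(responses)), key=responses.__getitem__)
    return best + 1  # identity numbers are assumed to start at 1
```

Because the trained integrated model connects the two networks directly, a real implementation would run both stages in a single forward pass; the explicit split here only mirrors the description of the process.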
The method is highly robust to changes in scene, clothing, viewing angle of the image or video, and walking state, and is particularly suited to gait recognition under dynamic backgrounds, reaching very high recognition accuracy in practical gait recognition. Because it adopts an integrated segmentation and recognition framework, the method is also very fast, making it suitable for real-time gait recognition in actual surveillance.
The invention uses a multi-channel segmentation convolutional neural network model to obtain simultaneously the human contour segmentation results of multiple gait images in a gait video, and then performs identity recognition on the resulting human contours with a classification convolutional neural network model. The multi-channel segmentation convolutional neural network model and the classification convolutional neural network model used for recognition can be learned jointly under one framework, forming an integrated framework whose input is several gait images and whose output is the identity recognition result.
The method of the invention is highly robust to changes in scene, clothing, viewing angle of the image or video, and walking state, and is particularly suited to gait recognition under dynamic backgrounds, so it can reach very high recognition accuracy in practical gait recognition. Because it adopts an integrated segmentation and recognition framework, the method is also very fast, making it suitable for real-time gait recognition in actual surveillance. The method can be widely applied in video surveillance scenarios, such as security monitoring at airports and customs, personal identification, company attendance, and criminal detection.