CN110175506A - Pedestrian re-identification method and device based on a parallel dimensionality-reduction convolutional neural network - Google Patents
- Publication number: CN110175506A
- Application number: CN201910277665.XA
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06N3/045 — Computing arrangements based on specific computational models; neural networks; architecture, e.g. interconnection topology; combinations of networks
- G06N3/08 — Computing arrangements based on specific computational models; neural networks; learning methods
- G06V40/103 — Recognition of biometric, human-related or animal-related patterns in image or video data; human or animal bodies; static body considered as a whole, e.g. static pedestrian or occupant recognition
Abstract
The invention belongs to the field of machine-learning technology, and specifically provides a pedestrian re-identification method and device based on a parallel dimensionality-reduction convolutional neural network. The method of the present invention comprises: constructing and training a convolutional neural network based on parallel drop convolution kernels as a feature-extraction model; preprocessing the target image identified for retrieval and the target images to be determined, obtaining the preprocessed target image and the corresponding images to be determined; inputting the target image and the images to be determined into the feature-extraction model in turn, obtaining multiple pedestrian feature vectors to be determined and multiple target feature vectors; and, according to the feature vectors, searching the target images to be determined for the pedestrian image consistent with the target image. The present invention uses parallel convolution kernels to reduce the number of convolution parameters, and at the same time replaces high-dimensional convolution kernels with multiple low-dimensional symmetric convolution kernels and low-dimensional asymmetric convolution kernels, reducing the amount of computation; the pedestrian re-identification accuracy of the present invention is much higher than that of various existing pedestrian re-identification methods.
Description
Technical field
The invention belongs to the field of machine-learning technology, and in particular relates to a pedestrian image re-identification method and device.
Background art
Pedestrian re-identification is a method that uses computer-vision techniques to analyze features of acquired images and identify a specified target in images or videos obtained by cameras across different regions. Pedestrian re-identification is meaningful research work in the security field, and it also has important potential in everyday application scenarios.
Pedestrian re-identification technology originated in multi-target tracking, and later gradually developed into a relatively independent research field. Early pedestrian re-identification techniques manually extracted features such as color, texture, edges, and shape, and analyzed them with machine-learning methods. However, such individual features and methods struggle to analyze target characteristics comprehensively, and their identification accuracy is unsatisfactory under changing environmental factors such as lighting and weather.
Recently, with the emergence and popularity of deep-learning methods and convolutional neural networks, many corresponding methods have been applied to pedestrian re-identification research, and some progress has been made in identification accuracy. The main pipeline is to train a convolutional neural network on a training set, use the trained network to perform feature extraction on the target image, obtain feature-representation vectors for the target image and the images to be determined, and then find the target with the highest similarity by comparing vector similarity.
Pedestrian images have many features, and manually extracting and describing them is relatively difficult, whereas convolutional neural networks can extract deep features effectively. Researchers have achieved high identification rates on the Market1501 dataset using convolutional neural networks.
However, as the number of layers and the depth of the model increase, common convolutional neural network models often face growing parameter counts and computation. Facing the huge amount of training data in pedestrian re-identification, multi-layer neural networks performing deep learning still suffer, to some extent, from excessive parameters, excessive computation, and low efficiency. The storage and computation costs are therefore enormous; when applied to large-scale pedestrian re-identification (for example, when the number of images to be determined is huge), problems of oversized parameters, oversized models, massive computation, and low efficiency arise even more readily. This not only raises the hardware requirements for completing pedestrian re-identification, but also increases the time needed for model training, sometimes making the training process nearly impossible to complete.
Summary of the invention
The purpose of the present invention is to provide a pedestrian image re-identification method and device that can complete the pedestrian re-identification task on large-scale datasets with a small amount of computation.
The pedestrian image re-identification method proposed by the present invention is based on parallel dimensionality-reduction convolutional neural network technology. It searches for a given target in the images or video sequences obtained by multiple different cameras, replacing the original symmetric convolution kernels with added dimensionality-reduction convolutions and asymmetric convolution kernels. The specific steps are as follows:
Step S1: construct a convolutional neural network based on parallel drop convolution kernels, use multiple pedestrian images selected from a standard dataset as a training set to train the convolutional neural network model, and take the trained convolutional neural network model as the feature-extraction model;
Step S2: preprocess the target image identified for retrieval to obtain the corresponding preprocessed target image, and preprocess the target images to be determined to obtain the corresponding images to be determined;
Step S3: input the preprocessed target image and the images to be determined into the feature-extraction model in turn, obtaining multiple pedestrian feature vectors to be determined corresponding to the preprocessed images to be determined and multiple target feature vectors corresponding to the preprocessed target image;
Step S4: according to the target feature vectors and the pedestrian feature vectors to be determined, search the target images to be determined for the pedestrian image consistent with the target image.
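The four steps above can be sketched as a minimal, illustrative pipeline skeleton. This is not the patented model: `extract_features` below is a trivial per-channel-mean stand-in for the trained network of step S1, and plain Euclidean distance stands in for the similarity comparison of step S4, purely so the control flow can be exercised.

```python
# Steps S1-S4 in miniature. `extract_features` stands in for the trained
# feature-extraction model of step S1; here it is a per-channel mean.
def extract_features(image):  # image: list of (r, g, b) pixel tuples
    n = len(image)
    return [sum(p[c] for p in image) / n for c in range(3)]

def find_match(target_image, candidate_images):
    # Step S3: extract feature vectors for the target and the candidates.
    target_vec = extract_features(target_image)
    candidate_vecs = [extract_features(img) for img in candidate_images]
    # Step S4: return the index of the candidate closest to the target
    # (Euclidean distance here, for simplicity of the sketch).
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(target_vec, v)) ** 0.5
    return min(range(len(candidate_vecs)), key=lambda i: dist(candidate_vecs[i]))
```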
Wherein, step S1 comprises the following substeps:
Step S1-1: preprocess the multiple existing target pedestrian images serving as the training set, obtaining images of uniform size;
Step S1-2: construct the convolutional neural network model based on parallel drop convolution kernels. The network contains an input module, drop convolution modules, reduction modules, pooling layers, and a fully connected layer. The main functions of each module are: the input module receives the image data to be detected and extracts related features; the drop convolution modules perform convolution with a multi-channel parallel structure, reducing the image feature parameters; the reduction modules reduce the dimensionality of the extracted image features. The parameters in each layer's parameter matrix are initialized randomly. The structure is shown in Fig. 2. The drop convolution modules are divided into 3 classes, denoted drop convolution module A, drop convolution module B, and drop convolution module C; the reduction modules are divided into 2 classes, denoted reduction module A and reduction module B. Wherein:
After the input module come 4 sequentially connected drop convolution modules A; drop convolution module A is divided into 4 groups of convolutions.
After drop convolution module A comes reduction module A, which plays a pooling role in reducing the data scale; reduction module A consists of 3 groups of convolutions.
After reduction module A come 7 sequentially connected drop convolution modules B; drop convolution module B is divided into 4 parallel convolution channels.
Drop convolution module B is followed by reduction module B, which has a pooling effect; reduction module B consists of 3 groups of convolutions.
Reduction module B is followed by 3 sequentially connected drop convolution modules C; drop convolution module C consists of 5 groups of convolutions.
Drop convolution module C is followed by an average pooling layer and then a random-drop layer (i.e., a Dropout layer). The layers above all contain the computed weight values (i.e., parameters) used to calculate the data passed to the next layer. The convolutional neural network model is then trained as follows:
Step S1-3: input the preprocessed pedestrian images serving as the training set into the convolutional neural network model;
Step S1-4: perform forward propagation in the convolutional neural network model and calculate the error;
Step S1-5: propagate the error backward and update the parameters using the back-propagation method;
Step S1-6: repeat steps S1-3 to S1-5 until the training requirements are met; the trained convolutional neural network is then the feature-extraction model.
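Steps S1-3 through S1-6 amount to the standard forward/backward training loop. As a hedged sketch, the loop below trains a one-weight linear "network" by gradient descent on a squared error; it mirrors only the control flow of the substeps, not the convolutional model itself.

```python
def train(model_w, samples, lr=0.1, cycles=50):
    # Steps S1-3 to S1-6 in miniature: input data, forward pass, error,
    # backward update, repeated until the cycle budget is exhausted.
    for _ in range(cycles):                 # S1-6: repeat until done
        for x, y in samples:                # S1-3: input training data
            pred = model_w * x              # S1-4: forward propagation
            err = pred - y                  #        training error
            model_w -= lr * err * x         # S1-5: back-propagated update
    return model_w
```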
In the present invention, the convolutional neural network has the technical feature that parallel convolution kernels are added within the same convolutional layer, allowing a single convolutional layer to extract parameter features of different degrees of sparsity, increasing the network width and at the same time increasing the network's adaptability.
In the present invention, the convolutional neural network also has the technical feature that a common m × m convolution kernel is replaced by 2 n × n (n < m) convolution kernels, which reduces the number of parameters in the convolutional layers while obtaining the same field of view, and also increases the depth of the neural network.
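This claim can be checked arithmetically: two stacked stride-1 3 × 3 convolutions cover the same 5 × 5 field of view with fewer weights. A small sketch, with illustrative channel counts (not taken from the patent):

```python
def conv_params(kh, kw, c_in, c_out):
    # Weight count of one convolutional layer (biases omitted).
    return kh * kw * c_in * c_out

def receptive_field(kernel_sizes):
    # Receptive field of a stack of stride-1 convolutions: each k x k
    # layer widens the field by k - 1.
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf
```

With 64 input and output channels, one 5 × 5 layer costs 102400 weights while two 3 × 3 layers cost 73728, for the same 5 × 5 field of view.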
In the present invention, the convolutional neural network also has the technical feature that larger symmetric n × n convolution kernels, with size greater than 5 × 5, are replaced by combinations of asymmetric 1 × n and n × 1 convolution kernels, which further reduces the number of parameters and the amount of computation while keeping the number of extracted features unchanged, yielding a better training result.
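For a symmetric n × n kernel, factorization into 1 × n followed by n × 1 cuts the per-channel weight count from n² to 2n. For the 7 × 7 case (compare the 1 × 7 and 7 × 1 layers in Table 1), that is 49 versus 14:

```python
def symmetric_kernel_weights(n):
    # One n x n kernel, per input/output channel pair.
    return n * n

def factorized_kernel_weights(n):
    # 1 x n followed by n x 1, per input/output channel pair.
    return n + n
```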
In the present invention, the convolutional neural network also has the technical feature that the drop convolution modules are divided into 3 classes, each with multiple groups of convolutions, and the reduction modules are divided into 2 classes, each with multiple groups of convolutions.
In the present invention, batch normalization is performed at the output of each convolutional layer of step S1, standardizing each layer's output to an N(0, 1) normal distribution, which prevents the vanishing-gradient problem during back-propagation.
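A minimal sketch of the standardization applied at each layer output, assuming per-batch statistics over a flat list of activations (a real batch-normalization layer normalizes per channel and adds learnable scale and shift parameters, which are omitted here):

```python
def batch_normalize(x, eps=1e-5):
    # Standardize a batch of activations toward zero mean and unit
    # variance, approximating the N(0, 1) target distribution.
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return [(v - mean) / (var + eps) ** 0.5 for v in x]
```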
In the present invention, an additional 1 × 1 convolutional layer is added before the 3 × 3 and 5 × 5 convolutional layers in step S1, limiting the number of input channels and reducing the amount of computation.
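The saving from such a 1 × 1 bottleneck can be illustrated numerically; the channel counts below are illustrative, not taken from Table 1:

```python
def conv_weights(kh, kw, c_in, c_out):
    # Weight count of one convolutional layer (biases omitted).
    return kh * kw * c_in * c_out

def direct_5x5(c_in, c_out):
    # A 5 x 5 convolution applied directly to all input channels.
    return conv_weights(5, 5, c_in, c_out)

def bottleneck_5x5(c_in, c_mid, c_out):
    # A 1 x 1 reduction to c_mid channels, then the 5 x 5 convolution.
    return conv_weights(1, 1, c_in, c_mid) + conv_weights(5, 5, c_mid, c_out)
```

With 256 input channels, a 1 × 1 reduction to 64 channels before a 5 × 5 layer with 64 outputs costs 118784 weights instead of 409600.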
In the present invention, dedicated reduction modules are introduced in step S1 to change the width and height of the grid, reducing the dimensionality of the output and the number of parameters involved in training.
In the present invention, the training-completion condition in step S1-6 is: the predetermined number of cycles has been completed, the parameters have converged, or the training error has been eliminated.
The invention also includes a pedestrian re-identification device based on the above method, comprising: a convolutional neural network model construction and training module, a preprocessing module for the images to be determined and the target image, a feature-extraction module, and a consistency-determination module. These four modules perform, in sequence, the functions of steps S1, S2, S3, and S4 of the pedestrian re-identification method.
The method of the present invention uses parallel convolution kernels to reduce the number of convolution parameters, and at the same time replaces high-dimensional convolution kernels with multiple low-dimensional symmetric and low-dimensional asymmetric convolution kernels, reducing the number of operational parameters and the amount of computation, so that model computation finishes faster and the speeds of feature-vector extraction and of training the pedestrian re-identification model are both improved. The pedestrian re-identification accuracy obtained by this method on the dataset is much higher than that of various existing pedestrian re-identification methods, while the amount of computation and the computing time are reduced.
Brief description of the drawings
Fig. 1 is a flowchart of the pedestrian re-identification method based on the parallel drop-convolution-kernel convolutional neural network of the embodiment of the present invention.
Fig. 2 is a schematic diagram of the convolutional neural network structure of the embodiment of the present invention.
Fig. 3 is a diagram of drop convolution module A of the embodiment of the present invention.
Fig. 4 is a diagram of drop convolution module B of the embodiment of the present invention.
Fig. 5 is a diagram of drop convolution module C of the embodiment of the present invention.
Fig. 6 is a diagram of reduction module A of the embodiment of the present invention.
Fig. 7 is a diagram of reduction module B of the embodiment of the present invention.
Specific embodiment
A specific embodiment of the present invention is described below with reference to the drawings and the embodiment.
The model construction and related work in this embodiment are implemented on a Linux platform supported by at least one graphics processing unit (GPU) card.
Fig. 1 is the flowchart of the convolutional neural network based on parallel drop convolution kernels of the embodiment of the present invention. The pedestrian re-identification method based on the parallel drop-convolution-kernel neural network includes the following steps:
Step S1: construction and training of the convolutional neural network model. Construct the convolutional neural network model based on parallel drop convolution kernels, train the model with multiple existing pedestrian images, and take the trained convolutional neural network model as the feature-extraction model. Model construction and training comprise the following steps:
Step S1-1: preprocess the multiple existing pedestrian images serving as the training set, obtaining preprocessed training images of uniform size corresponding respectively to the existing pedestrian images.
In this embodiment, the training images come from the Market1501 dataset. The dataset contains 1501 pedestrian IDs in total, including 750 training IDs and 751 test IDs to be retrieved; the training set contains 12936 pictures, captured by 6 cameras in different regions, with the cameras numbered separately; the input image format is 128 × 64. Preprocessing includes the following steps:
Step S1-1-1: perform pedestrian detection on the image to be processed and locate the pedestrian. In this embodiment, the existing Faster-RCNN technique is used to detect the pedestrian in the image.
Step S1-1-2: detect multiple key position points on the pedestrian found in step S1-1-1, including at least key position points of the head, trunk, and limbs.
Step S1-1-3: align the image to be processed according to the key position points and unify its size (adjust it to a uniform size). In this embodiment, the alignment of the image to be processed is performed according to the key points of the head, trunk, and limbs, and after alignment each image to be processed is uniformly resized by common image-adjustment means to 299 × 299 pixels. The number of channels of each image to be processed is unchanged.
Step S1-1-4: crop the size-unified image to be processed, obtaining the corresponding preprocessed image.
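The cropping step can be sketched as a center crop over a row-major pixel grid. The exact cropping scheme for the training set is not specified in the text, so this sketch mirrors the center crop described later for the test set:

```python
def center_crop(image, out_h, out_w):
    # image: list of rows, each a list of pixels. Return the central
    # out_h x out_w window, as in the final preprocessing step.
    h, w = len(image), len(image[0])
    top = (h - out_h) // 2
    left = (w - out_w) // 2
    return [row[left:left + out_w] for row in image[top:top + out_h]]
```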
Step S1-2: construct the model. The model used in this embodiment is the convolutional neural network based on parallel drop convolution kernels; the model contains an input module, drop convolution modules, reduction modules, and a fully connected module. Its structure is shown in Fig. 2. The drop convolution modules are divided into 3 classes, denoted drop convolution module A, drop convolution module B, and drop convolution module C; the reduction modules are divided into 2 classes, denoted reduction module A and reduction module B.
The convolutional neural network first feeds the data into the input module, after which come 4 sequentially connected drop convolution modules A. Drop convolution module A is divided into 4 groups of convolutions: LA1, LA2, LA3, LA4. The first layer of LA1 is average pooling layer P1 and its second layer is convolutional layer LA1C1. LA2 contains only one convolutional layer, LA2C1. The first layer of LA3 is convolutional layer LA3C1 and its second layer is convolutional layer LA3C2. The first layer of LA4 is convolutional layer LA4C1, its second layer LA4C2, and its third layer LA4C3.
Reduction module A is added after drop convolution module A to play a pooling role in reducing the data scale, pooling the input 35 × 35 pixel data down to 17 × 17. Reduction module A consists of 3 groups of convolutions: RA1, RA2, RA3. RA1 is maximum pooling layer P2; RA2 is convolutional layer RA2C1; the first layer of RA3 is convolutional layer RA3C1, its second layer is convolutional layer RA3C2, and its third layer is convolutional layer RA3C3.
After reduction module A come 7 sequentially connected drop convolution modules B; drop convolution module B is divided into 4 parallel convolution channels: LB1, LB2, LB3, LB4. The first layer of LB1 is average pooling layer P3 and its second layer is convolutional layer LB1C1. LB2 contains only one convolutional layer, LB2C1. The first layer of LB3 is convolutional layer LB3C1, its second layer is convolutional layer LB3C2, and its third layer is convolutional layer LB3C3. The first layer of LB4 is convolutional layer LB4C1, its second layer LB4C2, its third layer LB4C3, its fourth layer LB4C4, and its fifth layer LB4C5.
Drop convolution module B is followed by reduction module B, which has a pooling effect, reducing the 17 × 17 pixel blocks input to reduction module B to 8 × 8. Reduction module B consists of 3 groups of convolutions: RB1, RB2, RB3. RB1 is maximum pooling layer P4; the first layer of RB2 is convolutional layer RB2C1 and its second layer RB2C2; the first layer of RB3 is convolutional layer RB3C1, its second layer is convolutional layer RB3C2, its third layer is convolutional layer RB3C3, and its fourth layer is convolutional layer RB3C4.
Reduction module B is followed by 3 sequentially connected drop convolution modules C; drop convolution module C consists of 5 groups of convolutions: LC1, LC2, LC3, LC4, LC5. The first layer of LC1 is average pooling layer P5 and its second layer is convolutional layer LC1C1. LC2 contains only one convolutional layer, LC2C1. The first layer of LC3 is convolutional layer LC3C1 and its second layer comprises 2 parallel convolutions, LC3C21 and LC3C22. The first layer of LC4 is convolutional layer LC4C1, its second layer LC4C2, its third layer LC4C3, and its fourth layer comprises 2 parallel convolutions, LC4C41 and LC4C42.
Drop convolution module C is followed by an average pooling layer and then a random-drop layer (i.e., a Dropout layer); the layers above all contain the computed weight values (i.e., parameters) used to calculate the data passed to the next layer.
The parameters of each layer of the convolutional neural network model in this embodiment are shown in Table 1 below.
Table 1
| Layer name | Parameters |
| --- | --- |
| Input layer | 299 × 299 × 3 |
| P1 | Pooling window 2 × 2, average pooling |
| LA1C1 | Kernel 1 × 1, 96 channels, stride 1 |
| LA2C1 | Kernel 1 × 1, 96 channels, stride 1 |
| LA3C1 | Kernel 1 × 1, 64 channels, stride 1 |
| LA3C2 | Kernel 3 × 3, 96 channels, stride 1 |
| LA4C1 | Kernel 1 × 1, 64 channels, stride 1 |
| LA4C2 | Kernel 3 × 3, 96 channels, stride 1 |
| LA4C3 | Kernel 3 × 3, 96 channels, stride 1 |
| P2 | Pooling window 3 × 3, max pooling, stride 2 |
| RA2C1 | Kernel 3 × 3, 384 channels, stride 2 |
| RA3C1 | Kernel 1 × 1, 192 channels, stride 1 |
| RA3C2 | Kernel 3 × 3, 224 channels, stride 1 |
| RA3C3 | Kernel 3 × 3, 256 channels, stride 2 |
| P3 | Pooling window 2 × 2, average pooling |
| LB1C1 | Kernel 1 × 1, 128 channels, stride 1 |
| LB2C1 | Kernel 1 × 1, 384 channels, stride 1 |
| LB3C1 | Kernel 1 × 1, 192 channels, stride 1 |
| LB3C2 | Kernel 1 × 7, 224 channels, stride 1 |
| LB3C3 | Kernel 7 × 1, 256 channels, stride 1 |
| LB4C1 | Kernel 1 × 1, 192 channels, stride 1 |
| LB4C2 | Kernel 1 × 7, 192 channels, stride 1 |
| LB4C3 | Kernel 7 × 1, 224 channels, stride 1 |
| LB4C4 | Kernel 1 × 7, 224 channels, stride 1 |
| LB4C5 | Kernel 7 × 1, 256 channels, stride 1 |
| P4 | Pooling window 3 × 3, max pooling, stride 2 |
| RB2C1 | Kernel 1 × 1, 192 channels, stride 1 |
| RB2C2 | Kernel 3 × 3, 192 channels, stride 2 |
| RB3C1 | Kernel 1 × 1, 256 channels, stride 1 |
| RB3C2 | Kernel 1 × 7, 256 channels, stride 1 |
| RB3C3 | Kernel 7 × 1, 320 channels, stride 1 |
| RB3C4 | Kernel 3 × 3, 320 channels, stride 2 |
| P5 | Pooling window 2 × 2, average pooling |
| LC1C1 | Kernel 1 × 1, 256 channels, stride 1 |
| LC2C1 | Kernel 1 × 1, 256 channels, stride 1 |
| LC3C1 | Kernel 1 × 1, 384 channels, stride 1 |
| LC3C21 | Kernel 1 × 3, 256 channels, stride 1 |
| LC3C22 | Kernel 3 × 1, 256 channels, stride 1 |
| LC4C1 | Kernel 1 × 1, 384 channels, stride 1 |
| LC4C2 | Kernel 1 × 3, 448 channels, stride 1 |
| LC4C3 | Kernel 3 × 1, 512 channels, stride 1 |
| LC4C41 | Kernel 1 × 1, 256 channels, stride 1 |
| LC4C42 | Kernel 1 × 3, 256 channels, stride 1 |
| P6 | Output dimension 1024 |
| Dropout | Random drop ratio 0.5 |
| FC6 | Output dimension 10575 |
As can be seen from Table 1, once the model construction of this embodiment is complete, the training set can be used to train it.
Step S1-3: input the preprocessed training images into the convolutional neural network model.
Step S1-4: perform forward propagation in the convolutional neural network model and calculate the training error.
Step S1-5: according to the training error, perform back-propagation in the model using the back-propagation algorithm, propagating the error and updating the parameters, gradually adjusting each layer's parameters so that the training error gradually decreases.
Step S1-6: repeat steps S1-3 to S1-5 until the training-completion condition is reached (the predetermined number of cycles has been completed, the parameters have converged, or the training error has been essentially eliminated), obtaining the trained convolutional neural network model as the feature-extraction model.
To facilitate image input and speed up model training, the above training process of this embodiment uses batched input processing. That is, the training-set images are divided into 203 batches of 64 images each, and each batch is then processed by steps S1-4 to S1-5. When all batches have been input and processed, one cycle is complete, and the batched input processing of the next cycle can begin.
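The batch count follows from ceiling division of the 12936 training images by the batch size of 64 (the final batch is smaller than 64):

```python
import math

def num_batches(n_images, batch_size):
    # Ceiling division: the last batch may hold fewer images.
    return math.ceil(n_images / batch_size)
```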
In this embodiment, the total number of cycles is 200. In addition, the initial learning rate of the model is set to 0.003 and is reduced by a factor of 10 at each of the 40th, 80th, and 110th cycles. The model is supervised with the Softmax loss function, and back-propagation parameter updates are carried out through the setting of step S1-5.
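The learning-rate schedule described here (0.003, reduced tenfold at cycles 40, 80, and 110) can be written as a small step function; the function name and signature are illustrative:

```python
def learning_rate(cycle, base=0.003, drops=(40, 80, 110), factor=10.0):
    # Reduce the base rate tenfold at each listed drop cycle.
    lr = base
    for d in drops:
        if cycle >= d:
            lr /= factor
    return lr
```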
Through the above steps, the convolutional neural network model of this embodiment completes construction and training and can be used for pedestrian re-identification. In this embodiment, the trained convolutional neural network model serves as a feature extractor to extract feature vectors from the target image and the images to be determined; the resulting feature vectors can be used to measure the similarity between the target image and each image to be determined, and thus to find, among the multiple images to be determined, the pedestrian image consistent with the target image.
Before the target image and the images to be determined are input into the trained model, they also need corresponding preprocessing to obtain images of consistent size, i.e. step S2: preprocess the target image to obtain the preprocessed target image, and preprocess the images to be determined to obtain the corresponding preprocessed images to be determined.
In this embodiment, the Market1501 dataset is used as the test dataset, which includes 19281 images of 750 people in total.
These images to be determined and the target image also need to be preprocessed before being input into the trained model. This preprocessing is essentially the same as the preprocessing (i.e., steps) applied to the pedestrian images serving as the training set, and includes the following steps:
Step S2-1: perform pedestrian detection on the image to be processed and locate the pedestrian;
Step S2-2: detect multiple key position points, including at least the head, trunk, and limbs, on the pedestrian body found in step S2-1;
Step S2-3: align the image to be processed according to the key position points and unify its size;
Step S2-4: center-crop the size-unified image to be processed, obtaining the corresponding preprocessed image. In this embodiment, unlike the training set, each pedestrian image used as a test image is center-cropped.
After preprocessing, the above images to be determined and target image can undergo feature extraction and determination, i.e. the following steps:
Step S3: input the preprocessed images to be determined and the preprocessed target image into the feature-extraction model in turn, obtaining the multiple feature vectors to be determined corresponding to the preprocessed images to be determined and the target feature vector corresponding to the preprocessed target image.
Step S4: according to the target feature vector and the vectors to be determined, determine which image to be determined is consistent with the target image.
In this embodiment, to facilitate examining the determination precision of the trained model, the preprocessed target image and the corresponding preprocessed image to be determined (the image obtained after preprocessing) are set as a picture pair. After their corresponding feature vectors are obtained by inputting them into the model, the similarity of the two feature vectors of a picture pair is calculated using the cosine distance. When the calculated cosine distance is greater than a preset value, the target image and the image to be determined of the picture pair are judged not to be the same person; when it is less than the preset value, they are judged to be the same person.
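The picture-pair decision can be sketched directly from this description: compute the cosine distance of the two feature vectors and compare it with the preset value. The preset value itself is not given in the text, so the threshold below is a placeholder argument:

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity of the two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def same_person(target_vec, candidate_vec, threshold):
    # Distance below the preset value -> judged to be the same person.
    return cosine_distance(target_vec, candidate_vec) < threshold
```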
Clearly, because the image to be determined in a picture pair is simply formed by flipping the target image, the two are actually the same person. After the above process of image preprocessing, model input, cosine-distance calculation, and determination, if the determination result of a picture pair is "not the same person", that pedestrian re-identification result is wrong; if the result is "the same person", that pedestrian re-identification result is correct.
Table 2 shows the determination-precision results of the pedestrian re-identification method based on the parallel drop-convolution-kernel neural network of the embodiment of the present invention, and a comparison with the precision of existing common model methods, where "Inception V1", "Inception V2", "Alexnet", and "densenet" are commonly used image-recognition neural networks.
Table 2
The pedestrian re-identification method based on the parallel drop-convolution-kernel neural network of this embodiment has higher identification precision, exceeding the precision of the pedestrian re-identification methods commonly used in the prior art.
Actions and effects of the embodiment
According to this embodiment, by decomposing dimensional convolution kernels in parallel and replacing high-dimensional convolution kernels with multiple symmetric or asymmetric low-dimensional convolutions, the number of parameters to be computed and the computation time are greatly reduced, enabling model training to complete faster; feature-vector extraction with the trained model also completes faster, thereby accelerating the training of the pedestrian re-identification model and the feature extraction of the target image and the images to be determined.
The above embodiment is only a specific implementation used to illustrate the present invention, and does not limit the present invention.
According to this method, the present invention can also provide a corresponding pedestrian re-identification device, comprising: a convolutional neural network model construction and training module formed by encapsulating the convolutional neural network model constructed and trained above; a preprocessing module for preprocessing the images to be determined and the target image; a feature-extraction module for feature extraction; and a determination module that performs consistency determination between the target feature vector extracted by the feature-extraction module and the vectors to be determined. These four modules perform, in sequence, steps S1, S2, S3, and S4 of the pedestrian re-identification method.
In the embodiment, whether the target image and an image to be determined are consistent is determined by calculating the cosine distance between their feature vectors. In the present invention, other vector-distance calculation methods can also be used to determine the consistency between the target image and the images to be determined.
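A minimal sketch of the embodiment's consistency check, using cosine distance between feature vectors; the threshold value here is an illustrative assumption, as the patent does not fix one:

```python
import numpy as np

def cosine_distance(u, v):
    # Cosine distance = 1 - cosine similarity of the two feature vectors.
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def is_consistent(target_vec, candidate_vec, threshold=0.5):
    # The pair is judged to show the same pedestrian when the cosine
    # distance between the two feature vectors falls below the threshold
    # (0.5 is a placeholder, not a value from the patent).
    return cosine_distance(target_vec, candidate_vec) < threshold

a = np.array([1.0, 2.0, 3.0])
```

Any other vector distance (e.g. Euclidean) could be substituted in `is_consistent`, matching the remark that other distance calculations can also be used.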
In the embodiment, to facilitate image input and speed up model training, the training process uses batch input processing. However, when other training sets with few images are used, batch input processing can be dispensed with: the training set is fed in as a whole, after which the processing of steps S1-4 to S1-5 is carried out.
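The choice between batch input processing and feeding the whole training set amounts to choosing a batch size. A minimal batching helper, purely illustrative:

```python
def minibatches(samples, batch_size):
    # Yield the training set in fixed-size batches; the final batch may be
    # smaller when the set size is not a multiple of batch_size. Passing
    # batch_size >= len(samples) reproduces the whole-set alternative.
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]
```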
Claims (12)
1. A pedestrian re-identification method based on a parallel dimensionality-reduction convolutional neural network, characterized in that the specific steps are as follows:
Step S1: constructing a convolutional neural network based on parallel dimension-reducing convolution kernels, training the convolutional neural network model in batches using multiple pedestrian images selected from a standard data set as the training set, and taking the trained convolutional neural network model as the feature extraction model;
Step S2: preprocessing the target image that has been determined as the image to be retrieved to obtain the corresponding preprocessed target image, and preprocessing the target images to be determined to obtain the corresponding images to be determined;
Step S3: sequentially inputting the preprocessed target image and the images to be determined into the feature extraction model, obtaining the multiple pedestrian feature vectors to be determined corresponding to the preprocessed images to be determined and the multiple target feature vectors corresponding to the preprocessed target image;
Step S4: according to the target feature vectors and the pedestrian feature vectors to be determined, finding, among the target images to be determined, the pedestrian images consistent with the target image.
2. The pedestrian re-identification method according to claim 1, characterized in that step S1 comprises the following sub-steps:
Step S1-1: preprocessing the multiple existing target pedestrian images serving as the training set to obtain images of uniform size;
Step S1-2: constructing the convolutional neural network model based on parallel dimension-reducing convolution kernels, the model comprising an input module, drop convolution modules, reduction modules, a pooling layer and a fully connected layer; wherein the input module is used to input the image data to be detected and extract the relevant features; the drop convolution modules perform convolution with a multi-channel parallel structure to reduce the image feature quantity; the reduction modules reduce the dimensionality of the extracted image features; the parameters in the parameter matrix of each layer are initially set at random; the drop convolution modules are divided into 3 classes, denoted: drop convolution module A, drop convolution module B and drop convolution module C; the reduction modules are divided into 2 classes, denoted: reduction module A and reduction module B;
the input module is followed by 4 sequentially connected drop convolution modules A; each drop convolution module A is composed of 4 groups of convolutions;
the drop convolution modules A are followed by a reduction module A, which plays the role of pooling to reduce the data scale; reduction module A is composed of 3 groups of convolutions;
reduction module A is followed by 7 sequentially connected drop convolution modules B; each drop convolution module B is divided into 4 parallel convolution channels;
the drop convolution modules B are followed by a reduction module B, which has a pooling effect; reduction module B is composed of 3 groups of convolutions;
reduction module B is followed by 3 sequentially connected drop convolution modules C; each drop convolution module C is composed of 5 groups of convolutions;
the drop convolution modules C are followed by an average pooling layer and then a random dropout layer; the above layers contain the calculation weight values used for computing the data passed to the next layer;
Step S1-3: inputting the preprocessed pedestrian images serving as the training set into the convolutional neural network model;
Step S1-4: propagating forward through the convolutional neural network model and calculating the error;
Step S1-5: propagating the error backwards and updating the parameters by the back-propagation method;
Step S1-6: repeating steps S1-3 to S1-5 until the training requirement condition is reached; the trained convolutional neural network is taken as the feature extraction model.
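The loop of steps S1-3 to S1-6 (forward pass, error computation, backward parameter update, repeated until the training condition is met) can be illustrated on a toy least-squares model; the data, learning rate and iteration count below are illustrative stand-ins, not the patent's network or training set:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))              # stand-in for a preprocessed batch (S1-3)
true_w = np.array([1.0, -2.0, 0.5, 3.0])  # hypothetical target parameters
y = X @ true_w

w = np.zeros(4)                           # parameters before training
losses = []
for epoch in range(200):                  # repeat S1-3 .. S1-5 (S1-6)
    pred = X @ w                          # forward propagation (S1-4)
    err = pred - y                        # error against the targets
    grad = X.T @ err / len(X)             # backward-propagated gradient (S1-5)
    w -= 0.1 * grad                       # parameter update
    losses.append(float(np.mean(err ** 2)))
```

The stopping rule here is a fixed cycle count, one of the three completion conditions named in claim 10 (cycle count reached, parameters converged, or training error eliminated).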
3. The pedestrian re-identification method according to claim 2, characterized in that step S1-1 comprises the following sub-steps:
Step S1-1-1: performing pedestrian detection on the image to be processed and finding the pedestrian positions therein;
Step S1-1-2: detecting multiple key position points on the pedestrian body found in step S1-1-1, the key position points including at least those of the head, trunk and limbs;
Step S1-1-3: aligning the image to be processed according to the key position points and unifying the size of the image to be processed;
Step S1-1-4: cropping the size-unified image to be processed to obtain the corresponding preprocessed image.
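Steps S1-1-3 and S1-1-4 (size unification followed by cropping) can be sketched as plain array operations. The nearest-neighbour resize and the sizes below are illustrative assumptions, and the key-point-based alignment of step S1-1-3 is omitted:

```python
import numpy as np

def unify_size(img, out_h, out_w):
    # Crude size unification via nearest-neighbour sampling (step S1-1-3);
    # a real pipeline would first align using the detected key points.
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def center_crop(img, out_h, out_w):
    # Cut out the central region of the size-unified image (step S1-1-4).
    h, w = img.shape[:2]
    top, left = (h - out_h) // 2, (w - out_w) // 2
    return img[top:top + out_h, left:left + out_w]

img = np.zeros((120, 60))                          # dummy grayscale image
crop = center_crop(unify_size(img, 100, 50), 80, 40)
```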
4. The pedestrian re-identification method according to claim 2, characterized in that, in step S1-2:
the drop convolution module A is divided into 4 groups of convolutions, denoted respectively: LA1, LA2, LA3, LA4; in LA1, the first layer is an average pooling layer, denoted P1, and the second layer is a convolutional layer, denoted LA1C1; LA2 contains only one convolutional layer, denoted LA2C1; in LA3, the first layer is a convolutional layer, denoted LA3C1, and the second layer is a convolutional layer, denoted LA3C2; in LA4, the first layer is a convolutional layer, denoted LA4C1, the second layer is a convolutional layer, denoted LA4C2, and the third layer is a convolutional layer, denoted LA4C3;
the drop convolution module B is divided into 4 convolution channels, denoted respectively: LB1, LB2, LB3, LB4; in LB1, the first layer is an average pooling layer, denoted P3, and the second layer is a convolutional layer, denoted LB1C1; LB2 contains only one convolutional layer, denoted LB2C1; in LB3, the first layer is a convolutional layer, denoted LB3C1, the second layer is a convolutional layer, denoted LB3C2, and the third layer is a convolutional layer, denoted LB3C3; in LB4, the first layer is a convolutional layer, denoted LB4C1, the second layer is a convolutional layer, denoted LB4C2, the third layer is a convolutional layer, denoted LB4C3, and the fourth layer is a convolutional layer, denoted LB4C4;
the drop convolution module C is composed of 5 groups of convolutions, denoted respectively: LC1, LC2, LC3, LC4, LC5; in LC1, the first layer is an average pooling layer, denoted P5, and the second layer is a convolutional layer, denoted LC1C1; LC2 contains only one convolutional layer, denoted LC2C1; in LC3, the first layer is a convolutional layer, denoted LC3C1, and the second layer consists of 2 parallel convolutions, denoted LC3C21 and LC3C22; in LC4, the first layer is a convolutional layer, denoted LC4C1, the second layer is a convolutional layer, denoted LC4C2, the third layer is a convolutional layer, denoted LC4C3, and the fourth layer consists of 2 parallel convolutions, denoted LC4C41 and LC4C42;
the reduction module A is composed of 3 groups of convolutions, denoted respectively: RA1, RA2, RA3; RA1 is a maximum pooling layer, denoted P2; RA2 is a convolutional layer, denoted RA2C1; in RA3, the first layer is a convolutional layer, denoted RA3C1, the second layer is a convolutional layer, denoted RA3C2, and the third layer is a convolutional layer, denoted RA3C3;
the reduction module B is composed of 3 groups of convolutions, denoted respectively: RB1, RB2, RB3; RB1 is a maximum pooling layer, denoted P4; in RB2, the first layer is a convolutional layer, denoted RB2C1, and the second layer is a convolutional layer, denoted RB2C2; in RB3, the first layer is a convolutional layer, denoted RB3C1, the second layer is a convolutional layer, denoted RB3C2, the third layer is a convolutional layer, denoted RB3C3, and the fourth layer is a convolutional layer, denoted RB3C4.
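The four-channel parallel topology described for drop convolution module B (branches LB1..LB4 whose outputs are joined along the channel axis) can be sketched as follows; the `mix` function is a stand-in channel mix with fixed random weights, not the patent's actual convolution kernels, and the pooling layer P3 is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def mix(x, out_ch):
    # 1x1-convolution-like channel mix: (C_in, H, W) -> (out_ch, H, W).
    w = rng.normal(size=(out_ch, x.shape[0]))
    return np.tensordot(w, x, axes=1)

def drop_module_b(x):
    lb1 = mix(x, 8)            # LB1: avg-pool P3 + conv LB1C1 (pool omitted)
    lb2 = mix(x, 8)            # LB2: single conv LB2C1
    lb3 = mix(mix(x, 8), 8)    # LB3: conv chain LB3C1 -> LB3C2 -> LB3C3
    lb4 = mix(mix(x, 8), 8)    # LB4: deeper conv chain LB4C1 .. LB4C4
    # Outputs of the 4 parallel channels are concatenated channel-wise.
    return np.concatenate([lb1, lb2, lb3, lb4], axis=0)

x = rng.normal(size=(16, 12, 12))      # dummy feature map: 16 channels, 12x12
y = drop_module_b(x)
```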
5. The pedestrian re-identification method according to claim 4, characterized in that, in the convolutional neural network, parallel convolution kernels are added within the same convolutional layer, so that a single convolutional layer extracts parameter features of different degrees of sparsity, which increases the network width and at the same time the adaptability of the network.
6. The pedestrian re-identification method according to claim 4, characterized in that, in the convolutional neural network, a common m × m convolution kernel is replaced by 2 n × n convolution kernels with n < m, which reduces the number of parameters of the convolutional layer while preserving the same field of view, and at the same time increases the depth of the neural network.
7. The pedestrian re-identification method according to claim 4, characterized in that, in the convolutional neural network, a larger symmetric n × n convolution kernel of size greater than 5 × 5 is replaced by a combination of 1 × n and n × 1 asymmetric convolution kernels, which further reduces the number of parameters and the amount of computation while keeping the quantity of extracted features unchanged, obtaining a better training effect.
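The parameter savings behind claims 6 and 7 can be checked by simple weight counting. A small sketch, where the channel count c and the kernel sizes are assumed values for illustration:

```python
def conv_params(k_h, k_w, in_ch, out_ch):
    # Weight count of a k_h x k_w convolutional layer (biases ignored).
    return k_h * k_w * in_ch * out_ch

c = 64                                    # assumed channel count
# Claim 6: two stacked 3 x 3 kernels cover the same 5 x 5 receptive
# field as one 5 x 5 kernel (m = 5, n = 3) with 18/25 of the weights.
one_5x5 = conv_params(5, 5, c, c)
two_3x3 = 2 * conv_params(3, 3, c, c)
# Claim 7: a 1 x n kernel followed by an n x 1 kernel replaces one
# symmetric n x n kernel, costing 2n weights per channel pair
# instead of n**2 (n = 7 here).
n = 7
symmetric = conv_params(n, n, c, c)
asymmetric = conv_params(1, n, c, c) + conv_params(n, 1, c, c)
```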
8. The pedestrian re-identification method according to claim 4, characterized in that the parameters of each layer of the convolutional neural network model are as shown in the following table:
9. The pedestrian re-identification method according to claim 4, characterized in that the output of each convolutional layer of step S1 is batch-standardized, so that each layer's output is standardized to an N(0, 1) normal distribution.
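A simplified sketch of the batch standardization described in claim 9, without the learned scale and shift of full batch normalization; the batch shape is an illustrative assumption:

```python
import numpy as np

def batch_standardize(x, eps=1e-5):
    # Standardize each feature of a batch toward an N(0, 1) distribution:
    # subtract the per-feature batch mean, divide by the batch std
    # (eps guards against division by zero).
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / (std + eps)

batch = np.random.default_rng(1).normal(loc=5.0, scale=3.0, size=(64, 10))
normed = batch_standardize(batch)
```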
10. The pedestrian re-identification method according to claim 2, characterized in that the training completion condition in step S1-6 is: a predetermined number of cycles has been completed, the parameters have converged, or the training error has been eliminated.
11. The pedestrian re-identification method according to claim 1, characterized in that step S2 comprises the following sub-steps:
Step S2-1: performing pedestrian detection on the image to be processed and finding the pedestrian positions therein;
Step S2-2: detecting multiple key position points, including at least those of the head, trunk and limbs, on the pedestrian body found in step S2-1;
Step S2-3: aligning the image to be processed according to the key position points and unifying the size of the image to be processed;
Step S2-4: center-cropping the size-unified image to be processed to obtain the corresponding preprocessed image.
12. A pedestrian re-identification device based on the method of any one of claims 1-11, characterized by comprising: a convolutional neural network model construction and training module; a preprocessing module for the images to be determined and the target image; a feature extraction module; and a consistency determination module; the functions executed by these four modules correspond in turn to the operations of steps S1, S2, S3 and S4 of the pedestrian re-identification method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910277665.XA CN110175506B (en) | 2019-04-08 | 2019-04-08 | Pedestrian re-identification method and device based on parallel dimensionality reduction convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110175506A true CN110175506A (en) | 2019-08-27 |
CN110175506B CN110175506B (en) | 2023-01-06 |
Family
ID=67689490
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910277665.XA Active CN110175506B (en) | 2019-04-08 | 2019-04-08 | Pedestrian re-identification method and device based on parallel dimensionality reduction convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110175506B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020010691A1 (en) * | 2000-03-16 | 2002-01-24 | Chen Yuan Yan | Apparatus and method for fuzzy analysis of statistical evidence |
CN102496289A (en) * | 2011-12-09 | 2012-06-13 | 浙江省交通规划设计研究院 | Road section pedestrian street-crossing sensing control method based on number of pedestrians to cross street |
CN104899561A (en) * | 2015-05-27 | 2015-09-09 | 华南理工大学 | Parallelized human body behavior identification method |
US20160048741A1 (en) * | 2014-08-12 | 2016-02-18 | Siemens Aktiengesellschaft | Multi-layer aggregation for object detection |
CN106651830A (en) * | 2016-09-28 | 2017-05-10 | 华南理工大学 | Image quality test method based on parallel convolutional neural network |
CN107657249A (en) * | 2017-10-26 | 2018-02-02 | 珠海习悦信息技术有限公司 | Method, apparatus, storage medium and the processor that Analysis On Multi-scale Features pedestrian identifies again |
CN108647595A (en) * | 2018-04-26 | 2018-10-12 | 华中科技大学 | Vehicle recognition methods again based on more attribute depth characteristics |
CN108961245A (en) * | 2018-07-06 | 2018-12-07 | 西安电子科技大学 | Picture quality classification method based on binary channels depth parallel-convolution network |
CN109508675A (en) * | 2018-11-14 | 2019-03-22 | 广州广电银通金融电子科技有限公司 | A kind of pedestrian detection method for complex scene |
Non-Patent Citations (2)
Title |
---|
Jiang Tongtong et al., "Target recognition based on multi-layer feature extraction of convolutional neural networks", Computer Systems & Applications (《计算机系统应用》) *
Ma Wenkai, "Convolutional neural network fusing principal component analysis and parallel hybridization", Computer Knowledge and Technology (《电脑知识与技术》) *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111311599A (en) * | 2020-01-17 | 2020-06-19 | 北京达佳互联信息技术有限公司 | Image processing method, image processing device, electronic equipment and storage medium |
CN111311599B (en) * | 2020-01-17 | 2024-03-26 | 北京达佳互联信息技术有限公司 | Image processing method, device, electronic equipment and storage medium |
CN111680595A (en) * | 2020-05-29 | 2020-09-18 | 新疆爱华盈通信息技术有限公司 | Face recognition method and device and electronic equipment |
CN112529767A (en) * | 2020-12-01 | 2021-03-19 | 平安科技(深圳)有限公司 | Image data processing method, image data processing device, computer equipment and storage medium |
CN112529767B (en) * | 2020-12-01 | 2023-07-25 | 平安科技(深圳)有限公司 | Image data processing method, device, computer equipment and storage medium |
CN112818797A (en) * | 2021-01-26 | 2021-05-18 | 厦门大学 | Consistency detection method and storage device for answer sheet document images of online examination |
CN112818797B (en) * | 2021-01-26 | 2024-03-01 | 厦门大学 | Consistency detection method and storage device for online examination answer document images |
Also Published As
Publication number | Publication date |
---|---|
CN110175506B (en) | 2023-01-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||