CN109670528A - Data augmentation method based on a paired-sample random occlusion strategy for the person re-identification task - Google Patents
- Publication number
- CN109670528A CN109670528A CN201811352790.4A CN201811352790A CN109670528A CN 109670528 A CN109670528 A CN 109670528A CN 201811352790 A CN201811352790 A CN 201811352790A CN 109670528 A CN109670528 A CN 109670528A
- Authority
- CN
- China
- Prior art keywords
- residual error
- pedestrian
- image
- training
- depth
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a data augmentation method based on a paired-sample random occlusion strategy for the person re-identification task. In the training stage, the paired-sample random occlusion strategy is first used to increase the diversity of the samples, improving robustness during the training of the deep person re-identification model and thereby improving the generalization performance of the model. Compared with prior-art data augmentation methods, the present invention exploits the characteristics of the training data of Siamese deep learning models while taking the difficulty of Siamese network training into account, and proposes a new data augmentation method. By increasing the diversity of training pairs, it effectively alleviates the impact of the few identities and lack of diversity in a single person re-identification dataset, improves the generalization performance of the model, and allows the re-identification method to better handle person re-identification problems in complex environments.
Description
Technical field
The invention belongs to the field of image retrieval technology, in which computer vision techniques are used to judge whether a specific pedestrian is present in an image or video sequence, and in particular relates to a data augmentation method based on a paired-sample random occlusion strategy for the person re-identification task.
Background art
In surveillance video, because of background occlusion and the low resolution caused by pedestrians being far from the camera, it is often impossible to obtain images usable for face recognition. When face recognition cannot be used normally, person re-identification becomes a very important substitute technology. A key characteristic of person re-identification is that it works across cameras, so when academia evaluates performance the task is to retrieve images of the same pedestrian under different cameras. Person re-identification has been studied for many years, but only in the last few years, with the development of deep learning, has it achieved major breakthroughs.
Traditional image-based person re-identification algorithms are roughly divided into the following classes according to their feature representation methods:
(1) Low-level visual features: these methods essentially divide an image into multiple regions, extract a variety of low-level visual features from each region, and combine them into a more robust feature representation; the most common is the color histogram.
(2) Mid-level semantic attributes: semantic information such as color, clothing, and carried bags is used to judge whether two images belong to the same pedestrian; the semantic attributes of the same pedestrian seldom change across different cameras.
(3) High-level visual features: feature selection techniques improve the discriminability of person re-identification. The biggest difference between deep learning based re-identification methods and conventional methods is that they do not require hand-crafted features: through end-to-end learning they automatically learn the various features in pedestrian images. Therefore, in the person re-identification field, facing the many features that could be selected, methods based on deep learning models can achieve better results. Existing deep learning models mainly belong to the category of convolutional neural networks; commonly used models include CaffeNet, VGGNet, and residual networks.
Compared with ordinary image classification, person re-identification has the following problems:
(1) Small scale of labeled data: existing person re-identification databases contain many pedestrians and a large total amount of data, but the amount of image data per individual pedestrian is small.
(2) Lack of diversity: since the image data for each individual is small in scale, the image information provided by the training set is naturally not rich enough.
(3) Complex field scenes: pedestrians are often occluded, so a model trained on an idealized dataset is difficult to apply directly in real scenes.
The poor sample diversity of person re-identification datasets significantly limits the performance of deep learning models on the re-identification task. Because the data scale is limited, the feature representations these models learn lack robustness, and the models easily overfit.
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art by proposing a data augmentation method based on a paired-sample random occlusion strategy for the person re-identification task. Compared with prior-art data augmentation methods, the present invention exploits the characteristics of the training data of Siamese deep learning models while taking the difficulty of Siamese network training into account, and proposes a new data augmentation method. By increasing the diversity of training pairs, it effectively alleviates the impact of the few identities and lack of diversity in a single person re-identification dataset, improves the generalization performance of the model, and allows the re-identification method to better handle person re-identification problems in complex environments. It can be widely applied to fields such as intelligent video surveillance and intelligent security.
To achieve the above objects, the technical solution adopted by the present invention is as follows:
A data augmentation method based on a paired-sample random occlusion strategy for the person re-identification task:
In the training stage, the paired-sample random occlusion strategy is first used to increase the diversity of the samples, improving robustness during the training of the deep person re-identification model and thereby improving the generalization performance of the model.
In the test stage, the re-identification task can be carried out effectively without applying any occlusion to the test images.
The method specifically includes the following steps:
S1. Construct the deep Siamese residual network for person re-identification.
S1.1. Construct the first deep residual network, using a transfer learning strategy to import residual network parameters pre-trained on the ImageNet dataset as the base parameters of the first deep residual network.
S1.2. Obtain the second deep residual network by copying the model structure and parameters of the first deep residual network.
S1.3. Compute the square of the difference between the feature vectors output by the two deep residual networks, and use a convolutional layer and a classifier to perform binary classification, judging whether the inputs of the two deep residual networks are images of the same class.
S1.4. Compute the Euclidean distance between the feature vectors output by the two residual networks, and for images of the same class use the Euclidean distance between their features as the regularization loss of the network model.
S2. Construct the training dataset.
S2.1. Shuffle the order of the images in the training set and generate training pairs. During model parameter training, multiply the proportion of different-class pairs by a factor of 1.01 each epoch, so that the ratio of different-class pairs to same-class pairs gradually increases from 1:1 to 4:1.
S2.2. Resize every image to 256 × 256 and randomly crop it to 224 × 224.
S2.3. From each batch of training data, randomly select 2/3 of the sample pairs and apply the paired-sample random occlusion strategy for data augmentation, increasing the diversity of the training samples. Specifically:
While executing the paired-sample random occlusion strategy, 90% of the selected pairs are occluded synchronously, i.e., the same region is occluded in both images; for 5% of the pairs, only the first image is randomly occluded and the second image is left unoccluded; for the remaining 5%, only the second image is randomly occluded and the first image is left unoccluded.
While executing the paired-sample random occlusion strategy, each image to be occluded is evenly divided into a 16 × 16 grid, so that every image is uniformly divided into 256 image blocks; a random number Nre between 1 and 128 is generated to record the number of image blocks to be occluded.
While executing the paired-sample random occlusion strategy, the positions of the Nre image blocks are generated at random, and the pixel values at the corresponding positions are replaced with the mean value of all images in the training sample set.
S3. Train the deep Siamese residual network for person re-identification.
S3.1. Using the training dataset constructed in step S2, train the parameters of the deep Siamese residual network with the batch gradient descent algorithm.
S3.2. After the parameters are trained, take out the first deep residual network for feature extraction from pedestrian images.
S4. Construct the test sample set, consisting of two sets: the query set and the gallery set.
S5. Test re-identification performance.
Feed all images of the test sample set into the trained first deep residual network for feature extraction, and search for the target pedestrian according to the Euclidean distance in feature space between the query set and the gallery set.
S6. Output the re-identification result.
Step S1.1 is specifically as follows:
S1.1.1. Remove the last fully connected layer and probability layer of an existing deep residual network to form the first deep residual network, which outputs the feature vector f1 of the input image.
S1.1.2. Add a convolutional layer and a fully connected softmax classifier to the first deep residual network; set the number of feature maps of the convolutional layer to the number of pedestrian identity classes n, so that the convolutional layer maps f1 to an n-dimensional vector and the fully connected classifier outputs the final class prediction.
S1.1.3. Define the loss function over the input and output of the first deep residual network:
where x denotes all pedestrian data input to the network, input denotes the input of the deep Siamese network, and output denotes the output of the deep Siamese network.
Step S1.2 is specifically as follows:
S1.2.1. Remove the last fully connected layer and probability layer of the first deep residual network to form the second deep residual network, which outputs the feature vector f2 of the input image.
S1.2.2. Add a convolutional layer and a fully connected softmax classifier to the second deep residual network; set the number of feature maps of the convolutional layer to the number of pedestrian identity classes n, so that the convolutional layer maps f2 to an n-dimensional vector and the fully connected classifier outputs the final class prediction.
S1.2.3. Define the loss function over the input and output of the second deep residual network:
where x denotes all pedestrian data input to the network, input denotes the input of the deep Siamese network, and output denotes the output of the deep Siamese network.
Step S1.3 is specifically as follows:
S1.3.1. Set up a square layer that takes the squared difference of the feature vectors f1, f2 output by the two deep residual networks, obtaining fs = (f1 − f2)².
S1.3.2. Set up a convolutional layer with 2 feature maps, which maps fs to a 2-dimensional vector and outputs it.
S1.3.3. A fully connected softmax classifier generates the final prediction over the output 2-dimensional vector, i.e., whether the input image pair comes from the same class.
S1.3.4. For a same-class or different-class input image pair q, define the loss function:
where i denotes the i-th dimension of the 2-dimensional vector, q is the input image pair, and s is the predicted class, i.e., whether the deep residual network predicts that the two images belong to the same class.
Step S1.4 is specifically as follows:
S1.4.1. For an input image pair (xi, xj), compute the Euclidean distance between the feature vectors f1, f2 output by the two deep residual networks, and define the regularization loss function of same-class image pairs:
where (xi, xj) denotes the two input images and D(xi, xj) denotes the distance between images xi and xj in feature space.
Step S3.1 is specifically as follows:
S3.1.1. Optimize the 3 loss functions of steps S1.1.3, S1.2.3, and S1.3.4 using the batch gradient descent method.
S3.1.2. Set the weights of the 3 loss functions to λ1, λ2, λ3 respectively.
S3.1.3. Carry out parameter tests through a series of experiments to determine the optimal weight values.
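The loss equations referenced in steps S1.1.3, S1.2.3, S1.3.4, and S1.4.1 are rendered as images in the original publication and are not reproduced above. In standard Siamese re-identification formulations they take the following forms; this reconstruction is our hedged reading of the surrounding text, not the patent's exact notation:

```latex
% Identity losses of the two branches (S1.1.3, S1.2.3):
% cross-entropy between the softmax output and the identity label.
L_k(x) = -\sum_{c=1}^{n} \mathbb{1}[y(x) = c]\,\log \mathrm{output}_c(\mathrm{input}), \quad k = 1, 2

% Verification loss (S1.3.4): binary cross-entropy over the
% two-dimensional prediction s for an input pair q.
L_3(q) = -\sum_{i=1}^{2} \mathbb{1}[s = i]\,\log p_i(q)

% Regularization loss (S1.4.1): Euclidean feature distance,
% applied only to same-identity pairs (x_i, x_j).
L_4(x_i, x_j) = D(x_i, x_j) = \lVert f_1(x_i) - f_2(x_j) \rVert_2

% Weighted objective of steps S3.1.1-S3.1.2:
L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3
```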
Step S3.2 is specifically as follows:
S3.2.1. Train with the 3 loss functions until the losses are minimized to the optimum.
S3.2.2. Take out the trained deep residual network as the classification model for the next step.
Step S4 is specifically as follows:
S4.1. Construct the test sample set, consisting of two sets: the query set and the gallery set.
S4.2. Adjust the size of every image in the test sample set so that it can be input directly to the deep residual network; the network can then use the pre-trained model parameters, reducing the computation of model training.
Step S5 is specifically as follows:
S5.1. The classification model is a single-channel deep residual network whose corresponding input is a single image.
S5.2. The evaluation criteria are Rank-1 accuracy and mean average precision (mAP). Rank-1 accuracy is the probability that the first image in the search results is a correct result, generally averaged over repeated trials; mAP is the mean of the precision over multiple queries and represents the accuracy of retrieval. The higher the two values, the better the model performance.
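The two metrics of step S5.2, Rank-1 accuracy and mAP, can be computed from a query-to-gallery distance matrix as sketched below. This is a generic NumPy implementation of the standard re-ID evaluation, not code from the patent, and it assumes every query has at least one correct match in the gallery.

```python
import numpy as np

def rank1_and_map(dist, q_ids, g_ids):
    """dist: (num_query, num_gallery) feature-space distances.
    q_ids, g_ids: identity labels of query and gallery images.
    Returns (Rank-1 accuracy, mean average precision)."""
    rank1_hits, aps = [], []
    for qi in range(dist.shape[0]):
        order = np.argsort(dist[qi])           # gallery sorted by distance
        matches = (g_ids[order] == q_ids[qi])  # True where identity matches
        rank1_hits.append(bool(matches[0]))    # is the top-1 result correct?
        # average precision: precision at each correct hit position
        hits = np.flatnonzero(matches)
        precisions = (np.arange(len(hits)) + 1) / (hits + 1)
        aps.append(precisions.mean())
    return float(np.mean(rank1_hits)), float(np.mean(aps))
```

For example, with two queries whose correct gallery matches are all ranked first, both metrics equal 1.0; any correct match pushed down the ranking lowers mAP even if Rank-1 is unaffected.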
Compared with the prior art, the present invention has the following advantages:
First, the paired-sample random occlusion strategy proposed by the present invention can effectively expand the quantity of training samples when Siamese network samples are insufficient, while improving the diversity of the training samples.
Second, the paired-sample random occlusion strategy proposed by the present invention is a lightweight method: it requires no additional parameter learning or memory consumption and can be easily integrated into various Siamese deep learning models without changing the learning strategy.
Third, the paired-sample random occlusion strategy proposed by the present invention is complementary to existing data augmentation and regularization methods; combined with other regularization methods, the augmentation method can further improve recognition performance.
Fourth, applying the proposed paired-sample random occlusion strategy to the person re-identification problem can effectively improve the Rank-1 accuracy and mean average precision of Siamese deep models on re-identification.
Fifth, the proposed paired-sample random occlusion strategy effectively alleviates the insufficient training of model parameters of Siamese deep learning models caused by the lack of effective training samples, and at the same time effectively improves the robustness of deep learning models to occluded samples.
Sixth, by occluding the salient regions of samples in the training set, the proposed paired-sample random occlusion strategy lets the model also learn secondary salient features, thereby improving model performance.
Seventh, the results of multiple groups of experiments show that the datasets generated with the proposed data augmentation method can simulate pedestrian data under complex scenes very well, giving the trained model good generalization performance and a greater advantage in handling person re-identification problems in complex scenes.
Brief description of the drawings
Fig. 1 is the network structure of the present invention;
Fig. 2 is a step diagram of the present invention.
Specific embodiment
The present invention will be described in further detail below with reference to the accompanying drawings.
Referring to Fig. 1, the specific steps of the present invention are as follows:
S1. Construct the deep Siamese residual network model for person re-identification:
S1.1. Construct the deep residual network model (ResNet) with the structure: input layer → convolutional layer → residual block → residual block → residual block → residual block → residual block → normalization layer → fully connected softmax classifier, forming the deep convolutional neural network.
S1.2. Using the transfer learning strategy, import the model parameters trained on the ImageNet dataset.
S1.3. Remove the last two layers of the ResNet model, i.e., the "fc1000" and "prob" layers; add a dropout layer with rate 0.9, then add the convolutional layer conv_1 and a softmax classification layer, obtaining a new deep residual network model, i.e., the first deep residual network ResNet_1.
S1.4. Obtain the second deep residual network ResNet_2 by copying the structure and parameters of the first deep residual network ResNet_1 model.
S1.5. Compute the square of the difference between the 4096-dimensional feature vectors output by the first deep residual network ResNet_1 and the second deep residual network ResNet_2 to obtain a 4096-dimensional "diff_feature" layer; after this layer add a dropout layer with rate 0.9, then add the convolutional layer conv_2 and a softmax layer to perform binary classification, thereby judging whether the inputs of the Siamese network are images of the same class.
S1.6. Compute the Euclidean distance between the feature vectors output by the two residual networks, and for images of the same class use the Euclidean distance between their features as the regularization loss of the network model.
S2. Construct the training dataset:
S2.1. Randomly shuffle the order of the images in the dataset, then select another image from the same/a different class to form a positive/negative sample pair. To mitigate prediction bias, we set the initial ratio between negative pairs and positive pairs to 1:1, and during model parameter training multiply it by a factor of 1.01 each epoch until it reaches 4:1. This lets the model converge efficiently while effectively suppressing the risk of overfitting.
S2.2. Resize every image to 256 × 256 and randomly crop it to 224 × 224.
S2.3. From each batch of training data, randomly select 2/3 of the sample pairs and apply the paired-sample random occlusion strategy for data augmentation, increasing the diversity of the training samples. Specifically:
While executing the paired-sample random occlusion strategy, 90% of the selected pairs are occluded synchronously; for 5% of the pairs only the first image is randomly occluded, and for the remaining 5% only the second image is randomly occluded.
While executing the paired-sample random occlusion strategy, each image to be occluded is evenly divided into a 16 × 16 grid, so that every image is uniformly divided into 256 image blocks; a random number Nre between 1 and 128 is generated to record the number of image blocks to be occluded.
While executing the paired-sample random occlusion strategy, the positions of the Nre image blocks are generated at random, and the pixel values at the corresponding positions are replaced with the mean value of all images in the training sample set.
S3. Train the deep Siamese residual network model:
S3.1. Using the constructed training dataset, train the parameters of the deep Siamese residual network model with the batch gradient descent algorithm.
S3.2. After the parameters are trained, take out the first deep residual network ResNet_1 for feature extraction from pedestrian images.
S4. Construct the test samples, consisting of two sets: the query set and the gallery set.
S5. Test sample classification: feed the test samples into the trained first deep residual network ResNet_1 model for classification, and obtain the classification results at the model output layer.
S6. Output the classification results.
The deep residual network architecture parameters described in step S2.1 are as follows:
For the first layer, the input layer, the number of feature maps is set to 3, i.e., the three color channels of the image;
For the second layer, a convolutional layer, the number of feature maps is set to 64;
For the third layer, the first residual block of 9 layers, the number of feature maps is set to 64;
For the fourth layer, the second residual block of 3 layers, the number of feature maps is set to 64;
For the fifth layer, the third residual block of 6 layers, the number of feature maps is set to 128;
For the sixth layer, the fourth residual block of 3 layers, the number of feature maps is set to 256, with shortcut connections;
For the seventh layer, the fifth residual block of 6 layers, the number of feature maps is set to 256;
For the eighth layer, a normalization layer, batch normalization mode is set;
For the ninth layer, a pooling layer, the number of feature maps is set to 256;
For the tenth layer, a fully connected softmax classifier, the number of feature maps is set to the number of pedestrian classes.
Step S1.1 is specifically as follows:
S1.1.1. Remove the "fc1000" and "prob" layers to obtain the 4096-dimensional feature vector f1 output by the first deep residual network ResNet_1.
S1.1.2. Set the number of feature maps of the convolutional layer conv_1 to the number of pedestrian identity classes n; conv_1 maps f1 to an n-dimensional vector, and the softmax classification layer outputs the final class prediction.
S1.1.3. For the input input and the softmax output output, define the loss function:
where x denotes all pedestrian data input to the network, input denotes the input of the deep Siamese network, and output denotes the output of the deep Siamese network.
Step S1.2 is specifically as follows:
S1.2.1. Remove the "fc1000" and "prob" layers to obtain the 4096-dimensional feature vector f2 output by the second deep residual network ResNet_2.
S1.2.2. Set the number of feature maps of the convolutional layer conv_1 to the number of pedestrian identity classes n; conv_1 maps f2 to an n-dimensional vector, and the softmax classification layer outputs the final class prediction.
S1.2.3. For the input input and the softmax output output, define the loss function:
where x denotes all pedestrian data input to the network, input denotes the input of the deep Siamese network, and output denotes the output of the deep Siamese network.
Step S1.3 is specifically as follows:
S1.3.1. Set up a square layer that takes the squared difference of the feature vectors f1, f2 output by the two deep residual networks, obtaining fs = (f1 − f2)².
S1.3.2. The convolutional layer conv_2 has 2 feature maps and maps fs to a 2-dimensional vector.
S1.3.3. A fully connected softmax classifier generates the final prediction over the output 2-dimensional vector, i.e., whether the input image pair comes from the same class.
S1.3.4. For an input image pair q (same/different class), define the loss function:
where i denotes the i-th dimension of the 2-dimensional vector, q is the input image pair, and s is the predicted class, i.e., whether the deep residual network predicts that the two images belong to the same class.
Step S1.4 is specifically as follows:
S1.4.1. For an input image pair (xi, xj), compute the Euclidean distance between the feature vectors f1, f2 output by the two deep residual networks.
S1.4.2. Define the regularization loss function of same-class image pairs:
where (xi, xj) denotes the two input images and D(xi, xj) denotes the distance between images xi and xj in feature space.
Step S3 constructs the training set as follows:
Shuffle the order of the images in the training set and generate training pairs, controlling the ratio between different-class pairs and same-class pairs so that it gradually increases from 1:1 to 4:1.
Step S3.1 is specifically as follows:
S3.1.1. Optimize the 3 loss functions of steps S1.1.3, S1.2.3, and S1.3.4 using the batch gradient descent method.
S3.1.2. Set the weights of the 3 loss functions to λ1, λ2, λ3 respectively.
S3.1.3. Carry out parameter tests through a series of experiments to determine the optimal weight values.
Step S4.2 is specifically as follows:
S4.2.1. Train with the 3 loss functions until the losses are minimized to the optimum.
S4.2.2. Take out the trained network as the classification model for the next step.
Step S5 constructs the test samples as follows:
S5.1. The test samples consist of two sets: the query set and the gallery set.
S5.2. Resize every image in the test samples to 224 × 224.
Step S6 is specifically as follows:
S6.1. The classification model is a single-channel ResNet whose corresponding input is a single image.
S6.2. The evaluation criteria are Rank-1 accuracy and mAP, i.e., the accuracy with which the first image in the results is a correct result, and the mean of the precision over multiple queries.
The effects of the present invention are further described below:
1. Experimental conditions:
The experiments of the present invention were carried out under a hardware environment of dual NVIDIA GTX 1080Ti GPUs and the software environment of MATLAB 2017.
The experiments of the present invention used three person re-identification datasets: Market-1501, DukeMTMC-reID, and CUHK03.
The Market-1501 dataset was collected on the Tsinghua University campus; the images come from 6 different cameras, one of which is low-resolution. The dataset provides both a training set and a test set. The training set contains 12,936 images and the test set contains 19,732 images. The images were detected and cropped automatically by a detector and include some detection errors (close to practical use conditions). There are 751 people in the training data and 750 people in the test set, so in the training set each class (each person) has on average 17.2 training images.
The DukeMTMC-reID dataset was collected at Duke University; the images come from 8 different cameras. The dataset provides a training set and a test set. The training set contains 16,522 images and the test set contains 17,661 images. There are 702 people in the training data, and each class (each person) has on average 23.5 training images. It is the largest person re-identification dataset at present, and it provides annotations of pedestrian attributes (gender / sleeve length / whether carrying a backpack, etc.).
The CUHK03 dataset was collected at The Chinese University of Hong Kong; the images come from 2 different cameras. The dataset provides two versions: machine-detected and manually annotated. The detected version includes some detection errors and is closer to practical conditions. Each person has on average 9.6 training images.
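The per-identity averages quoted above follow directly from the stated totals, as a quick check confirms (values taken from the text, not recomputed from the datasets themselves):

```python
# Market-1501: 12,936 training images over 751 identities
market_avg = 12936 / 751
# DukeMTMC-reID: 16,522 training images over 702 identities
duke_avg = 16522 / 702

print(round(market_avg, 1))  # 17.2, as stated
print(round(duke_avg, 1))    # 23.5, as stated
```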
2. Analysis of results:
In the simulation experiments of the present invention, the three datasets were classified (1) with training that does not use the proposed data augmentation method (DisCNN) and (2) with training that uses the proposed data augmentation method (Improved DisCNN), and the classification results were compared and analyzed.
Table 1 is a statistical comparison of overall accuracy for the experiments of the present invention using three convolutional neural network models and the method of the present invention. In Table 1, "dataset" indicates the type of person re-identification dataset used; "method" includes DisCNN, which does not use the present invention, and Improved DisCNN, which uses the present invention; "Accuracy" indicates the classification accuracy; Rank-1 indicates the probability that the first retrieved result is the correct pedestrian; "Verif+identif" indicates the Siamese network without the pedestrian alignment network; "Base+Align" indicates the pedestrian alignment network without the Siamese structure; and "(Base+Verif)+(Align+Verif)" indicates the method used by the present invention.
Table 1. Comparison of person re-identification results
As can be seen from Table 1, the results of the method of the present invention on all three datasets are superior to the other methods.
Claims (9)
1. A data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task, characterized in that:
in the training stage, the diversity of the samples is first increased by the data augmentation method based on the paired-sample random occlusion strategy, which improves the robustness of the deep pedestrian re-identification model during training and thereby improves the generalization ability of the model;
in the test stage, the pedestrian re-identification task can be carried out effectively without applying any occlusion processing to the test images; the method specifically comprises the following steps:
S1. Construct a Siamese deep residual network for pedestrian re-identification
S1.1. Construct a first deep residual network: using a transfer-learning strategy, import the parameters of a residual network pre-trained on the ImageNet dataset as the base parameters of the first deep residual network;
S1.2. Obtain a second deep residual network by copying the model structure and parameters of the first deep residual network;
S1.3. Compute the square of the difference between the feature vectors output by the two deep residual networks, and use a convolutional layer and a classifier for binary classification to judge whether the inputs of the two deep residual networks are images of the same class;
S1.4. Compute the Euclidean distance between the feature vectors output by the two residual networks, and for same-class images use the Euclidean distance between their features as a regularization loss of the network model;
S2. Construct the training dataset
S2.1. Shuffle the order of the images in the training set and generate training data pairs; during model parameter training, multiply the number of different-class image pairs by a factor of 1.01 in each epoch until the ratio of different-class to same-class image pairs grows gradually from 1:1 to 4:1;
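The 1.01-per-epoch growth in step S2.1 can be checked with a few lines of arithmetic; this is an illustrative sketch (the function name and loop are not from the patent):

```python
import math

# Ratio schedule of step S2.1: the different-class pair count grows by a
# factor of 1.01 per epoch until the different-class : same-class ratio
# reaches 4:1 from a starting 1:1.
def epochs_until_ratio(start=1.0, target=4.0, factor=1.01):
    """Count the epochs needed for start * factor**k to reach target."""
    epochs = 0
    ratio = start
    while ratio < target:
        ratio *= factor
        epochs += 1
    return epochs

# Closed form: ceil(ln(target/start) / ln(factor)) = ceil(139.32...) = 140
print(math.ceil(math.log(4.0) / math.log(1.01)))  # → 140
print(epochs_until_ratio())                        # → 140
```

So the 4:1 cap is reached after roughly 140 epochs under this schedule.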
S2.2. Resize every image to 256 × 256 and randomly crop it to 224 × 224;
S2.3. From each batch of training data, randomly select 2/3 of the samples and apply the paired-sample random occlusion strategy to them for data augmentation, increasing the diversity of the training samples; the remaining 1/3 of the samples are left unchanged. Specifically:
while executing the paired-sample random occlusion strategy, randomly select 90% of the sample pairs for synchronized occlusion, i.e. the two images are occluded in the same region; for 5% of the sample pairs, the first image is occluded at random and the second image is left unoccluded; for the remaining 5%, the second image is occluded at random and the first image is left unoccluded;
while executing the paired-sample random occlusion strategy, divide each image to be occluded evenly into a 16 × 16 grid, so that every image is uniformly divided into 256 image blocks, and randomly generate a random number Nre between 1 and 128 to record the number of image blocks to be occluded;
while executing the paired-sample random occlusion strategy, randomly generate the positions of the Nre image blocks and replace the pixel values at the corresponding positions with the mean value of all images in the training set;
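The occlusion sub-steps above can be sketched as follows. This is a minimal illustration assuming 224 × 224 crops, returning sets of grid-block indices rather than modified images; all names are hypothetical, not the patent's implementation:

```python
import random

# Sketch of the paired-sample random occlusion strategy of step S2.3:
# each crop is divided into a 16 x 16 grid (256 blocks of 14 x 14 pixels)
# and Nre in [1, 128] block positions are chosen for occlusion. A real
# pipeline would overwrite those blocks with the training-set mean pixel value.
GRID = 16             # 16 x 16 grid -> 256 blocks per image
BLOCK = 224 // GRID   # 14 x 14 pixels per block

def occlusion_mask(rng):
    """Choose Nre random block indices to occlude in one image."""
    nre = rng.randint(1, 128)   # number of blocks to occlude
    return set(rng.sample(range(GRID * GRID), nre))

def block_to_pixels(b):
    """Top-left pixel (row, col) of grid block b."""
    return (b // GRID) * BLOCK, (b % GRID) * BLOCK

def paired_masks(rng):
    """90% synchronized occlusion, 5% first-image-only, 5% second-image-only."""
    r = rng.random()
    if r < 0.90:                        # same region occluded in both images
        m = occlusion_mask(rng)
        return m, m
    elif r < 0.95:                      # only the first image is occluded
        return occlusion_mask(rng), set()
    else:                               # only the second image is occluded
        return set(), occlusion_mask(rng)
```

Returning block indices keeps the sketch independent of any image library; mapping a block index to its pixel window is the only geometry involved.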
S3. Train the Siamese deep residual network model for pedestrian re-identification
S3.1. Using the training dataset constructed in step S2, perform parameter training of the Siamese deep residual network for pedestrian re-identification with the batch gradient descent algorithm;
S3.2. After the parameters are trained, take out the first deep residual network for feature extraction from pedestrian images;
S4. Construct the test sample set, comprising two sets: a query set and a gallery set;
S5. Test the re-identification performance
Feed all images in the test sample set into the trained first deep residual network for feature extraction, and search for the target pedestrian according to the Euclidean distance in feature space between the query set and the gallery set;
S6. Output the pedestrian re-identification result.
2. The data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task according to claim 1, characterized in that step S1.1 is specified as follows:
S1.1.1. Remove the last fully connected layer and the probability layer of an existing deep residual network to form the first deep residual network, which outputs the feature vector f1 of the input image;
S1.1.2. Add a convolutional layer and a fully connected softmax classifier to the first deep residual network; set the number of feature maps of the convolutional layer to the number of pedestrian identity classes n, so that the convolutional layer maps f1 to an n-dimensional vector and the fully connected classifier outputs the final classification prediction;
S1.1.3. Define a loss function over the input and output of the first deep residual network:
where x denotes all the pedestrian data input to the network, input denotes the input of the Siamese deep network, and output denotes the output of the Siamese deep network.
3. The data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task according to claim 1, characterized in that step S1.2 is specified as follows:
S1.2.1. Remove the last fully connected layer and the probability layer of the first deep residual network to form the second deep residual network, which outputs the feature vector f2 of the input image;
S1.2.2. Add a convolutional layer and a fully connected softmax classifier to the second deep residual network; set the number of feature maps of the convolutional layer to the number of pedestrian identity classes n, so that the convolutional layer maps f2 to an n-dimensional vector and the fully connected classifier outputs the final classification prediction;
S1.2.3. Define a loss function over the input and output of the second deep residual network:
where x denotes all the pedestrian data input to the network, input denotes the input of the Siamese deep network, and output denotes the output of the Siamese deep network.
4. The data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task according to claim 1, characterized in that step S1.3 is specified as follows:
S1.3.1. Set up a square layer that takes the squared difference of the feature vectors f1, f2 output by the two deep residual networks, giving f_s = (f1 − f2)²;
S1.3.2. Set up a convolutional layer with 2 feature maps, which maps f_s to a 2-dimensional vector and outputs it;
S1.3.3. A fully connected softmax classifier generates the final prediction from the output 2-dimensional vector, i.e. whether the input image pair comes from the same class;
S1.3.4. For a same-class or different-class input image pair q, define the loss function:
where i denotes the i-th dimension of the 2-dimensional vector, q is the input image pair, and s is the class predicted by the deep residual network for whether the two images belong to the same class.
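The square layer of step S1.3.1 reduces to an element-wise operation; a minimal sketch follows, with plain Python lists standing in for feature tensors (names are illustrative, not from the patent):

```python
# Square layer of step S1.3.1: element-wise squared difference
# f_s = (f1 - f2)**2 of the two networks' feature vectors, which a
# 2-feature-map conv layer then maps to a same/different prediction.
def square_layer(f1, f2):
    """Element-wise squared difference of two equal-length feature vectors."""
    return [(a - b) ** 2 for a, b in zip(f1, f2)]

print(square_layer([1.0, 2.0, 3.0], [1.0, 0.0, 5.0]))  # → [0.0, 4.0, 4.0]
```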
5. The data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task according to claim 1, characterized in that step S1.4 is specified as follows:
S1.4.1. For an input image pair (xi, xj), compute the Euclidean distance between the feature vectors f1, f2 output by the two deep residual networks, and define the regularization loss function for same-class image pairs:
where (xi, xj) denotes the two input images and D(xi, xj) denotes the distance between images xi and xj in feature space.
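The same-class regularization term of step S1.4 is just the Euclidean distance between the two feature vectors; a minimal list-based sketch (the function name is illustrative):

```python
import math

# Regularization loss of step S1.4: D(xi, xj) = ||f1 - f2||_2 between the
# feature vectors of a same-class image pair is used directly as the loss.
def euclidean_regular_loss(f1, f2):
    """||f1 - f2||_2 for two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

print(euclidean_regular_loss([3.0, 0.0], [0.0, 4.0]))  # → 5.0
```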
6. The data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task according to claim 1, characterized in that step S3.1 is specified as follows:
S3.1.1. Optimize the 3 loss functions of step S1.1.3, step S1.2.3 and step S1.3.4 with the batch gradient descent method;
S3.1.2. Set the weights of the 3 loss functions to λ1, λ2 and λ3 respectively;
S3.1.3. Determine the optimal weight values through a series of parameter-tuning experiments.
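The weighted combination of the three losses in step S3.1 can be sketched as below; the weights λ1, λ2, λ3 used here are placeholders, since the patent determines the optimal values experimentally:

```python
# Joint objective of step S3.1: weighted sum λ1·L1 + λ2·L2 + λ3·L3 of the
# two identity-classification losses and the verification loss. The weight
# values are placeholders, not the patent's tuned values.
def total_loss(l_id1, l_id2, l_verif, w=(1.0, 1.0, 1.0)):
    """Weighted sum of the three training losses."""
    return w[0] * l_id1 + w[1] * l_id2 + w[2] * l_verif

print(total_loss(0.5, 0.25, 0.25))  # → 1.0
```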
7. The data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task according to claim 1, characterized in that step S3.2 is specified as follows:
S3.2.1. Train with the 3 loss functions until each loss is minimized to its optimum;
S3.2.2. Take out the trained deep residual network as the classification model for the next step.
8. The data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task according to claim 1, characterized in that step S4 is specified as follows:
S4.1. Construct the test sample set, comprising two sets: a query set and a gallery set;
S4.2. Resize every image in the test sample set so that it can be fed directly into the deep residual network; the network can then use the pre-trained model parameters, reducing the computation required for model training.
9. The data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task according to claim 1, characterized in that step S5 is specified as follows:
S5.1. The classification model is a single-branch deep residual network whose corresponding input is a single image;
S5.2. The evaluation criteria are the rank-1 hit rate and the mean average precision, where the rank-1 hit rate is the probability that the top-ranked image in the search results is the correct result, averaged over repeated tests, and the mean average precision is the mean of the accuracy over repeated queries, representing the query accuracy; the higher these two values, the better the model performance.
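The two evaluation criteria of step S5.2 can be sketched as follows. This is a simplified illustration assuming a precomputed query-by-gallery distance matrix; real re-identification protocols additionally exclude same-camera matches, and all names are hypothetical:

```python
# Sketch of step S5.2's criteria: Rank-1 is the fraction of queries whose
# nearest gallery image has the correct identity; mean average precision
# (mAP) averages, over queries, the precision at each correct hit in the
# ranked gallery list.
def rank1_and_map(dist, query_labels, gallery_labels):
    hits, aps = 0, []
    for qi, qlabel in enumerate(query_labels):
        # Gallery indices sorted by ascending distance to this query.
        order = sorted(range(len(gallery_labels)), key=lambda gi: dist[qi][gi])
        if gallery_labels[order[0]] == qlabel:
            hits += 1                              # Rank-1 hit
        correct, precisions = 0, []
        for rank, gi in enumerate(order, start=1):
            if gallery_labels[gi] == qlabel:
                correct += 1
                precisions.append(correct / rank)  # precision at this hit
        aps.append(sum(precisions) / max(correct, 1))
    return hits / len(query_labels), sum(aps) / len(aps)
```

For a single query of identity 0 against gallery identities [0, 1, 0] with distances [0.9, 0.1, 0.5], the nearest image has the wrong identity, so Rank-1 is 0.0, while the query's average precision is (1/2 + 2/3)/2 ≈ 0.583.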
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811352790.4A CN109670528B (en) | 2018-11-14 | 2018-11-14 | Data expansion method facing pedestrian re-identification task and based on paired sample random occlusion strategy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109670528A true CN109670528A (en) | 2019-04-23 |
CN109670528B CN109670528B (en) | 2023-04-18 |
Family
ID=66142681
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811352790.4A Active CN109670528B (en) | 2018-11-14 | 2018-11-14 | Data expansion method facing pedestrian re-identification task and based on paired sample random occlusion strategy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109670528B (en) |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109740679A (en) * | 2019-01-13 | 2019-05-10 | 胡燕祝 | A kind of target identification method based on convolutional neural networks and naive Bayesian |
CN110097033A (en) * | 2019-05-15 | 2019-08-06 | 成都电科智达科技有限公司 | A kind of single sample face recognition method expanded based on feature |
CN110197163A (en) * | 2019-06-04 | 2019-09-03 | 中国矿业大学 | A kind of target tracking sample extending method based on pedestrian's search |
CN110516533A (en) * | 2019-07-11 | 2019-11-29 | 同济大学 | A kind of pedestrian based on depth measure discrimination method again |
CN110765880A (en) * | 2019-09-24 | 2020-02-07 | 中国矿业大学 | Light-weight video pedestrian heavy identification method |
CN111428650A (en) * | 2020-03-26 | 2020-07-17 | 北京工业大学 | Pedestrian re-identification method based on SP-PGGAN style migration |
CN111476234A (en) * | 2020-03-17 | 2020-07-31 | 平安科技(深圳)有限公司 | Method and device for recognizing characters of shielded license plate, storage medium and intelligent equipment |
CN111738213A (en) * | 2020-07-20 | 2020-10-02 | 平安国际智慧城市科技股份有限公司 | Person attribute identification method and device, computer equipment and storage medium |
CN111783598A (en) * | 2020-06-24 | 2020-10-16 | 北京百度网讯科技有限公司 | Face recognition model training method, device, equipment and medium |
CN111931801A (en) * | 2020-05-28 | 2020-11-13 | 浙江大学 | Dynamic routing network learning method based on path diversity and consistency |
CN112215303A (en) * | 2020-11-05 | 2021-01-12 | 北京理工大学 | Image understanding method and system based on self-learning attribute |
CN112396036A (en) * | 2020-12-09 | 2021-02-23 | 中山大学 | Method for re-identifying blocked pedestrians by combining space transformation network and multi-scale feature extraction |
CN112784929A (en) * | 2021-03-14 | 2021-05-11 | 西北工业大学 | Small sample image classification method and device based on double-element group expansion |
CN113033410A (en) * | 2021-03-26 | 2021-06-25 | 中山大学 | Domain generalization pedestrian re-identification method, system and medium based on automatic data enhancement |
CN113095263A (en) * | 2021-04-21 | 2021-07-09 | 中国矿业大学 | Method and device for training heavy identification model of pedestrian under shielding and method and device for heavy identification of pedestrian under shielding |
CN113128323A (en) * | 2020-01-16 | 2021-07-16 | 中国矿业大学 | Remote sensing image classification method and device based on coevolution convolutional neural network learning |
CN113837015A (en) * | 2021-08-31 | 2021-12-24 | 艾普工华科技(武汉)有限公司 | Face detection method and system based on feature pyramid |
CN111144203B (en) * | 2019-11-19 | 2023-06-16 | 浙江工商大学 | Pedestrian shielding detection method based on deep learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107679465A (en) * | 2017-09-20 | 2018-02-09 | 上海交通大学 | A kind of pedestrian's weight identification data generation and extending method based on generation network |
CN108596211A (en) * | 2018-03-29 | 2018-09-28 | 中山大学 | It is a kind of that pedestrian's recognition methods again is blocked based on focusing study and depth e-learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109670528A (en) | Data augmentation method based on a paired-sample random occlusion strategy for the pedestrian re-identification task | |
US11195051B2 (en) | Method for person re-identification based on deep model with multi-loss fusion training strategy | |
Zhang et al. | Hyperspectral classification based on lightweight 3-D-CNN with transfer learning | |
CN106503687B (en) | Surveillance-video person identification system fusing multi-angle facial features and method thereof | |
CN104992142B (en) | Pedestrian recognition method combining deep learning and attribute learning | |
CN105184309B (en) | Polarimetric SAR image classification based on CNN and SVM | |
CN104217214B (en) | RGB-D person activity recognition method based on configurable convolutional neural networks | |
CN111666843B (en) | Pedestrian re-recognition method based on global feature and local feature splicing | |
CN109543602A (en) | Pedestrian re-identification method based on multi-view image feature decomposition | |
CN107463920A (en) | Face recognition method eliminating the influence of partial occlusions | |
CN104504362A (en) | Face detection method based on convolutional neural network | |
WO2017113232A1 (en) | Product classification method and apparatus based on deep learning | |
CN109376603A (en) | Video recognition method and apparatus, computer device and storage medium | |
CN108171184A (en) | Pedestrian re-identification method based on Siamese networks | |
CN109447099B (en) | Multi-classifier fusion method based on PCA dimension reduction | |
CN109063649A (en) | Pedestrian re-identification method based on a Siamese pedestrian-alignment residual network | |
CN109359541A (en) | Sketch face recognition method based on deep transfer learning | |
CN106951867A (en) | Face recognition method, apparatus, system and device based on convolutional neural networks | |
CN106529499A (en) | Gait recognition method based on fused Fourier descriptor and gait energy image features | |
CN108345860A (en) | Person re-identification method based on deep learning and distance metric learning | |
CN107563280A (en) | Face recognition method and device based on multiple models | |
CN110348416A (en) | Multi-task face recognition method based on multi-scale feature fusion convolutional neural network | |
CN108520114A (en) | Textile fabric defect detection model, training method and application thereof | |
CN113239801B (en) | Cross-domain action recognition method based on multi-scale feature learning and multi-level domain alignment | |
Xu et al. | Robust self-ensembling network for hyperspectral image classification |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||