CN109063649A

CN109063649A - Pedestrian re-identification method based on a Siamese pedestrian-alignment residual network

Info

Publication number: CN109063649A
Application number: CN201810876899.1A
Authority: CN
Inventors: 周勇; 郑沂; 赵佳琦; 姚睿; 刘兵; 夏士雄; 刘栩宁
Original assignee: China University of Mining and Technology CUMT
Current assignee: China University of Mining and Technology CUMT
Priority date: 2018-08-03
Filing date: 2018-08-03
Publication date: 2018-12-21
Anticipated expiration: 2038-08-03
Also published as: CN109063649B

Abstract

The invention discloses a pedestrian re-identification method based on a Siamese pedestrian-alignment residual network, comprising the following steps: S1, constructing a base-branch Siamese residual network; S2, constructing a pedestrian-alignment-branch Siamese residual network; S3, training the parameters of the constructed base-branch Siamese residual network and pedestrian-alignment-branch Siamese residual network on the constructed training dataset, then taking the base-branch prototype out of the trained base-branch Siamese residual network and the pedestrian-alignment-branch prototype out of the trained pedestrian-alignment-branch Siamese residual network to serve as the classification model for pedestrian re-identification. The invention improves on the pedestrian re-identification accuracy of the original algorithm.

Description

Pedestrian re-identification method based on a Siamese pedestrian-alignment residual network

Technical field

The invention belongs to the technical field of image retrieval, which uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. More specifically, it relates to a pedestrian re-identification method based on a Siamese pedestrian-alignment residual network, within the field of pedestrian re-identification.

Background art

In surveillance video, images suitable for face recognition often cannot be obtained, because of background occlusion and the low resolution that results when pedestrians are far from the camera. When face recognition cannot be used normally, pedestrian re-identification becomes a very important substitute. A key characteristic of pedestrian re-identification is that it works across cameras, so academic papers evaluate performance by retrieving images of the same pedestrian under different cameras. Pedestrian re-identification has been studied for many years, but only in recent years, with the development of deep learning, has it achieved major breakthroughs in academia.

Traditional image-based pedestrian re-identification algorithms that rely on feature representation fall roughly into the following categories:

(1) Low-level visual features: these methods generally divide an image into multiple regions, extract several different low-level visual features from each region, and combine them into a more robust feature representation. The most common such feature is the color histogram;

(2) Mid-level semantic attributes: semantic information such as color, clothing, and carried bags is used to judge whether two images show the same pedestrian, since the semantic attributes of a given pedestrian seldom change across different video captures;

(3) High-level visual features: feature selection techniques are used to raise the recognition rate of pedestrian re-identification. The biggest difference between deep-learning-based pedestrian re-identification and traditional methods is that it requires no manual feature selection; through end-to-end learning it automatically learns the various features in pedestrian images.

In the field of pedestrian re-identification, when features are selected manually it is difficult to find a specific feature that performs well on all images, because the number of candidate features is large and the pictures actually captured by cameras vary widely. Compared with manually selected pedestrian features, methods based on deep learning models can therefore achieve better results.

Existing deep learning models mainly belong to the class of convolutional neural networks; commonly used models include CaffeNet, VGGNet, and residual networks.

Summary of the invention

The invention proposes a pedestrian re-identification method based on a Siamese pedestrian-alignment residual network. It effectively improves the precision of the whole network and raises the accuracy of pedestrian re-identification.

In addition, the network adopts a two-channel structure: pedestrian images are input in pairs, and the pairs include both same-class and different-class images, which provides feedback from positive and negative samples and lets the network learn discriminative features.

To achieve the above technical purpose, the invention adopts the following technical solution:

A pedestrian re-identification method based on a Siamese pedestrian-alignment residual network, comprising the following steps:

S1, construct the base-branch Siamese residual network;

S1.1, construct the first base-branch deep residual network: using a transfer learning strategy, import residual network parameters pre-trained on the ImageNet dataset as the initial parameters of the first base-branch deep residual network;

S1.2, obtain the second base-branch deep residual network by copying the model structure and parameters of the first base-branch deep residual network;

S1.3, compute the square of the difference between the feature vectors output by the two base-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two base-branch deep residual networks are images of the same class;

S2, construct the pedestrian-alignment-branch Siamese residual network;

S2.1, construct the first pedestrian-alignment-branch deep residual network: take either the trained first or the trained second base-branch deep residual network and delete the residual block that regresses the high-dimensional feature maps into a feature vector;

regress the output high-dimensional feature maps, via a residual block, into the parameters of an affine transformation, and apply the affine transformation to the output low-dimensional feature maps to obtain the aligned pedestrian image;

take either the trained first or the trained second base-branch deep residual network, remove the residual block that produces the low-dimensional feature maps subjected to the affine transformation together with the residual blocks before it, and train the remaining network on the aligned pedestrian image;

S2.2, copy the model structure and parameters of the first pedestrian-alignment-branch deep residual network to obtain the second pedestrian-alignment-branch deep residual network;

S2.3, compute the square of the difference between the feature vectors output by the two pedestrian-alignment-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two branch networks are images of the same class;

S3, train the parameters of the constructed base-branch Siamese residual network and pedestrian-alignment-branch Siamese residual network on the constructed training dataset, then take the base-branch prototype out of the trained base-branch Siamese residual network and the pedestrian-alignment-branch prototype out of the trained pedestrian-alignment-branch Siamese residual network as the classification model for pedestrian re-identification;

S3.1, using the constructed training dataset, train the parameters of the constructed base-branch Siamese residual network and pedestrian-alignment-branch Siamese residual network separately with batch gradient descent;

S3.2, after the parameters have been trained, take any one branch out of the base-branch Siamese residual network and any one branch out of the pedestrian-alignment-branch Siamese residual network as the pedestrian image classification model;

S4, construct the test samples and the query samples;

S5, test sample classification: feed the test samples into the trained base-branch deep residual network and the trained pedestrian-alignment-branch deep residual network respectively for feature extraction;

S6, concatenate the features obtained from the two branch networks;

S7, compute the Euclidean distance between the images of the test samples and the image of the query sample to obtain a sorted list;

S8, perform pedestrian re-identification on the basis of re-ranking.

Step S1.1 is specifically as follows:

S1.1.1, output the 2048-dimensional vector f₁ obtained after average pooling;

S1.1.2, set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps f₁ to an n-dimensional vector, and the final class prediction is output by a fully connected classifier;

S1.1.3, for the input and output of the base-branch Siamese residual network, define the first loss function:

$$\hat{p} = \mathrm{softmax}(\theta_I \circ f), \qquad \mathrm{Identif}(f, t, \theta_I) = \sum_{i} -p_i \log(\hat{p}_i)$$

where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, and f is the feature vector obtained by feature extraction with the base-branch deep residual network. p̂ is the probability, output by the classifier function, that the feature vector f belongs to a given pedestrian class t. For any image i and a given pedestrian class t, p_i indicates whether image i belongs to class t: p_i = 1 if it does, otherwise p_i = 0. p̂_i is the probability value obtained for image i after the softmax function.
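The first loss function can be illustrated with a small numerical sketch. It assumes the usual one-hot cross-entropy reading of the definition above (only the target class t has p = 1), and it replaces the convolutional layer θ_I by a ready-made list of class scores. The function names and the example scores are hypothetical and are not part of the original disclosure.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def identification_loss(logits, target):
    """Cross-entropy between the softmax prediction and a one-hot target.

    logits: n class scores, standing in for the output of the
            convolutional layer applied to the feature vector f.
    target: index t of the true pedestrian class (p_t = 1, all other p = 0).
    """
    p_hat = softmax(logits)
    # With a one-hot target the sum over classes reduces to a single term.
    return -math.log(p_hat[target])


# Hypothetical scores for n = 3 pedestrian classes.
scores = [2.0, 0.5, -1.0]
loss_correct = identification_loss(scores, target=0)  # highest-scoring class
loss_wrong = identification_loss(scores, target=2)    # lowest-scoring class
```

The loss is small when the true class receives the highest score and grows as the predicted probability of the true class falls.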

Step S1.3 is specifically as follows:

S1.3.1, set up a square layer that takes the squared difference of the feature vectors f₁ and f₂ output by the two base-branch deep residual networks, giving f_s = (f₁ − f₂)²;

S1.3.2, set up a convolutional layer with 2 feature maps that maps f_s to a 2-dimensional vector;

S1.3.3, feed the output of S1.3.2 to a fully connected binary classifier to produce the final prediction, i.e. whether the input image pair comes from the same class;

S1.3.4, for an input image pair q of the same class or of different classes, define the second loss function:

$$\hat{q} = \mathrm{softmax}(\theta_S \circ f_s), \qquad \mathrm{Verif}(f_1, f_2, s, \theta_S) = \sum_{i=1}^{2} -q_i \log(\hat{q}_i)$$

where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_S denotes the parameters of the convolutional layer used, s is one of the two classes "same" or "different", and f_s is the feature vector produced by the square layer and passed to the convolution. q̂ is the probability, output by the classifier function, that the pair described by f_s is of class s, i.e. the same pedestrian or not. f₁ and f₂ are the features extracted by the two base-branch deep residual networks that form the base-branch Siamese residual network. If f₁ and f₂ belong to the same person, q₁ = 1 and q₂ = 0; otherwise q₁ = 0 and q₂ = 1. After the convolutional layer and the softmax function, f_s is mapped to a two-dimensional vector q̂ = (q̂₁, q̂₂), which represents the probability that the two input images belong to the same pedestrian class, where q̂₁ + q̂₂ = 1.
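Steps S1.3.1 to S1.3.4 can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the convolutional layer with 2 feature maps is modelled as a plain linear map (which is what a 1×1 convolution over a single feature vector amounts to), and the weights, bias, and toy feature vectors are hypothetical values chosen only for the example.

```python
import math


def square_layer(f1, f2):
    """Element-wise squared difference f_s = (f1 - f2)^2."""
    return [(a - b) ** 2 for a, b in zip(f1, f2)]


def linear_map(weights, bias, x):
    """Stand-in for the 2-output convolutional layer (assumed 1x1 kernel)."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def verification_loss(f1, f2, same, weights, bias):
    """Binary cross-entropy over (same, different) for one image pair."""
    f_s = square_layer(f1, f2)
    q_hat = softmax(linear_map(weights, bias, f_s))
    q = (1.0, 0.0) if same else (0.0, 1.0)
    return -sum(qi * math.log(qh) for qi, qh in zip(q, q_hat))


# Hypothetical 3-dimensional features and layer parameters.
f1 = [1.0, 2.0, 3.0]
f2 = [1.0, 2.0, 3.5]
W = [[-1.0, -1.0, -1.0],   # "same" score falls as the features diverge
     [1.0, 1.0, 1.0]]      # "different" score rises as they diverge
b = [1.0, 0.0]

f_s = square_layer(f1, f2)
loss_if_same = verification_loss(f1, f2, True, W, b)
loss_if_different = verification_loss(f1, f2, False, W, b)
```

Because the two toy features are close, the "same" label yields the lower loss with these example weights.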

Step S2.1 is specifically as follows:

S2.1.1, output the 2048-dimensional vector f_a obtained after average pooling;

S2.1.2, set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps f_a to an n-dimensional vector, and the final class prediction is output by a fully connected classifier;

S2.1.3, for the input and output of the pedestrian-alignment-branch deep Siamese residual network, define the third loss function:

$$\hat{p} = \mathrm{softmax}(\theta_I \circ f_a), \qquad \mathrm{Identif}(f_a, t, \theta_I) = \sum_{i} -p_i \log(\hat{p}_i)$$

where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_I denotes the parameters of the convolutional layer used, t is the pedestrian class, and f_a is the feature vector obtained by feature extraction with the pedestrian-alignment-branch deep residual network. p̂ is the probability, output by the classifier function, that the feature vector f_a belongs to a given pedestrian class t. For any image i and a given pedestrian class t, p_i indicates whether image i belongs to class t: p_i = 1 if it does, otherwise p_i = 0. p̂_i is the probability value obtained for image i after the softmax function.

Step S2.3 is specifically as follows:

S2.3.1, set up a square layer that takes the squared difference of the feature vectors f₁ and f₂ output by the two pedestrian-alignment-branch deep residual networks, giving f_s = (f₁ − f₂)²;

S2.3.2, set up a convolutional layer with 2 feature maps that maps f_s to a 2-dimensional vector;

S2.3.3, feed the output of S2.3.2 to a fully connected binary classifier to produce the final prediction, i.e. whether the input image pair comes from the same class;

S2.3.4, for an input image pair q of the same class or of different classes, define the fourth loss function:

$$\hat{q} = \mathrm{softmax}(\theta_s \circ f_s), \qquad \mathrm{Verif}(f_1, f_2, s, \theta_s) = \sum_{i=1}^{2} -q_i \log(\hat{q}_i)$$

where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_s denotes the parameters of the convolutional layer used, s is one of the two classes "same" or "different", and f_s is the feature vector produced by the square layer and passed to the convolution. q̂ is the probability, output by the classifier function, that the pair described by f_s is of class s, i.e. the same pedestrian or not. f₁ and f₂ are the features extracted by the two pedestrian-alignment-branch deep residual networks that form the pedestrian-alignment-branch Siamese residual network. If f₁ and f₂ belong to the same person, q₁ = 1 and q₂ = 0; otherwise q₁ = 0 and q₂ = 1. After the convolutional layer and the softmax function, f_s is mapped to a two-dimensional vector q̂ = (q̂₁, q̂₂), which represents the probability that the two input images belong to the same pedestrian class, where q̂₁ + q̂₂ = 1.

Compared with the prior art, the invention has the following advantages:

First, the invention uses both the identification model and the verification model of convolutional neural networks and effectively combines the advantages of the two: the identification model extracts image features, and the verification model checks the similarity of the input image pair. The complementarity of the two models lets the whole network learn more discriminative feature descriptions and effectively avoids over-fitting.

Second, the invention uses a pedestrian alignment network that maps the key pedestrian features on the high-dimensional feature maps onto the low-dimensional feature maps, so that the whole neural network focuses on learning pedestrian features from the very beginning. It also effectively reduces the interference caused by excess background and by partially missing pedestrians in the image, which improves the accuracy of neural network recognition.

Third, when performing pedestrian re-identification, the invention uses both groups of features, those of the base-branch deep residual network and those of the alignment-branch deep residual network. Compared with using only the features extracted by the base-branch deep residual network, or only the features extracted by the pedestrian-alignment-branch deep residual network, this further improves the precision of pedestrian re-identification.

Brief description of the drawings

Fig. 1 is the network structure diagram of the invention;

Fig. 2 is the step flow chart of the invention.

Detailed description of the embodiments

The technical solution of the invention is described in further detail below with reference to the accompanying drawings.

S1, construct the base-branch Siamese residual network: delete the fully connected layer of the residual network and add a convolutional layer and a classification layer to obtain the base-branch deep residual network prototype; copy the network prototype and add a binary classification network to obtain the base-branch Siamese residual network.

Construct the pedestrian-alignment-branch Siamese residual network: on the trained base-branch deep residual network, delete the last residual block, add a grid network, and stack on top a base-branch deep residual network with its first residual block removed, to obtain the pedestrian-alignment-branch deep residual network prototype; copy the network prototype and add a binary classification network to obtain the pedestrian-alignment-branch Siamese residual network.

Train the parameters of the base-branch Siamese network and the pedestrian-alignment-branch Siamese residual network on the constructed training dataset, then take out the trained base-branch prototype and pedestrian-alignment-branch prototype as the classification model for pedestrian re-identification.

In the training stage, the base-branch Siamese residual network is trained first. Two weight-sharing base-branch deep residual network prototypes form the base-branch Siamese residual network and extract features from the two images of each input pair. The binary classification network computes the Euclidean distance between the resulting features and judges whether they belong to the same pedestrian class; the result is compared with the image labels and used to adjust the parameters of the whole base-branch Siamese residual network;

Next, the deep Siamese residual network formed by two weight-sharing alignment-branch deep residual networks is trained. The alignment-branch deep residual network adds a grid network to the trained base-branch deep residual network. The grid network generates the six parameters of the affine transformation used for pedestrian alignment, and the input pedestrian image is passed through the affine transformation to obtain the aligned pedestrian image. Features are then extracted from the aligned image, the Euclidean distance between the paired features is computed to judge whether they belong to the same class, and the result is used to adjust the parameters of the whole alignment-branch deep Siamese residual network;

In the test stage, one network from the base-branch Siamese residual network and one from the alignment-branch deep Siamese residual network are each used for feature extraction, and the two kinds of features obtained are fused to judge the pedestrian class.

With reference to Fig. 1, the specific steps for implementing the invention are as follows:

Step S1, construct the base-branch Siamese residual network:

S1.1 construct the first base-branch deep residual network: using a transfer learning strategy, import residual network parameters pre-trained on the ImageNet dataset as the initial parameters of the first base-branch deep residual network;

S1.2 obtain the second base-branch deep residual network by copying the model structure and parameters of the first base-branch deep residual network;

S1.3 compute the square of the difference between the feature vectors output by the two base-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two base-branch deep residual networks are images of the same class;

Step S2, construct the pedestrian-alignment-branch Siamese residual network:

S2.1 construct the first pedestrian-alignment-branch deep residual network: take the trained base-branch deep residual network and delete the last residual block;

S2.2 regress the output of the fourth residual block, via a residual block, into six parameters, use them as the parameters of the affine transformation, and apply the affine transformation to the image output by the second residual block to obtain the aligned pedestrian image;

S2.3 remove the first residual block from the trained base-branch deep residual network and train the remaining network on the aligned pedestrian image;

S2.4 copy the model structure and parameters of the first pedestrian-alignment-branch deep residual network to obtain the second pedestrian-alignment-branch deep residual network;

S2.5 compute the square of the difference between the feature vectors output by the two pedestrian-alignment-branch deep residual networks, and perform binary classification with a convolutional layer and a classifier to judge whether the inputs of the two branch networks are images of the same class;

Step S3, construct the training dataset and train the base-branch Siamese residual network and the pedestrian-alignment-branch deep Siamese residual network in separate stages:

S3.1 using the constructed training dataset, train the parameters of the base-branch Siamese residual network and the pedestrian-alignment-branch deep Siamese residual network separately with batch gradient descent;

S3.2 after the parameters have been trained, take out any one base-branch deep residual network and any one pedestrian-alignment-branch deep residual network as the pedestrian image classification model.

Step S4, construct the test samples and the query samples.

Step S5, test sample classification: feed the test samples into the base-branch deep residual network and the pedestrian-alignment-branch deep residual network respectively for feature extraction.

Step S6, concatenate the features obtained from the two branch networks with a weight of 0.5.

Step S7, compute the Euclidean distance between the images of the test samples and the image of the query sample to obtain a sorted list.

Step S8, perform pedestrian re-identification on the basis of re-ranking.
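Steps S6 and S7 can be sketched with toy vectors. The sketch assumes that "a weight of 0.5" means each branch feature is scaled by 0.5 before the two are concatenated; the text does not spell this out, so that reading is an assumption, as are the function names and the example features. The re-ranking of step S8 is not shown.

```python
import math


def fuse(base_feat, align_feat, weight=0.5):
    """Concatenate the two branch features, each scaled by `weight`."""
    return [weight * v for v in base_feat] + [weight * v for v in align_feat]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(query, gallery):
    """Gallery indices sorted by increasing distance to the query."""
    return sorted(range(len(gallery)),
                  key=lambda i: euclidean(query, gallery[i]))


# Hypothetical 2-dimensional features per branch (real ones are 2048-d).
query = fuse([1.0, 0.0], [0.0, 1.0])
gallery = [
    fuse([0.0, 1.0], [1.0, 0.0]),   # far from the query
    fuse([1.0, 0.1], [0.0, 0.9]),   # very close to the query
    fuse([0.5, 0.5], [0.5, 0.5]),   # in between
]
ranking = rank_gallery(query, gallery)
```

The first index in `ranking` is the gallery image judged most likely to show the query pedestrian.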

The deep residual network architecture described in step S1 can be divided into five residual blocks, whose structures are as follows:

The input is the original pedestrian image with 3 feature maps, i.e. the three color channels of the image;

Residual block one consists of four layers: a convolutional layer, a batch normalization layer, a rectified linear unit (ReLU) layer, and a max pooling layer; it outputs 64 feature maps;

Residual blocks two to five each consist of multiple identity blocks; each identity block is built from three convolutional layers, batch normalization layers, ReLU layers, and one summation layer;

Residual block two contains three identity blocks and outputs 256 feature maps;

Residual block three contains four identity blocks and outputs 512 feature maps;

Residual block four contains six identity blocks and outputs 1024 feature maps;

Residual block five contains three identity blocks and outputs 2048 feature maps;

Average pooling is applied to the 2048 output feature maps to obtain a 2048-dimensional feature vector;

A fully connected layer and a classifier layer map the feature vector to a vector whose dimension equals the number of pedestrian classes.
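The block layout listed above can be written down as a small table and checked. The sketch below counts weighted layers under the assumption that residual block one contributes one convolution and the final classifier one layer; the resulting total of 50 matches the block counts of the widely used ResNet-50, which is an observation made here and not a statement in the text. The helper names are hypothetical.

```python
# (name, number of identity blocks, output feature maps), as listed in the text
STAGES = [
    ("residual block two", 3, 256),
    ("residual block three", 4, 512),
    ("residual block four", 6, 1024),
    ("residual block five", 3, 2048),
]
CONVS_PER_IDENTITY_BLOCK = 3  # "three convolutional layers" per identity block


def count_weight_layers(stages):
    """One stem convolution + all identity-block convolutions + one classifier."""
    stem_conv = 1
    classifier = 1
    body = sum(n * CONVS_PER_IDENTITY_BLOCK for _, n, _ in stages)
    return stem_conv + body + classifier


def global_average_pool(feature_maps):
    """Reduce each H x W feature map to its mean, giving one value per map."""
    return [sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
            for fm in feature_maps]


total_layers = count_weight_layers(STAGES)

# Two toy 2x2 feature maps instead of 2048 real ones.
pooled = global_average_pool([[[1, 2], [3, 4]], [[0, 0], [0, 8]]])
```

Average pooling turns the 2048 feature maps of residual block five into the 2048-dimensional vector used by the loss functions.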

Further, step S1.1 is specifically as follows:

S1.1.1 output the 2048-dimensional vector f₁ obtained after average pooling;

S1.1.2 set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps f₁ to an n-dimensional vector, and the final class prediction is output by a fully connected classifier;

Further, step S1.3 is specifically as follows:

S1.3.1. set up a square layer that takes the squared difference of the feature vectors f₁ and f₂ output by the two deep residual network models, giving f_s = (f₁ − f₂)²;

S1.3.2. set up a convolutional layer with 2 feature maps that maps f_s to a 2-dimensional vector;

S1.3.3. feed the output of S1.3.2 to a fully connected binary classifier to produce the final prediction, i.e. whether the input image pair comes from the same class;

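S1.3.4. for an input image pair q (same class / different classes), define the second loss function:

$$\hat{q} = \mathrm{softmax}(\theta_S \circ f_s), \qquad \mathrm{Verif}(f_1, f_2, s, \theta_S) = \sum_{i=1}^{2} -q_i \log(\hat{q}_i)$$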

The deep residual network architecture described in step S2 can be divided into eight residual blocks and one grid residual block, whose structures are as follows:

Residual blocks two to four each consist of multiple identity blocks; each identity block is built from three convolutional layers, batch normalization layers, ReLU layers, and one summation layer;

The grid network block contains three identity blocks and an average pooling layer, but its output is a set of six transformation parameters, which are used to generate the image grid for pedestrian alignment;

Grid alignment is applied to the image output by residual block two; the resulting aligned pedestrian image has 256 feature maps;

Residual block five contains four identity blocks and outputs 512 feature maps;

Residual block six contains six identity blocks and outputs 1024 feature maps;

Residual block seven contains three identity blocks and outputs 2048 feature maps;
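How six transformation parameters can generate an image grid for alignment may be easier to see in a small sketch. The following assumes the common spatial-transformer convention: coordinates are normalized to [-1, 1] and the six parameters form a 2×3 matrix that maps each output position to a source position. It uses nearest-neighbour sampling for brevity, whereas a trainable network would normally use a differentiable (for example bilinear) sampler. All names and conventions here are assumptions and are not taken from the text.

```python
def affine_grid(theta, height, width):
    """Source coordinates, in [-1, 1] space, for every output pixel.

    theta: six parameters (a, b, tx, c, d, ty) read as the matrix
           [[a, b, tx], [c, d, ty]].
    """
    a, b, tx, c, d, ty = theta
    grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1) if height > 1 else 0.0
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1) if width > 1 else 0.0
            row.append((a * x + b * y + tx, c * x + d * y + ty))
        grid.append(row)
    return grid


def sample_nearest(image, grid):
    """Nearest-neighbour sampling; positions outside the image become 0."""
    h, w = len(image), len(image[0])
    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            j = round((xs + 1.0) * (w - 1) / 2.0)
            i = round((ys + 1.0) * (h - 1) / 2.0)
            out_row.append(image[i][j] if 0 <= i < h and 0 <= j < w else 0)
        out.append(out_row)
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

identity = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)
shift_x = (1.0, 0.0, 1.0, 0.0, 1.0, 0.0)   # sample one pixel to the right

same = sample_nearest(image, affine_grid(identity, 3, 3))
shifted = sample_nearest(image, affine_grid(shift_x, 3, 3))
```

With the identity parameters the image is returned unchanged; with the shift, each output pixel reads from one column further right, so the content moves left and the last column is filled with zeros.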

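Further, steps S2.1-S2.3 are specifically as follows:

S2.1.1 output the 2048-dimensional vector f_a obtained after average pooling;

S2.1.3 for the input and output of the pedestrian-alignment-branch Siamese residual network, define the third loss function:

$$\hat{p} = \mathrm{softmax}(\theta_I \circ f_a), \qquad \mathrm{Identif}(f_a, t, \theta_I) = \sum_{i} -p_i \log(\hat{p}_i)$$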

Further, step S2.5 is specifically as follows:

S2.3.1. set up a square layer that takes the squared difference of the feature vectors f₁ and f₂ output by the two deep residual network models, giving f_s = (f₁ − f₂)²;

S2.3.2. set up a convolutional layer with 2 feature maps that maps f_s to a 2-dimensional vector;

S2.3.3. feed the output of S2.3.2 to a fully connected binary classifier to produce the final prediction, i.e. whether the input image pair comes from the same class;

S2.3.4. for an input image pair q (same class / different classes), define the fourth loss function:

$$\hat{q} = \mathrm{softmax}(\theta_s \circ f_s), \qquad \mathrm{Verif}(f_1, f_2, s, \theta_s) = \sum_{i=1}^{2} -q_i \log(\hat{q}_i)$$

where softmax denotes the classifier function, ∘ denotes the convolution operation, θ_s denotes the parameters of the convolutional layer used, s is one of the two classes "same" or "different", and f_s is the feature vector produced by the square layer and passed to the convolution. q̂ is the probability, output by the classifier function, that the pair described by f_s is of class s, i.e. the same pedestrian or not. f₁ and f₂ are the features extracted by the two pedestrian-alignment-branch deep residual networks that form the pedestrian-alignment-branch Siamese residual network. If f₁ and f₂ belong to the same person, q₁ = 1 and q₂ = 0; otherwise q₁ = 0 and q₂ = 1. After the convolutional layer and the softmax function, f_s is mapped to a two-dimensional vector q̂ = (q̂₁, q̂₂), which represents the probability that the two input images belong to the same pedestrian class, where q̂₁ + q̂₂ = 1.

Further, step S3.1 is specifically as follows:

S3.1.1. construct the training set: shuffle the order of the images in the training dataset and generate training data pairs;

S3.1.2. optimize the 4 loss functions of S1.1.3, S1.3.4, S2.1.3, and S2.3.4 using batch gradient descent;

S3.1.3. set the weights of the 4 loss functions, λ₁, λ₂, λ₃, and λ₄ respectively;

S3.1.4. determine the optimal weight values through a series of parameter-tuning experiments;
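Steps S3.1.1 and S3.1.3 can be sketched as follows. The text gives neither the pairing procedure nor the way the weights λ₁ to λ₄ enter the objective, so this sketch assumes random pairing after a shuffle and a plain weighted sum of the loss values. The function names, the seed, and the example numbers are hypothetical.

```python
import random


def make_pairs(labels, num_pairs, seed=0):
    """Shuffle image indices and draw (i, j, same_identity) training pairs."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    pairs = []
    for k in range(num_pairs):
        i = indices[k % len(indices)]
        # Pair image i with a different image chosen at random.
        j = rng.choice([x for x in indices if x != i])
        pairs.append((i, j, labels[i] == labels[j]))
    return pairs


def total_loss(losses, weights):
    """Weighted sum of loss values: sum over k of lambda_k * L_k."""
    if len(losses) != len(weights):
        raise ValueError("one weight is needed per loss value")
    return sum(w * l for w, l in zip(weights, losses))


# Hypothetical identity labels for four training images.
labels = ["a", "a", "b", "c"]
pairs = make_pairs(labels, num_pairs=6)

# Hypothetical values of the four losses and their weights.
combined = total_loss([1.0, 2.0, 3.0, 4.0], [1.0, 0.5, 1.0, 0.5])
```

Each pair carries the flag that serves as the target of the verification losses, while the identity labels themselves serve as targets of the identification losses.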

Further, step S3.2 is specifically as follows:

S3.2.1. train until the 4 loss functions are minimized;

S3.2.2. take out the trained deep residual networks as the classification model for the next step;

Step S4 constructs the test samples as follows:

S4-1. use the remaining images in the pedestrian re-identification dataset as test samples;

S4-2. crop and resize every image in the test samples to 224 × 224;

Step S4 is specifically as follows:

S4-1. the classification model is a single deep residual network model, and the corresponding input is a single image;

S4-2. the evaluation criteria are the overall accuracy and the recognition accuracy, i.e. the percentage of test samples that are classified correctly and the percentage of cases in which the first match returned is the correct pedestrian of the same class, respectively.
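The second criterion, the share of queries whose first returned match is correct, can be computed as in the sketch below. It assumes each query comes with a list of gallery identity labels already sorted by increasing distance; the function name and the toy labels are hypothetical.

```python
def rank1_accuracy(ranked_labels, query_labels):
    """Fraction of queries whose top-ranked gallery image has the same identity.

    ranked_labels: for each query, gallery identity labels sorted by
                   increasing distance to that query.
    query_labels:  the true identity of each query.
    """
    hits = sum(1 for ranked, q in zip(ranked_labels, query_labels)
               if ranked and ranked[0] == q)
    return hits / len(query_labels)


# Three hypothetical queries with identities 3, 1 and 5.
ranked = [[3, 1], [2, 2], [5, 4]]
truth = [3, 1, 5]
score = rank1_accuracy(ranked, truth)
```

Here two of the three queries have the correct identity in first place, so the score is 2/3.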

The effect of the invention is further described below:

1. Experimental conditions:

The experiments of the invention were carried out in a hardware environment with an NVIDIA GTX 1080Ti GPU and an i7-8700K CPU, and in a software environment of MATLAB 2017.

The experiments of the invention used three pedestrian re-identification datasets: Market-1501, DukeMTMC, and CUHK03.

The Market-1501 dataset was collected in front of a supermarket at Tsinghua University. Six cameras were used in total, five high-resolution and one low-resolution, with overlap among the different cameras. Overall, the dataset contains 32,668 annotated bounding boxes covering 1,501 identities. In this open system, the images of each identity are captured by at most six cameras, and every annotated identity appears in at least two cameras so that cross-camera search can be performed.

Duke provides a novel large-scale high-definition video dataset recorded by 8 synchronized cameras, containing more than 7,000 single-camera trajectories and more than 2,000 unique identities, together with a tracking system that operates within and across cameras and a new performance evaluation method.

CUHK03 contains 13,164 images of 1,360 pedestrians, and the whole dataset was shot by six surveillance cameras. Each identity is captured by two disjoint cameras, so its images come from 2 different cameras. The dataset was collected at the Chinese University of Hong Kong. It provides two versions, one with machine-detected bounding boxes and one with manually labeled bounding boxes; the detected version contains some detection errors and is therefore closer to real conditions. On average, each person has 9.6 training images.

2, interpretation of result

Twin network and (2) that pedestrian is aligned network are not used with (1) using the method for the present invention in emulation experiment of the invention The pedestrian that twinned structure is not used is aligned network and classifies to three data sets, and classifying quality is compared and analyzed.

Table 1 is that experiment of the invention carries out overall accuracy using three kinds of convolutional neural networks models and the method for the present invention The statistical form of comparison." Data Set " in table 1 indicates that the pedestrian used identifies that data set type, " result " expression are known again again Not as a result, the accuracy of " Accuracy " presentation class, Rank-1 indicate that identification for the first time is the probability of correct pedestrian, " Verif+identif " indicates that the twin network that pedestrian is aligned network is not used, and twin knot is not used in " Base+Align " expression The pedestrian of structure is aligned network, and " (Base+Verif)+(Align+Verif) " indicates the method that the present invention uses.

Table 1. Comparison of pedestrian re-identification results

As can be seen from Table 1, the method of the present invention outperforms the other two methods on all three datasets.
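As a non-limiting illustration of the Rank-1 measure described above, the following minimal Python sketch counts a query as correct when its nearest gallery image has the same identity. The function name `rank1_accuracy`, the distance values, and the identity labels are all hypothetical and are not taken from the experiments. Benchmark protocols typically also discard gallery images of the same identity taken by the query's own camera; that filtering is omitted here for brevity.

```python
def rank1_accuracy(dist, query_ids, gallery_ids):
    """Fraction of queries whose nearest gallery image has the same identity.

    dist[q][g] is the distance between query q and gallery image g.
    """
    hits = 0
    for q, row in enumerate(dist):
        # Index of the gallery image with the smallest distance to query q.
        best = min(range(len(row)), key=row.__getitem__)
        if gallery_ids[best] == query_ids[q]:
            hits += 1
    return hits / len(dist)


# Toy data (hypothetical): 3 queries against 4 gallery images.
dist = [
    [0.2, 0.9, 0.8, 0.7],
    [0.6, 0.1, 0.9, 0.5],
    [0.4, 0.8, 0.7, 0.3],
]
query_ids = [1, 2, 3]
gallery_ids = [1, 2, 3, 4]
print(rank1_accuracy(dist, query_ids, gallery_ids))
```

In this toy example the first two queries retrieve the correct identity first and the third does not, so two of the three queries count as Rank-1 hits.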

Claims

1. A pedestrian re-identification method based on a Siamese pedestrian alignment residual network, characterized by comprising the following steps:

S1. Construct the base-branch Siamese residual network;

S1.1. Construct the first base-branch deep residual network, and use a transfer learning strategy to import residual network parameters pre-trained on the ImageNet dataset as the base parameters of the first base-branch deep residual network;

S1.3. Compute the square of the difference between the feature vectors output by the two base-branch deep residual networks, and perform binary classification using a convolutional layer and a classifier to determine whether the inputs of the two base-branch deep residual networks are images of the same class;

S2.1. Construct the first pedestrian alignment branch deep residual network: take either the trained first base-branch deep residual network or the trained second base-branch deep residual network, and delete the residual block that regresses the high-dimensional feature map into a feature vector;

the output high-dimensional feature map is regressed, via a residual block, to the parameters used for the affine transformation, and the affine transformation is applied to the output low-dimensional feature map to obtain the aligned pedestrian image;

take either the trained first base-branch deep residual network or the trained second base-branch deep residual network, delete the residual block that produces the low-dimensional feature map subjected to the affine transformation together with the residual blocks before it, and use the remaining network for training on the aligned pedestrian images thus obtained;

S2.2. Copy the model structure and parameters of the first pedestrian alignment branch deep residual network to obtain the second pedestrian alignment branch deep residual network;

S2.3. Compute the square of the difference between the feature vectors output by the two pedestrian alignment branch deep residual networks, and perform binary classification using a convolutional layer and a classifier to determine whether the inputs of the two branch networks are images of the same class;

S3. Train the parameters of the constructed base-branch Siamese network and the constructed pedestrian alignment branch Siamese residual network using the constructed training dataset, and take the base-branch prototype from the trained base-branch Siamese residual network and the pedestrian alignment branch prototype from the trained pedestrian alignment branch Siamese residual network as the classification model for pedestrian re-identification;

S3.1. Using the constructed training dataset, train the parameters of the constructed base-branch Siamese residual network and the constructed pedestrian alignment branch Siamese residual network, respectively, by batch gradient descent;

S3.2. After parameter training, take out any one branch of the base-branch Siamese residual network and any one branch of the pedestrian alignment branch Siamese residual network, respectively, as the pedestrian image classification model;

S4. Construct the test samples and the query samples;

S5. Test sample classification: feed the test samples into the trained base-branch deep residual network and the trained pedestrian alignment branch deep residual network, respectively, for feature extraction;

S8. Perform pedestrian re-identification on the basis of the re-ranking.
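As a non-limiting illustration of the affine transformation used for alignment in step S2.1, the following minimal Python sketch warps a small two-dimensional map with a 2x3 affine parameter matrix. The claim does not specify how the transformed map is sampled; this sketch assumes spatial-transformer-style bilinear sampling over coordinates normalized to [-1, 1], with zeros outside the source. The function name `affine_warp`, the parameter name `theta`, and the example values are hypothetical.

```python
import math


def affine_warp(img, theta):
    """Warp a 2-D map `img` (list of rows) with a 2x3 affine matrix `theta`.

    Each output location (x_t, y_t), in coordinates normalized to [-1, 1],
    is filled from the source location (x_s, y_s) = theta * (x_t, y_t, 1)
    using bilinear interpolation. Locations outside the source read as 0.
    """
    h, w = len(img), len(img[0])

    def pixel(r, c):
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Normalized target coordinates.
            x_t = 2.0 * c / (w - 1) - 1.0 if w > 1 else 0.0
            y_t = 2.0 * r / (h - 1) - 1.0 if h > 1 else 0.0
            # Affine map from target to source coordinates.
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            # Back to (fractional) source pixel coordinates.
            cs = (x_s + 1.0) * (w - 1) / 2.0
            rs = (y_s + 1.0) * (h - 1) / 2.0
            c0, r0 = math.floor(cs), math.floor(rs)
            dc, dr = cs - c0, rs - r0
            out[r][c] = (
                pixel(r0, c0) * (1 - dr) * (1 - dc)
                + pixel(r0, c0 + 1) * (1 - dr) * dc
                + pixel(r0 + 1, c0) * dr * (1 - dc)
                + pixel(r0 + 1, c0 + 1) * dr * dc
            )
    return out


# Hypothetical example: shift the content one pixel to the left.
image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
shift_left = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
aligned = affine_warp(image, shift_left)
```

In the claimed method the six affine parameters would be regressed by a residual block from the high-dimensional feature map; here they are simply given by hand.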

2. The pedestrian re-identification method based on a Siamese pedestrian alignment residual network according to claim 1, characterized in that step S1.1 is specifically as follows:

S1.1.2. Set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps f₁ to an n-dimensional vector, and the final class prediction is output by the fully connected classifier;

where softmax denotes the classifier function; θ_I denotes the parameters of the convolutional layer used in the convolution operation; t is the pedestrian class; f is the feature vector obtained after feature extraction by the base-branch deep residual network; the classifier function outputs the probability that the feature vector f belongs to a given pedestrian class t; for any image i and a given pedestrian class t, p_i indicates whether image i belongs to pedestrian class t (p_i = 1 if it does, and p_i = 0 otherwise); and the predicted value for any image i is the probability obtained after processing by the softmax function.
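As a non-limiting illustration of the identification branch defined above, the following minimal Python sketch applies softmax to an n-dimensional score vector and evaluates a cross-entropy loss against the 0/1 indicator p. The loss formula itself does not survive in this text, so the standard softmax cross-entropy form is an assumption; `scores` stands in for the n-dimensional output of the convolutional layer, and all names and values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def identification_loss(scores, target):
    """Cross-entropy between the 0/1 indicator p and the softmax output.

    Assumed form: -sum_i p_i * log(p_hat_i), where p_i = 1 only for the
    true pedestrian class `target` and 0 for every other class.
    """
    p_hat = softmax(scores)
    p = [1.0 if i == target else 0.0 for i in range(len(scores))]
    return -sum(pi * math.log(ph) for pi, ph in zip(p, p_hat) if pi > 0.0)


# Hypothetical scores for n = 3 pedestrian classes; the true class is 0.
loss = identification_loss([2.0, 0.5, -1.0], target=0)
```

Because p is zero for every class except the true one, the sum reduces to the negative log-probability that the classifier assigns to the true pedestrian class.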

3. The pedestrian re-identification method based on a Siamese pedestrian alignment residual network according to claim 1, characterized in that step S1.3 is specifically as follows:

S1.3.1. Set a square layer that takes the squared difference of the feature vectors f₁ and f₂ output by the two base-branch deep residual networks, obtaining f_s = (f₁ - f₂)²;

S1.3.3. A fully connected binary classifier produces the final prediction from the output of S1.3.2, i.e., whether the input image pair comes from the same class;

where softmax denotes the classifier function; θ_S denotes the parameters of the convolutional layer used in the convolution operation; s denotes the two classes, same or different; f_s is the feature vector obtained after the square layer; the classifier function outputs the probability of whether the feature vector f_s corresponds to the same pedestrian class; f₁ and f₂ are the features extracted by the two base-branch deep residual networks that form the base-branch Siamese residual network; if f₁ and f₂ belong to the same person, q₁ = 1 and q₂ = 0, otherwise q₁ = 0 and q₂ = 1; and after processing by the convolutional layer and the softmax function, f_s is mapped to a two-dimensional vector that represents the probability of whether the two input images belong to the same pedestrian class.
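As a non-limiting illustration of the verification branch defined above, the following minimal Python sketch computes f_s = (f₁ - f₂)² element-wise, maps it to two scores, applies softmax, and evaluates a cross-entropy loss against (q₁, q₂). The loss formula does not survive in this text, so the cross-entropy form is an assumption. The convolutional layer is replaced by a plain 2 x d weight matrix, which is equivalent only when the feature is a single spatial position. All names and values are hypothetical.

```python
import math


def square_layer(f1, f2):
    """Element-wise squared difference: f_s = (f1 - f2)^2."""
    return [(a - b) ** 2 for a, b in zip(f1, f2)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def verification_loss(f1, f2, weights, same):
    """Assumed cross-entropy -sum_i q_i * log(q_hat_i) over two classes.

    `weights` is a 2 x d matrix standing in for the convolutional layer.
    (q1, q2) = (1, 0) for the same person and (0, 1) otherwise.
    """
    f_s = square_layer(f1, f2)
    scores = [sum(w * v for w, v in zip(row, f_s)) for row in weights]
    q_hat = softmax(scores)
    q = [1.0, 0.0] if same else [0.0, 1.0]
    return -sum(qi * math.log(qh) for qi, qh in zip(q, q_hat) if qi > 0.0)


# Hypothetical features and weights: a large squared difference pushes
# the prediction toward the "different" class (index 1).
f1, f2 = [0.9, 0.1], [0.1, 0.8]
weights = [[-1.0, -1.0], [1.0, 1.0]]
loss_same = verification_loss(f1, f2, weights, same=True)
loss_diff = verification_loss(f1, f2, weights, same=False)
```

With these hand-picked weights, labeling the dissimilar pair as "same" yields the larger loss, which is the behavior the binary classifier is trained to produce.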

4. The pedestrian re-identification method based on a Siamese pedestrian alignment residual network according to claim 1, characterized in that step S2.1 is specifically as follows:

S2.1.2. Set the number of feature maps of the convolutional layer to the number of pedestrian classes n; the convolutional layer maps f₁ to an n-dimensional vector, and the final class prediction is output by the fully connected classifier;

where softmax denotes the classifier function; θ_I denotes the parameters of the convolutional layer used in the convolution operation; t is the pedestrian class; f_a is the feature vector obtained after feature extraction by the base-branch deep residual network; the classifier function outputs the probability that the feature vector belongs to a given pedestrian class t; for any image i and a given pedestrian class t, an indicator records whether image i belongs to pedestrian class t (1 if it does, and 0 otherwise); and the predicted value for any image i is the probability obtained after processing by the softmax function.

Step S2.3 is specifically as follows:

S2.3.1. Set a square layer that takes the squared difference of the feature vectors output by the two pedestrian alignment branch deep residual networks;

S2.3.3. A fully connected binary classifier produces the final prediction from the output of S1.3.2, i.e., whether the input image pair comes from the same class;

where softmax denotes the classifier function; θ_S denotes the parameters of the convolutional layer used in the convolution operation; s denotes the two classes, same or different; the input to the convolution is the feature vector obtained after the square layer; the classifier function outputs the probability of whether that feature vector corresponds to the same pedestrian class; the two feature vectors are the features extracted by the two base-branch deep residual networks that form the base-branch Siamese residual network; the label marks the pair as the same class if the two feature vectors belong to the same person, and as different classes otherwise; and after processing by the convolutional layer and the softmax function, f_s is mapped to a two-dimensional vector that represents the probability of whether the two input images belong to the same pedestrian class.