CN110163110A - A pedestrian re-identification method based on transfer learning and deep feature fusion - Google Patents
A pedestrian re-identification method based on transfer learning and deep feature fusion
- Publication number
- CN110163110A CN110163110A CN201910329733.2A CN201910329733A CN110163110A CN 110163110 A CN110163110 A CN 110163110A CN 201910329733 A CN201910329733 A CN 201910329733A CN 110163110 A CN110163110 A CN 110163110A
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- image
- training
- depth
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2163 — Partitioning the feature space
- G06F18/253 — Fusion techniques of extracted features
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06V40/103 — Static body considered as a whole, e.g. static pedestrian or occupant recognition
- Y02T10/40 — Engine management systems
Abstract
The present invention provides a pedestrian re-identification method based on transfer learning and deep feature fusion, comprising the following steps: pre-training; human-pose correction and segmentation; feature-vector extraction; deep feature fusion; model training; model testing; recognition. A deep convolutional neural network extracts the global and local features of a pedestrian, and the two kinds of features are deeply fused into a final pedestrian feature representation. Transfer learning is then applied during training of the deep convolutional neural network to obtain a better-performing re-identification network model, so that the features the model extracts are more discriminative, thereby improving re-identification accuracy.
Description
Technical field
The present invention relates to a pedestrian re-identification method based on transfer learning and deep feature fusion, and belongs to the technical fields of deep learning and transfer learning.
Background art
Pedestrian re-identification is a pedestrian-matching task carried out over a multi-camera network with non-overlapping fields of view: the goal is to find the same target pedestrian in footage shot by cameras at different locations and at different times.
With the development of artificial-intelligence technology, pedestrian re-identification has attracted wide research attention in application scenarios such as public security and image retrieval. Compared with traditional biometric techniques such as face recognition and gesture recognition, however, its recognition accuracy is low, because the environment of surveillance video is largely uncontrollable: low image resolution, viewpoint changes, pose variation, illumination changes, occlusion, and similar factors all degrade accuracy. Pedestrian re-identification therefore still faces considerable challenges in practical application scenarios.
To improve re-identification accuracy and system robustness, numerous researchers have, through long-term study, proposed a variety of re-identification methods. Megvii (Face++) has made major progress in the field: in its paper AlignedReID, the team proposed a new method combining dynamic alignment (Dynamic Alignment), mutual learning (Mutual Learning) and re-ranking (Re-Ranking), and found experimentally that at test time the recognition accuracy obtained from the pedestrian's global features alone is nearly the same as that obtained by fusing global and local features. Yi et al. proposed a deep metric-learning method based on a Siamese convolutional neural network and achieved good results. Liu et al. proposed a deep non-linear metric-learning method based on neighborhood component analysis and a deep belief network: the neighborhood component analysis transforms the data so as to maximize the number of correctly distinguishable samples of each class in the training data, and, to extend this transformation, a deep belief network is used to learn a non-linear feature transformation. Research shows, however, that most of the above methods take the pedestrian's global image as input during training and extract only a global feature vector; although some methods also extract local features, they do not deeply fuse the local features to obtain a discriminative image representation. Moreover, simply fine-tuning a pre-trained model on a re-identification dataset ignores the difference in data distribution between the source-domain and target-domain datasets, so the network transfers poorly.
Summary of the invention
To solve the above technical problems, the present invention provides a pedestrian re-identification method based on transfer learning and deep feature fusion. The method addresses the facts that global and local pedestrian features are usually not deeply fused, and that the difference in data distribution is not fully taken into account when fine-tuning the network.
The present invention is achieved through the following technical solutions.
A pedestrian re-identification method based on transfer learning and deep feature fusion provided by the invention comprises the following steps:
1. Pre-training: a model pre-trained on ImageNet is further pre-trained on a pedestrian re-identification dataset, yielding a re-identification pre-training network model;
2. Human-pose correction and segmentation: hard sample pairs are chosen from the re-identification dataset and passed through a human-skeleton keypoint-detection network that detects 14 keypoints; human-pose correction and pedestrian local-ROI segmentation are then performed, yielding data-augmented hard sample pairs and corrected global and local images;
3. Feature-vector extraction: the corrected global and local images of the data-augmented hard sample pairs are input to the re-identification pre-training network model, obtaining pedestrian local and global feature vectors;
4. Deep feature fusion: the pedestrian local and global feature vectors are deeply fused into the final pedestrian feature vector;
5. Model training: using transfer learning together with the final pedestrian feature vector of step 4, the re-identification pre-training network model is fine-tuned, and adaptation layers are added to it, yielding the pedestrian re-identification network model;
6. Model testing: a query pedestrian image and a target pedestrian image are input, and the re-identification network model alone extracts two discriminative global pedestrian feature vectors;
7. Recognition: based on the global feature vectors of step 6, the similarity between the query pedestrian and every image in the target pedestrian dataset is computed; the image with the highest similarity is regarded as the same pedestrian.
In the training stage, the re-identification network model takes triplets of pedestrian images as input.
Step 1 is divided into the following steps:
(1.1) obtain a deep convolutional network model pre-trained on the ImageNet dataset, and train it further on the pedestrian re-identification dataset;
(1.2) when pre-training the deep convolutional neural network model on the re-identification dataset, fine-tune it using only the annotated sample labels.
Step (1.2) is divided into the following steps:
(1.2.1) remove the top fully connected layer of the ResNet50 network model pre-trained on ImageNet, and add two fully connected layers and a softmax layer after the max-pooling layer;
(1.2.2) fine-tune the constructed deep convolutional neural network with the label information annotated on the pedestrian images, keeping the first three layers of the network fixed during fine-tuning;
(1.2.3) obtain the prediction probability of the pedestrian global image from the deep convolutional neural network;
(1.2.4) define the loss function of the deep convolutional neural network from the prediction probability.
The corrected global and local images obtained in step 2 are the hard positive/negative triplet pedestrian images and, after human-pose correction, the pedestrian global image and local ROI images.
Step 2 is divided into the following steps:
(2.1) for each training batch, select P pedestrian IDs at random, and for each pedestrian select K different images at random, so that each batch contains P × K pedestrian images;
(2.2) treating each training image in the batch as an anchor sample H_n, select the hardest positive sample and the hardest negative sample, which form a triplet with H_n; the hard sample pairs are chosen so that the anchor-to-hardest-positive distance is maximal and the anchor-to-hardest-negative distance is minimal;
(2.3) input the triplet pedestrian images to the human-skeleton keypoint-detection network, which detects 14 skeleton keypoints covering the head, the four limbs, the upper body and the lower body, and use the 14 keypoints as coordinates to correct the human pose;
(2.4) according to the 14 skeleton keypoints, divide the pedestrian global image into three local ROI images (head, upper body, lower body), obtaining one corrected pedestrian global image and three pedestrian local images.
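The ROI split of step (2.4) can be sketched as below. The mapping from particular keypoint indices to head and hip joints, and the purely horizontal cut lines, are illustrative assumptions; the patent only specifies that the 14 detected joints guide the three-way head / upper-body / lower-body split.

```python
import numpy as np

def split_rois(image, keypoints):
    """Split a corrected pedestrian image into head / upper-body / lower-body ROIs.

    `image` is an H x W array (extra channel axes would pass through unchanged);
    `keypoints` is a (14, 2) array of (x, y) skeleton joints. Which indices are
    head or hip joints is an assumption for illustration.
    """
    ys = keypoints[:, 1]
    head_bottom = int(ys[:3].max())   # assume joints 0-2 belong to the head
    hip_line = int(ys[8:10].mean())   # assume joints 8-9 are the hips
    head = image[:head_bottom]
    upper = image[head_bottom:hip_line]
    lower = image[hip_line:]
    return head, upper, lower

# Toy demo on a 100-row "image" with invented keypoint heights.
img = np.arange(100 * 10).reshape(100, 10)
kps = np.zeros((14, 2))
kps[:3, 1] = [10, 15, 20]   # head joints
kps[8:10, 1] = [60, 62]     # hip joints
head, upper, lower = split_rois(img, kps)
# the three ROIs partition the 100 image rows
```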
In step (2.2), using the pre-trained deep convolutional neural network model of step (1.1), the pedestrian image with the lowest matching score among images with the same pedestrian ID as the anchor H_n is selected to form the hard positive pair, and the pedestrian image with the highest matching score among images with different pedestrian IDs is selected to form the hard negative pair.
Step 3 is divided into the following steps:
(3.1) obtain the deep convolutional neural network model pre-trained in step (1.1) and the global and local images corrected in step 2, and remove the top softmax layer and one fully connected layer of the pre-trained model;
(3.2) input the corrected global and local images of the data-augmented hard sample pairs separately into the deep convolutional neural network model constructed in step (3.1), obtaining the pedestrian global feature vector and the pedestrian local feature vectors.
Step 4 is divided into the following steps:
(4.1) input the pedestrian local and global feature vectors of step 3 into a fully connected layer for deep feature fusion, and output the fused pedestrian feature vector;
(4.2) input the fused pedestrian feature vector and the local feature vectors of step (3.2) into a square layer, which measures the similarity between hard sample pairs with the squared Euclidean distance.
Step 6 is divided into the following steps:
(6.1) input the query and target pedestrian images into the human-keypoint and pose-correction network to correct the human pose;
(6.2) input the pose-corrected pedestrian images into the pedestrian re-identification network model to obtain the pedestrian global feature vectors.
The beneficial effects of the present invention are as follows: a deep convolutional neural network extracts the global and local features of a pedestrian, which are deeply fused into the final pedestrian feature representation; transfer learning is then applied during network training, yielding a better-performing re-identification network model whose extracted features are more discriminative, thereby improving re-identification accuracy.
Brief description of the drawings
Fig. 1 is the flow chart of the present invention;
Fig. 2 is the network structure for the deep fusion of global and local features in an embodiment of the present invention;
Fig. 3 is the network structure of the deep-feature-fusion and local-feature-learning model based on deep convolutional neural networks in an embodiment of the present invention.
Specific embodiments
The technical solution of the present invention is further described below, but the scope of protection claimed is not limited to the description.
As shown in Fig. 1, the pedestrian re-identification method based on transfer learning and deep feature fusion comprises the seven steps and sub-steps set out in the summary above.
In conclusion advantage of the present invention using transfer learning and the adaptive learning of deep learning, using fusion pedestrian
Image local feature and global feature obtain the network model that can pay close attention to pedestrian's local feature, improve what pedestrian identified again
Accuracy rate.
Embodiment 1
As described above, a pedestrian re-identification method based on transfer learning and deep feature fusion comprises the following steps:
1. Pre-training: a model pre-trained on ImageNet is further pre-trained on a pedestrian re-identification dataset, yielding a re-identification pre-training network model; this is divided into the following steps:
(1.1) obtain a deep convolutional network model pre-trained on the ImageNet dataset, and train it further on the pedestrian re-identification dataset;
(1.2) when pre-training the deep convolutional neural network model on the re-identification dataset, fine-tune it using only the annotated sample labels;
(1.2.1) remove the top fully connected layer of the ResNet50 network model pre-trained on ImageNet, and add two fully connected layers and a softmax layer after the max-pooling layer;
(1.2.2) fine-tune the constructed deep convolutional neural network with the label information annotated on the pedestrian images, keeping the first three layers fixed during fine-tuning, because the features extracted by the first three layers of a deep convolutional neural network are usually textures, edges and the like, which have a certain generality;
(1.2.3) obtain from the deep convolutional neural network the prediction probability y_i of the pedestrian global image, expressed as:
y_i = exp(z_i) / Σ_{j=1}^{C} exp(z_j)
where y_i denotes the probability that sample x belongs to the i-th class, z denotes the network's output logits, Σ_{j=1}^{C} exp(z_j) is the normalization term, and C is the total number of classes;
(1.2.4) define from the prediction probability the loss function L_I of the deep convolutional neural network, expressed as:
L_I = −Σ_{j=1}^{C} q_j log(y_j)
where q_j denotes the label probability and C is the total number of classes.
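Steps (1.2.3) and (1.2.4) amount to a standard softmax prediction followed by a cross-entropy loss; a minimal numerical sketch (the logits and one-hot label are invented toy values):

```python
import numpy as np

def softmax(z):
    """Prediction probability y_i = exp(z_i) / sum_j exp(z_j) over C classes."""
    z = z - z.max()          # shift for numerical stability; result unchanged
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(y, q):
    """L_I = -sum_j q_j * log(y_j), with q the label distribution."""
    return -np.sum(q * np.log(y + 1e-12))

logits = np.array([2.0, 1.0, 0.1])
y = softmax(logits)
q = np.array([1.0, 0.0, 0.0])    # one-hot label: the sample belongs to class 0
loss = cross_entropy(y, q)
# for a one-hot label, the loss reduces to -log(probability of the true class)
```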
2. Human-pose correction and segmentation: hard sample pairs are chosen from the re-identification dataset and passed through the human-skeleton keypoint-detection network that detects 14 keypoints; human-pose correction and pedestrian local-ROI segmentation are performed, yielding data-augmented hard sample pairs and corrected global and local images. The corrected global and local images are the hard positive/negative triplet pedestrian images and, after pose correction, the pedestrian global image and local ROI images. This is divided into the following steps:
(2.1) for each training batch, select P pedestrian IDs at random, and for each pedestrian select K different images at random, so that each batch contains P × K pedestrian images;
(2.2) treating each training image in the batch as an anchor sample H_n, select the hardest positive sample and the hardest negative sample, which form a triplet with H_n; the hard sample pairs are chosen so that the anchor-to-hardest-positive distance is maximal and the anchor-to-hardest-negative distance is minimal;
Specifically, using the pre-trained deep convolutional neural network model of step (1.1), the pedestrian image with the lowest matching score among images with the same pedestrian ID as the anchor H_n forms the hard positive pair, and the pedestrian image with the highest matching score among images with different pedestrian IDs forms the hard negative pair;
(2.3) input the triplet pedestrian images to the human-skeleton keypoint-detection network, which detects 14 skeleton keypoints covering the head, the four limbs, the upper body and the lower body, and use the 14 keypoints as coordinates to correct the human pose;
(2.4) according to the 14 skeleton keypoints, divide the pedestrian global image into three local ROI images (head, upper body, lower body), obtaining one corrected pedestrian global image and three pedestrian local images.
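The hard-example mining of step (2.2) can be sketched as a batch-hard selection over a pairwise distance matrix; squared Euclidean distance between feature vectors stands in here for the pretrained model's matching score (lowest score ↔ largest distance), and the 2-D features are invented toy data:

```python
import numpy as np

def pairwise_sq_dists(feats):
    """Squared Euclidean distance matrix between all feature rows."""
    sq = np.sum(feats ** 2, axis=1)
    d = sq[:, None] + sq[None, :] - 2 * feats @ feats.T
    return np.maximum(d, 0.0)   # clamp tiny negative values from rounding

def batch_hard_mining(feats, pids):
    """For each anchor, pick the hardest positive (farthest same-ID sample)
    and the hardest negative (closest different-ID sample)."""
    d = pairwise_sq_dists(feats)
    hardest_pos, hardest_neg = [], []
    for a in range(len(pids)):
        same = pids == pids[a]
        pos_mask = same.copy()
        pos_mask[a] = False          # an anchor is not its own positive
        hardest_pos.append(int(np.argmax(np.where(pos_mask, d[a], -np.inf))))
        hardest_neg.append(int(np.argmin(np.where(~same, d[a], np.inf))))
    return hardest_pos, hardest_neg

# P=2 identities, K=2 images each (toy 1-D layout in 2-D feature space).
feats = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]])
pids = np.array([0, 0, 1, 1])
hp, hn = batch_hard_mining(feats, pids)
# anchor 0: hardest positive is sample 1, hardest negative is sample 2
```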
3. Feature-vector extraction: the corrected global and local images of the data-augmented hard sample pairs are input to the re-identification pre-training network model, obtaining pedestrian local and global feature vectors; this is divided into the following steps:
(3.1) obtain the deep convolutional neural network model pre-trained in step (1.1) and the global and local images corrected in step 2, and remove the top softmax layer and one fully connected layer of the pre-trained model;
(3.2) input the corrected global and local images of the data-augmented hard sample pairs separately into the deep convolutional neural network model constructed in step (3.1), obtaining the pedestrian global feature vector A and the pedestrian local feature vectors B1, B2 and B3, where B1 is the head-region feature vector, B2 the upper-body-region feature vector and B3 the lower-body feature vector.
Further, when the data-augmented hard sample pairs are input in parallel to the deep convolutional neural networks, the network models propagate simultaneously while sharing weights.
4. Deep feature fusion: the pedestrian local and global feature vectors are deeply fused to obtain the final pedestrian feature vector; this is divided into the following steps:
(4.1) input the pedestrian local and global feature vectors of step 3 into a fully connected layer for deep feature fusion, and output the fused pedestrian feature vector C;
(4.2) input the fused pedestrian feature vector C and the local feature vectors B1, B2 and B3 of step (3.2) into a square layer, which measures the similarity between hard sample pairs with the squared Euclidean distance, expressed as:
d_ap = ||f(a) − f(p)||²,  d_an = ||f(a) − f(n)||²
where a is the anchor sample, p the hardest positive sample, n the hardest negative sample, f(·) the extracted feature vector, d_ap the distance between the hard positive pair and d_an the distance between the hard negative pair.
Preferably, so that the deep convolutional neural network extracts more discriminative pedestrian features while making full use of the sample annotations, both the cross-entropy loss and the triplet loss are used during training: the deep convolutional neural network that fuses global and local features is trained with both loss functions, while the networks extracting the head, upper body and lower body use only the triplet loss function.
Further, the deep convolutional neural network fusing global and local features uses the cross-entropy loss and the TriHard loss, expressed as:
L_th = (1/N) Σ_a [ max_{p∈A} d_ap − min_{n∈B} d_an + α ]_+
L_I = −Σ_{j=1}^{C} q_j log(y_j)
where A is the set of samples with the same ID as the anchor a, B the set of remaining samples with different IDs, N the number of anchors in the batch, L_th the TriHard loss, L_I the cross-entropy loss, α in L_th a manually set threshold (margin) parameter, q_j the label probability and C the total number of classes;
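The TriHard term above can be sketched numerically as follows; the margin value and the toy features are invented for illustration, since the patent leaves α as a manually set parameter:

```python
import numpy as np

def trihard_loss(feats, pids, alpha=0.3):
    """Batch TriHard loss: for each anchor a, take
    [max_{p in A} d(a,p) - min_{n in B} d(a,n) + alpha]_+ and average over
    anchors. Squared Euclidean distance matches the square layer of step (4.2)."""
    sq = np.sum(feats ** 2, axis=1)
    d = np.maximum(sq[:, None] + sq[None, :] - 2 * feats @ feats.T, 0.0)
    losses = []
    for a in range(len(pids)):
        same = pids == pids[a]
        pos = same.copy()
        pos[a] = False
        d_ap = d[a][pos].max()       # hardest positive distance
        d_an = d[a][~same].min()     # hardest negative distance
        losses.append(max(d_ap - d_an + alpha, 0.0))
    return float(np.mean(losses))

feats = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [6.0, 0.0]])
pids = np.array([0, 0, 1, 1])
loss = trihard_loss(feats, pids, alpha=0.3)
# here every anchor's hardest positive (d=1) is much closer than its hardest
# negative (d>=16), so each hinge term clamps to 0
```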
Further, the deep convolutional neural networks extracting the pedestrian's head, upper body and lower body use the TriHard loss function and, by sharing weight parameters, enable the network extracting the pedestrian's global features to attend more to discriminative local features, the loss function L_th being:
L_th = (1/N) Σ_a [ max_{p∈A} d_ap − min_{n∈B} d_an + α ]_+
where A is the set of samples with the same ID as the anchor a, B the set of remaining samples with different IDs, and α in L_th is a manually set threshold parameter.
Finally, the losses of the deep fused features and of the local features are combined into a total loss according to the corresponding weights, and back-propagation through the whole network updates the network parameters.
5. Model training: using transfer learning together with the final pedestrian feature vector of step 4, the re-identification pre-training network model is fine-tuned, and adaptation layers are added to it, yielding the pedestrian re-identification network model. To obtain a better transfer-learning effect, the adaptation layers bring the data distributions of the source domain and the target domain closer, so that the re-identification network model performs better.
Specifically, the parameters of the multi-kernel MMD (maximum mean discrepancy) metric are learned during training of the deep convolutional neural network in order to measure the difference between the source domain and the target domain, where the multi-kernel of the MK-MMD metric is expressed as:
K ≜ { k = Σ_{u=1}^{m} β_u k_u : Σ_{u=1}^{m} β_u = 1, β_u ≥ 0 }
and the distribution distance between the source domain and the target domain is expressed as:
d_k²(p, q) = || E_p[φ(x^s)] − E_q[φ(x^t)] ||²_H
where φ(·) is the mapping of the original variables into a reproducing kernel Hilbert space, and H indicates that the distance is measured in the reproducing kernel Hilbert space (RKHS) into which the data are mapped by φ(·);
The optimization objective of the adaptation layer consists of the loss function and the adaptive loss, expressed as:
min_Θ (1/n_a) Σ_{i=1}^{n_a} J(θ(x_i), y_i) + λ Σ_{l=l1}^{l2} d_k²(D_s^l, D_t^l)
where Θ denotes all weight and bias parameters of the network (the target parameters of learning), l1 to l2 are the first and last layers of network adaptation (layers before l1 are not adapted), n_a is the number of labeled samples in the source and target domains, and J(·) is the loss function;
Specifically, after the top softmax layer is removed from the obtained pre-trained deep convolutional neural network, a pedestrian image is input and a trained classifier is used to compute scores for the top several convolutional layers of the network; the layers before the highest-scoring layer are frozen, and the highest-scoring layer and the layers after it are fine-tuned;
6. Test the model: using only the pedestrian re-identification network model, input the query pedestrian image and the target pedestrian image to obtain two discriminative pedestrian feature vectors, and extract a global pedestrian feature vector from each of them; this is divided into the following steps:
(6.1) The query and target pedestrian images are input to the human-keypoint and pose-correction network for human-pose correction;
(6.2) The pose-corrected pedestrian images are input to the pedestrian re-identification network model to obtain the global pedestrian feature vectors.
7. Recognition result: based on the global pedestrian feature vectors from step 6, compute the similarity between the query pedestrian and every image in the target pedestrian data set; the image with the highest similarity is regarded as the same pedestrian.
Further, the input of the pedestrian re-identification network model in the training stage uses triplet pedestrian images.
Embodiment 2
As described above, a pedestrian re-identification method based on transfer learning and deep feature fusion comprises the following steps:
Step S1: the pre-trained model based on ImageNet is pre-trained on pedestrian re-identification data to obtain the pedestrian re-identification pre-training network model;
Step S11: obtain the deep convolutional network model pre-trained in advance on the ImageNet data set, and train it on the pedestrian re-identification data;
Step S12: when pre-training the deep convolutional neural network model on the pedestrian re-identification data, only the annotation information of the samples is used to fine-tune the network model;
Step S121: remove the top fully connected layer from the ResNet50 network model pre-trained on the ImageNet data set, and add two fully connected layers and one softmax layer after the max-pooling layer;
Further, the parameters of the two added fully connected layers are 1 × 1 × 2048 and 1 × 1 × 751 respectively, and the input image is 224 × 224; when pre-training ResNet50, gradient descent is used for iterative optimization, the number of iterations is set to 75, the learning rate is initialized to 0.1, the weight decay is set to 0.001, and each batch inputs 64 pedestrian samples;
Step S122: fine-tune the constructed deep convolutional neural network using the label information of the pedestrian image annotations, keeping the first three layers of the network fixed during fine-tuning, because the features extracted by the first three layers of a convolutional neural network are usually textures, edges, and the like, which have a certain generality;
Step S123: obtain the prediction probability of the global pedestrian image from the convolutional neural network, expressed as:
y_i = exp(x_i) / Σ_{j=1}^{C} exp(x_j)
where y_i indicates the probability that sample x belongs to the i-th class, Σ_{j=1}^{C} exp(x_j) is the normalization term, and C is the total number of classes;
Preferably, when training and testing on the Market-1501 database, C = 751.
Step S124: according to the prediction probability, set the loss function in the convolutional neural network to L_I, expressed as:
L_I = -Σ_{j=1}^{C} q_j log(p_j)
where q_j denotes the label probability and C is 751;
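As a rough NumPy sketch (not the patent's implementation; the toy logits and C = 5 are made up for illustration, C = 751 in the embodiment), the prediction probability and cross-entropy loss above can be written as:

```python
import numpy as np

def softmax(logits):
    """Prediction probabilities y_i = exp(x_i) / sum_j exp(x_j)."""
    z = logits - logits.max()          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(probs, label, num_classes):
    """L_I = -sum_j q_j * log(p_j) with a one-hot label distribution q."""
    q = np.zeros(num_classes)
    q[label] = 1.0
    return float(-(q * np.log(probs + 1e-12)).sum())

logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5])   # toy logits for 5 classes
p = softmax(logits)
loss = cross_entropy(p, label=0, num_classes=5)
```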
Step S2: in the training stage, the input of the network model uses triplet pedestrian images. First, hard sample pairs are selected on the pedestrian data set and input to the human skeleton keypoint detection network, which detects 14 keypoints; human-pose correction and pedestrian local ROI segmentation are then performed to obtain the corrected global and local images;
Step S21: for each training batch, the pedestrians of P IDs are selected at random, and K different images are selected at random for each pedestrian, so that each batch contains P × K pedestrian images;
Specifically, in this embodiment the pedestrians of 6 IDs are chosen, 16 different images are selected at random for each ID, and each batch contains 64 pedestrian images;
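A minimal sketch of the P × K batch construction in step S21, assuming a hypothetical `images_by_id` mapping from pedestrian ID to image names (here P = 4 and K = 16 are chosen so that the batch size is 64):

```python
import random

def pk_batch(images_by_id, P, K, rng):
    """Sample P pedestrian IDs and K distinct images per ID (a P*K batch)."""
    ids = rng.sample(sorted(images_by_id), P)
    batch = []
    for pid in ids:
        batch += rng.sample(images_by_id[pid], K)
    return batch

# toy dataset: 8 IDs with 20 image names each
data = {i: [f"id{i}_img{j}" for j in range(20)] for i in range(8)}
batch = pk_batch(data, P=4, K=16, rng=random.Random(0))   # 64 images
```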
Step S22: each training image in the batch serves as an anchor sample H_n; the hardest positive sample and the hardest negative sample are selected to form a triplet with H_n, where the requirement for selecting the hard sample pair is that the anchor-to-positive distance is maximal and the anchor-to-negative distance is minimal;
Further, using the convolutional neural network model pre-trained in step S1, the pedestrian image sample with the lowest score among the images with the same pedestrian ID as the anchor H_n is selected to form the hard positive pair, and the pedestrian image sample with the highest score among the images with different pedestrian IDs is selected to form the hard negative pair;
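The hard-mining rule of step S22 can be sketched with a precomputed pairwise distance matrix; this distance-based version (rather than the classifier-score selection described above) and the toy distances are illustrative assumptions:

```python
import numpy as np

def mine_hard_triplet(dist, ids, a):
    """For anchor index a, pick the farthest same-ID sample (hard positive)
    and the closest different-ID sample (hard negative)."""
    ids = np.asarray(ids)
    same = (ids == ids[a]) & (np.arange(len(ids)) != a)
    diff = ids != ids[a]
    pos = np.where(same)[0][np.argmax(dist[a][same])]
    neg = np.where(diff)[0][np.argmin(dist[a][diff])]
    return int(pos), int(neg)

# toy 4-sample batch with IDs [0, 0, 1, 1] and symmetric distances
D = np.array([[0.0, 2.0, 0.5, 3.0],
              [2.0, 0.0, 1.0, 2.5],
              [0.5, 1.0, 0.0, 1.5],
              [3.0, 2.5, 1.5, 0.0]])
pos, neg = mine_hard_triplet(D, [0, 0, 1, 1], a=0)
```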
Step S23: the triplet pedestrian images are input to the human skeleton keypoint detection network, which detects 14 skeleton keypoints for each image, covering the head, four limbs, upper body, and lower body; the 14 keypoints are then used as coordinates to correct the human pose;
Step S24: according to the 14 skeleton keypoints, the global pedestrian image is divided into three pedestrian local ROI images (head, upper body, and lower body), thereby obtaining one corrected global pedestrian image and three local pedestrian images;
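A rough sketch of splitting a pedestrian image into head / upper-body / lower-body strips from keypoint coordinates; the boundary heuristic and the dummy keypoints are made up, since the patent does not give the exact grouping rule for the 14 keypoints:

```python
import numpy as np

def split_parts(image, keypoints):
    """Split a pedestrian image into head / upper-body / lower-body strips
    using the vertical positions of the (x, y) keypoints.  The choice of
    boundary keypoints here is an assumed heuristic."""
    ys = np.sort(np.asarray(keypoints)[:, 1])
    h = image.shape[0]
    head_end = int(min(ys[2], h))           # just below the top few keypoints
    waist = int(min(ys[len(ys) // 2], h))   # median keypoint height
    return image[:head_end], image[head_end:waist], image[waist:]

img = np.zeros((224, 112, 3))
kps = np.array([[56, 10 + 15 * i] for i in range(14)])  # 14 dummy keypoints
head, upper, lower = split_parts(img, kps)
```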
Step S3: the hard sample pairs after human-pose correction and data enhancement are input to the pre-training network to obtain the pedestrian local and global feature vectors;
Step S31: obtain the deep convolutional neural network model pre-trained in step S1 and the global and local pedestrian images of the hard sample pairs obtained in step S2, and remove the top softmax layer and one fully connected layer of the pre-training deep convolutional neural network;
In this embodiment, the parameter of the added fully connected layer is 1 × 1 × 751 and the input image is 224 × 224; when pre-training ResNet50, gradient descent is used for iterative optimization with the number of iterations set to 60; the learning rate is initialized to 0.01 for the first 20 iterations and set to 0.001 for the last 40 iterations; the weight decay is set to 0.0001, and each batch inputs 64 pedestrian samples;
Step S32: the obtained hard sample pairs, including the global and local pedestrian images, are separately input to the deep convolutional neural networks constructed in step S31, yielding the global pedestrian feature vector A and the local pedestrian feature vectors B1, B2, B3, where B1 is the head-region feature vector, B2 is the upper-body-region feature vector, and B3 is the lower-body feature vector;
Step S33: when the hard sample pairs are input to the deep convolutional neural networks in parallel, each network model propagates simultaneously while sharing weights;
Step S4: the obtained local pedestrian feature vectors are fused with the global feature vector by deep feature fusion to obtain the final pedestrian feature vector;
Step S41: obtain the global pedestrian feature vector A and the local pedestrian feature vectors B1, B2, B3 from step S32; the global and local feature vectors are then input to one fully connected layer for deep feature fusion, and the fused pedestrian feature vector C is output, as shown in Figure 2;
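Deep feature fusion through one fully connected layer can be sketched as concatenation followed by a linear map; the dimensions (2048-d inputs, 751-d output) and the random weights are illustrative assumptions, since the fused dimension is not specified:

```python
import numpy as np

def fuse(global_feat, local_feats, W, b):
    """Concatenate the global vector A with the local vectors B1..B3 and
    pass the result through one fully connected layer, giving fused vector C."""
    x = np.concatenate([global_feat] + list(local_feats))
    return W @ x + b

rng = np.random.default_rng(0)
A = rng.standard_normal(2048)                      # assumed global feature
B = [rng.standard_normal(2048) for _ in range(3)]  # assumed local features
W = rng.standard_normal((751, 4 * 2048)) * 0.01    # assumed FC weights
C = fuse(A, B, W, b=np.zeros(751))
```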
Step S42: the fused pedestrian feature vector C and the local feature vectors B1, B2, B3 are each input to one square layer; the square layer measures the similarity between the hard sample pairs using the squared Euclidean distance, expressed as:
d_{a,p} = || f(a) - f(p) ||² , d_{a,n} = || f(a) - f(n) ||²
where a is the anchor sample, p is the hardest positive sample, n is the hardest negative sample, d_{a,p} is the distance between the hard positive pair, and d_{a,n} is the distance between the hard negative pair;
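The square layer's squared-Euclidean similarity is simply (toy vectors for illustration):

```python
import numpy as np

def squared_euclidean(f1, f2):
    """Square-layer similarity: d = ||f1 - f2||_2^2."""
    diff = np.asarray(f1, dtype=float) - np.asarray(f2, dtype=float)
    return float(diff @ diff)

d_ap = squared_euclidean([1.0, 2.0], [1.0, 0.0])
```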
Step S43: in order to enable the deep convolutional neural networks to extract more discriminative pedestrian features while making full use of the annotation information of the pedestrian samples, both a cross-entropy loss and a triplet loss are used during network training; the deep convolutional neural network that fuses the global and local features uses both loss functions during training, while the deep convolutional neural networks extracting the head, upper-body, and lower-body features use only the triplet loss function;
Step S431: the deep convolutional neural network that fuses the global and local features uses the cross-entropy loss and the TriHard loss, expressed as:
L_I = -Σ_{j=1}^{C} q_j log(p_j)
L_th = (1/(P·K)) Σ_{a=1}^{P·K} [ max_{p∈A} d_{a,p} - min_{n∈B} d_{a,n} + α ]_+
where A is the set of samples sharing the same ID as the anchor sample a, and B is the set of remaining samples with different IDs; L_th is the TriHard loss, L_I is the cross-entropy loss, α in L_th is a manually set threshold parameter, q_j denotes the label probability, and C is 751;
Step S432: the deep convolutional neural networks that extract the head, upper-body, and lower-body features of the pedestrian use the TriHard loss function; by sharing weight parameters, they enable the deep convolutional neural network that extracts the global pedestrian features to pay more attention to discriminative local features, where the loss function L_th is:
L_th = (1/(P·K)) Σ_{a=1}^{P·K} [ max_{p∈A} d_{a,p} - min_{n∈B} d_{a,n} + α ]_+
where A is the set of samples sharing the same ID as the anchor sample a, B is the set of remaining samples with different IDs, and α in L_th is a manually set threshold parameter;
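A NumPy sketch of the TriHard loss over a batch, given a pairwise distance matrix and the sample IDs (the toy distances and the margin α = 0.3 are made up):

```python
import numpy as np

def trihard_loss(dist, ids, alpha):
    """TriHard: for each anchor a, hinge on (hardest-positive distance
    - hardest-negative distance + alpha), averaged over the batch."""
    ids = np.asarray(ids)
    n = len(ids)
    total = 0.0
    for a in range(n):
        same = (ids == ids[a]) & (np.arange(n) != a)
        diff = ids != ids[a]
        total += max(dist[a][same].max() - dist[a][diff].min() + alpha, 0.0)
    return total / n

D = np.array([[0.0, 2.0, 0.5, 3.0],
              [2.0, 0.0, 1.0, 2.5],
              [0.5, 1.0, 0.0, 1.5],
              [3.0, 2.5, 1.5, 0.0]])
loss = trihard_loss(D, ids=[0, 0, 1, 1], alpha=0.3)
```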
Step S433: finally, the losses of the extracted deep-fusion feature and the local features are combined according to their corresponding weight ratios into a total loss, which is back-propagated through the overall network to update the network parameters, as shown in Figure 3;
Further, the network losses combined according to the corresponding weight ratios are:
L_total = α1 · p_c + α2 · p_t + α3 · p_h + α4 · p_u + α5 · p_l
where p_c is the cross-entropy loss of the extracted deep-fusion feature; p_t, p_h, p_u, p_l are respectively the TriHard losses of the deep-fusion feature, head feature, upper-body feature, and lower-body feature; and the weight factors α1, α2, α3, α4, α5 are set to 0.2, 0.2, 0.2, 0.2, 0.2 respectively;
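The weighted combination with all α_i = 0.2 is a plain weighted sum; the per-branch loss values below are made up for illustration:

```python
# Weighted total loss L = sum_i alpha_i * loss_i with every alpha_i = 0.2.
alphas = [0.2, 0.2, 0.2, 0.2, 0.2]
branch_losses = [1.2, 0.8, 0.5, 0.6, 0.4]   # made-up values for the 5 losses
total_loss = sum(a * l for a, l in zip(alphas, branch_losses))
```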
Step S5: throughout the training of the pedestrian re-identification network model, the pre-training network is fine-tuned by means of transfer learning, by adding an adaptation layer to the network;
Step S51: after the top softmax layer is removed from the obtained pre-training convolutional neural network, a pedestrian image is input to the network and a trained classifier is used to compute scores for the top several convolutional layers of the network; the layers before the highest-scoring layer are frozen, and the highest-scoring layer and the layers after it are fine-tuned;
Step S52: to obtain a better transfer-learning effect, an adaptation layer is added to the fine-tuned network layers, so that the data distributions of the source domain and the target domain become closer and the pedestrian re-identification network achieves a better effect. The parameter learning of the multi-kernel MMD (MK-MMD) metric is added to the training of the deep convolutional neural network to measure the difference between the source domain and the target domain, where the multi-kernel of the MK-MMD metric is expressed as:
K = { k = Σ_u β_u k_u : Σ_u β_u = 1, β_u ≥ 0 }
and the distribution distance between the source and target domains is expressed as:
d_k²(p, q) = || E_p[φ(x^s)] - E_q[φ(x^t)] ||²_H
where φ(·) is a mapping of the original variables into a reproducing kernel Hilbert space, and the subscript H indicates that the distance is measured in the reproducing kernel Hilbert space (RKHS) generated by φ(·);
Step S53: the optimization objective of the adaptation layer consists of the loss function and the adaptive loss, expressed as:
min_Θ (1/n_a) Σ_{i=1}^{n_a} J(θ(x_i), y_i) + λ Σ_{l=l1}^{l2} d_k²(D_s^l, D_t^l)
where Θ denotes all weight and bias parameters of the network (the target parameters of learning), l1 to l2 are the first and last layers of network adaptation (layers before l1 are not adapted), n_a is the number of labeled samples in the source and target domains, and J(·) is the loss function;
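A minimal biased MK-MMD estimator with a sum of RBF kernels can be sketched as follows; here the kernel weights β_u are fixed and equal, whereas the patent learns them, and the bandwidths and toy features are illustrative assumptions:

```python
import numpy as np

def mk_mmd(xs, xt, bandwidths=(0.5, 1.0, 2.0)):
    """Biased MMD^2 estimate with an equal-weight sum of RBF kernels:
    MMD^2 = mean k(s,s) + mean k(t,t) - 2 * mean k(s,t)."""
    def kernel(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return sum(np.exp(-d2 / (2.0 * s ** 2)) for s in bandwidths) / len(bandwidths)
    return kernel(xs, xs).mean() + kernel(xt, xt).mean() - 2.0 * kernel(xs, xt).mean()

rng = np.random.default_rng(0)
src = rng.standard_normal((50, 8))           # "source-domain" features
tgt_near = rng.standard_normal((50, 8))      # same distribution as source
tgt_far = rng.standard_normal((50, 8)) + 3.0 # shifted "target" distribution
```

A shifted target distribution yields a larger MMD than a matching one, which is exactly the signal the adaptation layer minimizes.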
Step S6: in the test phase, only the trained network model is used to extract global pedestrian feature vectors. Based on the network model trained in the above steps, the query pedestrian image and the target pedestrian image are input, and two highly discriminative pedestrian feature vectors are obtained;
Step S61: the global pedestrian feature vectors extracted by the deep convolutional neural network after the above training are highly discriminative, so the test phase of the model extracts only the global pedestrian feature vectors;
Step S62: the query and target pedestrian images are input to the human-keypoint and pose-correction network for human-pose correction;
Step S7: based on the global pedestrian feature vectors, the similarity between the query pedestrian image and every image in the target pedestrian image data set is computed; the image with the highest similarity is regarded as the same pedestrian, which yields the pedestrian re-identification result.
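The final ranking step can be sketched with cosine similarity; the patent does not fix the test-time similarity measure, so this choice and the toy feature vectors are assumptions:

```python
import numpy as np

def rank_gallery(query, gallery):
    """Rank gallery features by cosine similarity to the query; the
    top-ranked image is taken as the same pedestrian."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q
    return np.argsort(-sims), sims

query = np.array([1.0, 0.0, 1.0])
gallery = np.array([[0.9, 0.1, 1.1],    # near-duplicate of the query
                    [0.0, 1.0, 0.0],
                    [-1.0, 0.0, -1.0]])
order, sims = rank_gallery(query, gallery)
```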
Specifically, the embodiment of the present invention uses the Market-1501 pedestrian database as the training and test sets; the proposed pedestrian re-identification method based on transfer learning and deep feature fusion reaches 85% rank-1 accuracy and 60% mAP. The re-identification method of the present invention, which applies transfer learning and fuses global and local pedestrian deep features during training, greatly improves the re-identification accuracy, demonstrating the effectiveness of the method.
In conclusion the net of the present invention training depth integration pedestrian part and global characteristics by the way of transfer learning
Network model divides negative sample pair to difficulty with difficulty point positive sample is chosen on pedestrian's data set, and be input to human body in the training stage
Bone critical point detection network detects 14 key points, is corrected and is divided to pedestrian's posture with 14 key points
It is cut into three pedestrian image subregions;It will be defeated to dividing pedestrian's training image of negative sample pair to distinguish with hardly possible comprising difficult point positive sample
Enter to pre-training network, wherein inputting every sample is extended to pedestrian's general image and three pedestrian's sub-district area images, obtains
Obtain pedestrian part and global characteristics vector;Three pedestrian's local feature vectors are input to one layer of full articulamentum and pedestrian is global
Feature vector is merged, and pedestrian's feature vector of depth characteristic fusion is obtained;Identify that network training process uses again in pedestrian
The mode of transfer learning is finely adjusted pre-training network, pre-training network top add adaptation layer come complete source domain and
Aiming field it is adaptive so that source domain and the data distribution of aiming field are more nearly, so that pedestrian identifies the effect of network more again
It is good;Global characteristics network model, input inquiry pedestrian image and target pedestrian image is used only to global characteristics in test phase
Network model is extracted, two global characteristics vectors is obtained, and then calculate the similarity of inquiry pedestrian and target pedestrian, is known
Other result.
Claims (10)
1. A pedestrian re-identification method based on transfer learning and deep feature fusion, characterized by comprising the following steps:
1. Pre-training: the pre-trained model based on ImageNet is pre-trained on pedestrian re-identification data to obtain a pedestrian re-identification pre-training network model;
2. Human-pose correction and segmentation: hard sample pairs are chosen on the pedestrian data set and input to the human skeleton keypoint detection network, which detects 14 keypoints; human-pose correction and pedestrian local ROI segmentation are performed to obtain the data-enhanced hard sample pairs and the corrected global and local images;
3. Feature vectors: the corrected global and local images of the data-enhanced hard sample pairs are input to the pedestrian re-identification pre-training network model to obtain pedestrian local and global feature vectors;
4. Deep feature fusion: the pedestrian local and global feature vectors undergo deep feature fusion to obtain the final pedestrian feature vector;
5. Training the model: using transfer learning together with the final pedestrian feature vector from step 4, the pedestrian re-identification pre-training network model is fine-tuned, and an adaptation layer is added to the pre-training network model to obtain the pedestrian re-identification network model;
6. Testing the model: the query pedestrian image and the target pedestrian image are input, and only the pedestrian re-identification network model is used to extract two discriminative global pedestrian feature vectors;
7. Recognition result: based on the global pedestrian feature vectors from step 6, the similarity between the query pedestrian and every image in the target pedestrian data set is computed; the image with the highest similarity is regarded as the same pedestrian.
2. The pedestrian re-identification method based on transfer learning and deep feature fusion according to claim 1, characterized in that: the input of the pedestrian re-identification network model in the training stage uses triplet pedestrian images.
3. The pedestrian re-identification method based on transfer learning and deep feature fusion according to claim 1, characterized in that step 1 is divided into the following steps:
(1.1) Obtain the deep convolutional network model pre-trained in advance on the ImageNet data set, and train it on the pedestrian re-identification data;
(1.2) When pre-training the deep convolutional neural network model on the pedestrian re-identification data, fine-tune the deep convolutional network model using only the annotation information of the samples.
4. The pedestrian re-identification method based on transfer learning and deep feature fusion according to claim 3, characterized in that step (1.2) is divided into the following steps:
(1.2.1) Remove the top fully connected layer from the ResNet50 network model pre-trained on the ImageNet data set, and add two fully connected layers and one softmax layer after the max-pooling layer;
(1.2.2) Fine-tune the constructed deep convolutional neural network using the label information of the pedestrian image annotations, keeping the first three layers of the deep convolutional neural network fixed during fine-tuning;
(1.2.3) Obtain the prediction probability of the global pedestrian image from the deep convolutional neural network;
(1.2.4) Define the loss function in the deep convolutional neural network according to the prediction probability.
5. The pedestrian re-identification method based on transfer learning and deep feature fusion according to claim 1, characterized in that: the corrected global and local images obtained in step 2 are the pose-corrected global pedestrian images and local ROI images of the hard positive and negative sample triplet pedestrian images.
6. The pedestrian re-identification method based on transfer learning and deep feature fusion according to claim 1, characterized in that step 2 is divided into the following steps:
(2.1) For each training batch, the pedestrians of P IDs are selected at random, and K different images are selected at random for each pedestrian, so that each batch contains P × K pedestrian images;
(2.2) Each training image in the batch serves as an anchor sample H_n; the hardest positive sample and the hardest negative sample are selected to form a triplet with H_n, where the requirement for selecting the hard sample pair is that the anchor-to-positive distance is maximal and the anchor-to-negative distance is minimal;
(2.3) The triplet pedestrian images are input to the human skeleton keypoint detection network, which detects 14 skeleton keypoints for each image, covering the head, four limbs, upper body, and lower body; the 14 keypoints are used as coordinates to correct the human pose;
(2.4) According to the 14 skeleton keypoints, the global pedestrian image is divided into three pedestrian local ROI images (head, upper body, and lower body), obtaining one corrected global pedestrian image and three local pedestrian images.
7. The pedestrian re-identification method based on transfer learning and deep feature fusion according to claim 6, characterized in that: in step (2.2), using the pre-training deep convolutional neural network model of step (1.1), the pedestrian image sample with the lowest score among the images with the same pedestrian ID as the anchor H_n is selected to form the hard positive pair, and the pedestrian image sample with the highest score among the images with different pedestrian IDs is selected to form the hard negative pair.
8. The pedestrian re-identification method based on transfer learning and deep feature fusion according to claim 1, characterized in that step 3 is divided into the following steps:
(3.1) Obtain the deep convolutional neural network model pre-trained in step (1.1) and the corrected global and local images from step 2, and remove the top softmax layer and one fully connected layer of the pre-training deep convolutional neural network model;
(3.2) The corrected global and local images of the data-enhanced hard sample pairs are separately input to the deep convolutional neural network model constructed in step (3.1) to obtain the global and local pedestrian feature vectors.
9. The pedestrian re-identification method based on transfer learning and deep feature fusion according to claim 1, characterized in that step 4 is divided into the following steps:
(4.1) The pedestrian local and global feature vectors from step 3 are input to one fully connected layer for deep feature fusion, and the fused pedestrian feature vector is obtained and output;
(4.2) The fused pedestrian feature vector and the local pedestrian feature vectors from step (3.2) are each input to one square layer; the square layer measures the similarity between the hard sample pairs using the squared Euclidean distance.
10. The pedestrian re-identification method based on transfer learning and deep feature fusion according to claim 1, characterized in that step 6 is divided into the following steps:
(6.1) The query and target pedestrian images are input to the human-keypoint and pose-correction network for human-pose correction;
(6.2) The pose-corrected pedestrian images are input to the pedestrian re-identification network model to obtain the global pedestrian feature vectors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910329733.2A CN110163110B (en) | 2019-04-23 | 2019-04-23 | Pedestrian re-recognition method based on transfer learning and depth feature fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110163110A true CN110163110A (en) | 2019-08-23 |
CN110163110B CN110163110B (en) | 2023-06-06 |
Family
ID=67639792
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910329733.2A Active CN110163110B (en) | 2019-04-23 | 2019-04-23 | Pedestrian re-recognition method based on transfer learning and depth feature fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110163110B (en) |
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110533184A (en) * | 2019-08-31 | 2019-12-03 | 南京人工智能高等研究院有限公司 | A kind of training method and device of network model |
CN110555420A (en) * | 2019-09-09 | 2019-12-10 | 电子科技大学 | fusion model network and method based on pedestrian regional feature extraction and re-identification |
CN110569779A (en) * | 2019-08-28 | 2019-12-13 | 西北工业大学 | Pedestrian attribute identification method based on pedestrian local and overall attribute joint learning |
CN110674881A (en) * | 2019-09-27 | 2020-01-10 | 长城计算机软件与系统有限公司 | Trademark image retrieval model training method, system, storage medium and computer equipment |
CN110688976A (en) * | 2019-10-09 | 2020-01-14 | 创新奇智(北京)科技有限公司 | Store comparison method based on image identification |
CN110705499A (en) * | 2019-10-12 | 2020-01-17 | 成都考拉悠然科技有限公司 | Crowd counting method based on transfer learning |
CN110795580A (en) * | 2019-10-23 | 2020-02-14 | 武汉理工大学 | Vehicle weight recognition method based on space-time constraint model optimization |
CN111126599A (en) * | 2019-12-20 | 2020-05-08 | 复旦大学 | Neural network weight initialization method based on transfer learning |
CN111126275A (en) * | 2019-12-24 | 2020-05-08 | 广东省智能制造研究所 | Pedestrian re-identification method and device based on multi-granularity feature fusion |
CN111126198A (en) * | 2019-12-11 | 2020-05-08 | 中山大学 | Pedestrian re-identification method based on deep representation learning and dynamic matching |
CN111160295A (en) * | 2019-12-31 | 2020-05-15 | 广州视声智能科技有限公司 | Video pedestrian re-identification method based on region guidance and space-time attention |
CN111274922A (en) * | 2020-01-17 | 2020-06-12 | 山东师范大学 | Pedestrian re-identification method and system based on multi-level deep learning network |
CN111401265A (en) * | 2020-03-19 | 2020-07-10 | 重庆紫光华山智安科技有限公司 | Pedestrian re-identification method and device, electronic equipment and computer-readable storage medium |
CN111428675A (en) * | 2020-04-02 | 2020-07-17 | 南开大学 | Pedestrian re-recognition method integrated with pedestrian posture features |
CN111428650A (en) * | 2020-03-26 | 2020-07-17 | 北京工业大学 | Pedestrian re-identification method based on SP-PGGAN style migration |
CN111539257A (en) * | 2020-03-31 | 2020-08-14 | 苏州科达科技股份有限公司 | Personnel re-identification method, device and storage medium |
CN111582154A (en) * | 2020-05-07 | 2020-08-25 | 浙江工商大学 | Pedestrian re-identification method based on multitask skeleton posture division component |
CN111696056A (en) * | 2020-05-25 | 2020-09-22 | 五邑大学 | Digital archive image correction method based on multi-task transfer learning |
CN111881842A (en) * | 2020-07-30 | 2020-11-03 | 深圳力维智联技术有限公司 | Pedestrian re-identification method and device, electronic equipment and storage medium |
CN111967389A (en) * | 2020-08-18 | 2020-11-20 | 厦门理工学院 | Face attribute recognition method and system based on deep double-path learning network |
CN112132200A (en) * | 2020-09-17 | 2020-12-25 | 山东大学 | Lithology identification method and system based on multi-dimensional rock image deep learning |
CN112183438A (en) * | 2020-10-13 | 2021-01-05 | 深圳龙岗智能视听研究院 | Image identification method for illegal behaviors based on small sample learning neural network |
CN112784630A (en) * | 2019-11-06 | 2021-05-11 | 广东毓秀科技有限公司 | Method for re-identifying pedestrians based on local features of physical segmentation |
CN112990424A (en) * | 2019-12-17 | 2021-06-18 | 杭州海康威视数字技术股份有限公司 | Method and device for training neural network model |
CN112989911A (en) * | 2020-12-10 | 2021-06-18 | 奥比中光科技集团股份有限公司 | Pedestrian re-identification method and system |
CN112990144A (en) * | 2021-04-30 | 2021-06-18 | 德鲁动力科技(成都)有限公司 | Data enhancement method and system for pedestrian re-identification |
CN113221770A (en) * | 2021-05-18 | 2021-08-06 | 青岛根尖智能科技有限公司 | Cross-domain pedestrian re-identification method and system based on multi-feature hybrid learning |
CN113379627A (en) * | 2021-06-07 | 2021-09-10 | 北京百度网讯科技有限公司 | Training method of image enhancement model and method for enhancing image |
CN113378729A (en) * | 2021-06-16 | 2021-09-10 | 西安理工大学 | Pose embedding-based multi-scale convolution feature fusion pedestrian re-identification method |
CN113377991A (en) * | 2021-06-10 | 2021-09-10 | 电子科技大学 | Image retrieval method based on most difficult positive and negative samples |
CN113591864A (en) * | 2021-07-28 | 2021-11-02 | 北京百度网讯科技有限公司 | Training method, device and system for text recognition model framework |
CN114863138A (en) * | 2022-07-08 | 2022-08-05 | 腾讯科技(深圳)有限公司 | Image processing method, image processing apparatus, storage medium, and device |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107766791A (en) * | 2017-09-06 | 2018-03-06 | 北京大学 | Pedestrian re-identification method and device based on global features and coarse-grained local features |
CN107832672A (en) * | 2017-10-12 | 2018-03-23 | 北京航空航天大学 | Pedestrian re-identification method using pose information to design multiple loss functions |
US20180114055A1 (en) * | 2016-10-25 | 2018-04-26 | VMAXX. Inc. | Point to Set Similarity Comparison and Deep Feature Learning for Visual Recognition |
CN108334849A (en) * | 2018-01-31 | 2018-07-27 | 中山大学 | Pedestrian re-identification method based on Riemannian manifolds |
CN109002761A (en) * | 2018-06-13 | 2018-12-14 | 中山大学新华学院 | Pedestrian re-identification monitoring system based on deep convolutional neural networks |
CN109271870A (en) * | 2018-08-21 | 2019-01-25 | 平安科技(深圳)有限公司 | Pedestrian re-identification method, device, computer equipment and storage medium |
CN109446898A (en) * | 2018-09-20 | 2019-03-08 | 暨南大学 | Pedestrian re-identification method based on transfer learning and feature fusion |
- 2019-04-23: Application CN201910329733.2A filed in China; granted as CN110163110B (status: Active)
Non-Patent Citations (2)
Title |
---|
Longhui Wei: "GLAD: Global-Local-Alignment Descriptor for Scalable Person Re-Identification", IEEE Transactions on Multimedia *
徐梦洋 (Xu Mengyang): "A Review of Research on Person Re-identification Based on Deep Learning", 《计算机科学》 (Computer Science) *
Cited By (51)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110569779A (en) * | 2019-08-28 | 2019-12-13 | 西北工业大学 | Pedestrian attribute recognition method based on joint learning of local and global pedestrian attributes |
CN110569779B (en) * | 2019-08-28 | 2022-10-04 | 西北工业大学 | Pedestrian attribute recognition method based on joint learning of local and global pedestrian attributes |
CN110533184A (en) * | 2019-08-31 | 2019-12-03 | 南京人工智能高等研究院有限公司 | A kind of training method and device of network model |
CN110555420A (en) * | 2019-09-09 | 2019-12-10 | 电子科技大学 | fusion model network and method based on pedestrian regional feature extraction and re-identification |
CN110555420B (en) * | 2019-09-09 | 2022-04-12 | 电子科技大学 | Fusion model network and method based on pedestrian regional feature extraction and re-identification |
CN110674881A (en) * | 2019-09-27 | 2020-01-10 | 长城计算机软件与系统有限公司 | Trademark image retrieval model training method, system, storage medium and computer equipment |
CN110674881B (en) * | 2019-09-27 | 2022-02-11 | 长城计算机软件与系统有限公司 | Trademark image retrieval model training method, system, storage medium and computer equipment |
CN110688976A (en) * | 2019-10-09 | 2020-01-14 | 创新奇智(北京)科技有限公司 | Store comparison method based on image identification |
CN110705499A (en) * | 2019-10-12 | 2020-01-17 | 成都考拉悠然科技有限公司 | Crowd counting method based on transfer learning |
CN110705499B (en) * | 2019-10-12 | 2020-06-02 | 成都考拉悠然科技有限公司 | Crowd counting method based on transfer learning |
CN110795580A (en) * | 2019-10-23 | 2020-02-14 | 武汉理工大学 | Vehicle re-identification method based on spatio-temporal constraint model optimization |
CN110795580B (en) * | 2019-10-23 | 2023-12-08 | 武汉理工大学 | Vehicle re-identification method based on spatio-temporal constraint model optimization |
CN112784630A (en) * | 2019-11-06 | 2021-05-11 | 广东毓秀科技有限公司 | Pedestrian re-identification method based on local features from body segmentation |
CN111126198A (en) * | 2019-12-11 | 2020-05-08 | 中山大学 | Pedestrian re-identification method based on deep representation learning and dynamic matching |
CN111126198B (en) * | 2019-12-11 | 2023-05-09 | 中山大学 | Pedestrian re-identification method based on deep representation learning and dynamic matching |
CN112990424A (en) * | 2019-12-17 | 2021-06-18 | 杭州海康威视数字技术股份有限公司 | Method and device for training neural network model |
CN111126599B (en) * | 2019-12-20 | 2023-09-05 | 复旦大学 | Neural network weight initialization method based on transfer learning |
CN111126599A (en) * | 2019-12-20 | 2020-05-08 | 复旦大学 | Neural network weight initialization method based on transfer learning |
CN111126275B (en) * | 2019-12-24 | 2023-05-05 | 广东省智能制造研究所 | Pedestrian re-identification method and device based on multi-granularity feature fusion |
CN111126275A (en) * | 2019-12-24 | 2020-05-08 | 广东省智能制造研究所 | Pedestrian re-identification method and device based on multi-granularity feature fusion |
CN111160295A (en) * | 2019-12-31 | 2020-05-15 | 广州视声智能科技有限公司 | Video pedestrian re-identification method based on region guidance and spatio-temporal attention |
CN111160295B (en) * | 2019-12-31 | 2023-05-12 | 广州视声智能科技有限公司 | Video pedestrian re-identification method based on region guidance and spatio-temporal attention |
CN111274922A (en) * | 2020-01-17 | 2020-06-12 | 山东师范大学 | Pedestrian re-identification method and system based on multi-level deep learning network |
CN111274922B (en) * | 2020-01-17 | 2022-11-29 | 山东师范大学 | Pedestrian re-identification method and system based on multi-level deep learning network |
CN111401265A (en) * | 2020-03-19 | 2020-07-10 | 重庆紫光华山智安科技有限公司 | Pedestrian re-identification method and device, electronic equipment and computer-readable storage medium |
CN111428650A (en) * | 2020-03-26 | 2020-07-17 | 北京工业大学 | Pedestrian re-identification method based on SP-PGGAN style transfer |
CN111428650B (en) * | 2020-03-26 | 2024-04-02 | 北京工业大学 | Pedestrian re-identification method based on SP-PGGAN style transfer |
CN111539257A (en) * | 2020-03-31 | 2020-08-14 | 苏州科达科技股份有限公司 | Person re-identification method, device and storage medium |
CN111539257B (en) * | 2020-03-31 | 2022-07-26 | 苏州科达科技股份有限公司 | Person re-identification method, device and storage medium |
CN111428675A (en) * | 2020-04-02 | 2020-07-17 | 南开大学 | Pedestrian re-identification method incorporating pedestrian pose features |
CN111582154A (en) * | 2020-05-07 | 2020-08-25 | 浙江工商大学 | Pedestrian re-identification method based on multi-task skeleton-pose-partitioned body parts |
CN111696056B (en) * | 2020-05-25 | 2023-05-02 | 五邑大学 | Digital archive image correction method based on multi-task transfer learning |
CN111696056A (en) * | 2020-05-25 | 2020-09-22 | 五邑大学 | Digital archive image correction method based on multi-task transfer learning |
CN111881842A (en) * | 2020-07-30 | 2020-11-03 | 深圳力维智联技术有限公司 | Pedestrian re-identification method and device, electronic equipment and storage medium |
CN111967389B (en) * | 2020-08-18 | 2022-02-18 | 厦门理工学院 | Face attribute recognition method and system based on deep double-path learning network |
CN111967389A (en) * | 2020-08-18 | 2020-11-20 | 厦门理工学院 | Face attribute recognition method and system based on deep double-path learning network |
CN112132200A (en) * | 2020-09-17 | 2020-12-25 | 山东大学 | Lithology identification method and system based on multi-dimensional rock image deep learning |
CN112183438A (en) * | 2020-10-13 | 2021-01-05 | 深圳龙岗智能视听研究院 | Image identification method for illegal behaviors based on small sample learning neural network |
CN112183438B (en) * | 2020-10-13 | 2022-11-04 | 深圳龙岗智能视听研究院 | Image identification method for illegal behaviors based on small sample learning neural network |
CN112989911A (en) * | 2020-12-10 | 2021-06-18 | 奥比中光科技集团股份有限公司 | Pedestrian re-identification method and system |
CN112990144A (en) * | 2021-04-30 | 2021-06-18 | 德鲁动力科技(成都)有限公司 | Data enhancement method and system for pedestrian re-identification |
CN113221770A (en) * | 2021-05-18 | 2021-08-06 | 青岛根尖智能科技有限公司 | Cross-domain pedestrian re-identification method and system based on multi-feature hybrid learning |
CN113379627B (en) * | 2021-06-07 | 2023-06-27 | 北京百度网讯科技有限公司 | Training method of image enhancement model and method for enhancing image |
CN113379627A (en) * | 2021-06-07 | 2021-09-10 | 北京百度网讯科技有限公司 | Training method of image enhancement model and method for enhancing image |
CN113377991A (en) * | 2021-06-10 | 2021-09-10 | 电子科技大学 | Image retrieval method based on most difficult positive and negative samples |
CN113377991B (en) * | 2021-06-10 | 2022-04-15 | 电子科技大学 | Image retrieval method based on most difficult positive and negative samples |
CN113378729A (en) * | 2021-06-16 | 2021-09-10 | 西安理工大学 | Pose embedding-based multi-scale convolution feature fusion pedestrian re-identification method |
WO2023005253A1 (en) * | 2021-07-28 | 2023-02-02 | 北京百度网讯科技有限公司 | Method, apparatus and system for training text recognition model framework |
CN113591864A (en) * | 2021-07-28 | 2021-11-02 | 北京百度网讯科技有限公司 | Training method, device and system for text recognition model framework |
CN114863138B (en) * | 2022-07-08 | 2022-09-06 | 腾讯科技(深圳)有限公司 | Image processing method, device, storage medium and equipment |
CN114863138A (en) * | 2022-07-08 | 2022-08-05 | 腾讯科技(深圳)有限公司 | Image processing method, image processing apparatus, storage medium, and device |
Also Published As
Publication number | Publication date |
---|---|
CN110163110B (en) | 2023-06-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110163110A (en) | Pedestrian re-identification method based on transfer learning and deep feature fusion | |
CN107145842B (en) | Face recognition method combining LBP feature maps and convolutional neural networks | |
CN106326886B (en) | Finger vein image quality assessment method based on convolutional neural networks | |
CN108256421A (en) | Real-time dynamic gesture sequence recognition method, system and device | |
CN107463920A (en) | Face recognition method that eliminates the influence of partial occlusions | |
CN108229444A (en) | Pedestrian re-identification method based on fusion of global and local deep features | |
CN104850865B (en) | Real-time compressive tracking method with multi-feature transfer learning | |
CN109101865A (en) | Pedestrian re-identification method based on deep learning | |
CN109447115A (en) | Fine-grained zero-shot classification method based on a multi-layer semantically supervised attention model | |
CN106529499A (en) | Gait recognition method based on fused Fourier descriptor and gait energy image features | |
CN104021559B (en) | Image registration method based on mutual information and Harris corner detection | |
CN109598268A (en) | RGB-D salient object detection method based on a single-stream deep network | |
CN105869178A (en) | Method for unsupervised segmentation of complex targets in dynamic scenes based on convex optimization of multi-scale combined features | |
CN103440510A (en) | Method for locating feature points in facial images | |
CN105536205A (en) | Upper limb training system based on monocular-video human body motion sensing | |
CN102054170B (en) | Visual tracking method based on minimized upper-bound error | |
CN108596211A (en) | Occluded pedestrian re-identification method based on focus learning and deep network learning | |
CN110033007A (en) | Pedestrian clothing attribute recognition method based on deep pose estimation and multi-feature fusion | |
CN104392223A (en) | Method for recognizing human postures in two-dimensional video images | |
CN108537143B (en) | Face recognition method and system based on key-region feature comparison | |
CN104881852B (en) | Image segmentation method based on immune clonal and fuzzy kernel clustering | |
CN106599785A (en) | Method and device for building a human body 3D feature identity information database | |
CN104751111A (en) | Method and system for recognizing human actions in video | |
CN109902585A (en) | Tri-modal finger fusion recognition method based on graph models | |
CN105069745A (en) | Face-swapping system and method based on a common image sensor and augmented reality technology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||