CN113887448A - Pedestrian re-identification method based on deep reloading - Google Patents

Pedestrian re-identification method based on deep reloading

Info

Publication number
CN113887448A
CN113887448A
Authority
CN
China
Prior art keywords
pedestrian
deep
reloading
identity
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111174153.4A
Other languages
Chinese (zh)
Inventor
闫禹铭
于慧敏
李殊昭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Zhejiang Lab
Original Assignee
Zhejiang University ZJU
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU, Zhejiang Lab filed Critical Zhejiang University ZJU
Priority to CN202111174153.4A
Publication of CN113887448A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a pedestrian re-identification method based on deep reloading (deep-learning-based clothes changing), which comprises a training stage and a testing stage. The overall framework is divided into two branches: an original-picture feature extraction branch network E and a deep-reloading feature extraction branch network M. First, the pictures in the training data set are reloaded with new clothing using an off-the-shelf deep reloading model, and the results are stored in the training data set. In the training stage, both branches participate in training. Taking the deep-reloading feature extraction branch M as an example, a deeply reloaded pedestrian picture is input into the backbone network to extract features, which are then separated into identity features and clothing features by an attention mechanism. The identity features extracted by the two branches are pulled as close together as possible so as to obtain more robust identity features. In the testing stage, only network E is used to extract identity features from input pictures for identity inference. The method can complete the pedestrian re-identification task while effectively reducing the negative influence of appearance changes, such as pedestrians changing clothes, on pedestrian re-identification.

Description

Pedestrian re-identification method based on deep reloading
Technical Field
The invention belongs to the field of computer vision and pattern recognition, and particularly relates to a pedestrian re-identification method based on deep reloading (deep-learning-based clothes changing).
Background
In recent years, with the wide deployment of surveillance equipment, pedestrian identification technologies have attracted increasing attention. Pedestrian identification aims to determine the identity of a photographed pedestrian by finding pedestrians with the same identity in a pedestrian database. It has broad application scenarios in Internet-of-Things and big-data environments, including smart cities and intelligent security. Pedestrian re-identification, which is closely related to identity recognition, has also received wide attention recently and has achieved remarkable performance improvements on public data sets. However, the high cost of labeling pedestrian identities in real scenes, together with the large differences in illumination, background, posture and so on among pedestrian pictures obtained in different domains (scenes), poses great challenges to applying pedestrian re-identification in practice. Current mainstream deep learning methods generally rely on the appearance information of pedestrians for inference, and are therefore difficult to apply in real scenes where pedestrians frequently change clothes.
Most current algorithms use an attention mechanism to focus the model on highly discriminative regions so as to improve performance. However, in real scenes pedestrians frequently change clothes, so the same pedestrian wearing different clothes shows different appearance characteristics; if the model only attends to such local areas, its generalization performance is poor.
Disclosure of Invention
The invention aims to provide a pedestrian re-identification method based on deep reloading, addressing the defect that prior-art pedestrian re-identification algorithms perform poorly in clothes-changing scenes. The method can complete the pedestrian re-identification task while effectively reducing the negative influence of appearance changes, such as pedestrians changing clothes, on pedestrian re-identification.
The purpose of the invention is realized by the following technical scheme: a pedestrian re-identification method based on deep reloading comprises the following steps:
1) Re-dress the pedestrians in the training-set pictures using the deep reloading model and preselected clothing templates, and store the resulting pictures as a supplement to the training set.
2) In the training stage, use the original-picture feature extraction branch network E and the deep-reloading feature extraction branch network M to respectively extract the identity features and clothing features of the original pictures and the deeply reloaded pictures, and train networks E and M so that the extracted features achieve a good classification effect.
3) In the training stage, train networks E and M so that the identity features extracted by E and M are pulled closer together.
4) In the testing stage, use only the original-picture feature extraction branch network E to extract identity information, and use it for similarity measurement and identity inference; the entry with the highest similarity is the final matching result.
Further, in step 2), the original-picture feature extraction branch network E and the deep-reloading feature extraction branch network M respectively extract the identity features and clothing features of the original picture and the deeply reloaded picture as follows: a picture is first input into a backbone network to extract a feature f_s, which is then separated into an identity feature and a clothing feature by an attention mechanism:

f_s = Backbone(I)
f_clo = Atten(f_s) * f_s
f_ID = (1 - Atten(f_s)) * f_s

where f_clo is the clothing information feature, f_ID is the identity information feature, f_s is the feature extracted by the backbone network from the pedestrian picture, I is the picture input, and Atten(f_s) is the attention map obtained by applying the attention mechanism to f_s.
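The separation above is a simple element-wise soft gating of the backbone features. The following sketch illustrates it in NumPy, as an assumption for illustration only: in the patent the backbone and attention module are real networks, whereas here the feature map and attention map are random placeholders.

```python
import numpy as np

def separate_features(f_s, atten):
    """Soft split of backbone features into clothing and identity parts:
    f_clo = Atten(f_s) * f_s ; f_ID = (1 - Atten(f_s)) * f_s."""
    f_clo = atten * f_s           # regions the attention highlights -> clothing
    f_id = (1.0 - atten) * f_s    # complementary regions -> identity
    return f_clo, f_id

# toy example: a 4-channel 2x2 feature map and a sigmoid-squashed attention map
rng = np.random.default_rng(0)
f_s = rng.normal(size=(4, 2, 2))
atten = 1.0 / (1.0 + np.exp(-rng.normal(size=(4, 2, 2))))  # values in (0, 1)

f_clo, f_id = separate_features(f_s, atten)
assert np.allclose(f_clo + f_id, f_s)  # the split is a soft partition of f_s
```

By construction f_clo + f_ID = f_s, so the attention map decides how much of each location's response is attributed to clothing versus identity rather than producing two independent embeddings.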
Further, in step 2), the clothing features and identity features separated by the attention mechanism are supervised with a classification loss:

L_E = CE(f_ID^E) + CE(f_clo^E)
L_M = CE(f_ID^M) + CE(f_clo^M)

where CE denotes a classification loss, and the superscripts E and M indicate features extracted by the corresponding branch network.
Further, in step 3), networks E and M respectively extract the identity features and clothing features of the original picture and the deeply reloaded picture. E and M are trained to minimize the distance between the identity features of the two branches:

L_MSE = MSE(f_ID^E(I), f_ID^M(I_c))

where I_c denotes the deeply reloaded picture and I denotes the original picture.
Further, the step 4) is specifically as follows: in the testing stage, the deep reloading feature extraction branch network M is not used, and a pedestrian picture is input into the original picture feature extraction branch network E to extract identity features for pedestrian identity inference.
Further, in step 1), an off-the-shelf model such as PF-AFN is adopted as the deep reloading model.
Further, in step 2), the backbone networks of networks E and M adopt the ResNet-50 network structure.
Further, in step 2), the attention mechanism consists of channel attention and spatial attention.
Further, in step 2), the classification loss adopts a cross-entropy-based classification loss together with a triplet loss.
Further, in step 3), an MSE metric function is used to measure the distance between the identity features extracted by networks E and M.
The invention has the following beneficial effects: the method separates identity features from clothing features via attention and extracts identity features with higher identity discriminability, which are then used for inference, improving the model's adaptability to pedestrians changing clothes. Meanwhile, deep reloading produces pictures of the same pedestrian, in the same pose, wearing different clothing, which helps the model learn identity features that are independent of clothing. In real scenes, pedestrians change clothes frequently; conventional deep learning methods rely on appearance features for inference, so pictures of the same pedestrian wearing different clothes may be misjudged because the appearance difference is too large. The proposed method is expected to reduce, to a certain extent, the negative influence of clothes changing on pedestrian re-identification in real scenes and to improve recognition accuracy.
Drawings
FIG. 1 is a schematic diagram of the overall structure of a pedestrian re-identification network of the present invention;
FIG. 2 is a flow chart of the training phase of the present invention;
FIG. 3 is a flow chart of the testing phase of the present invention;
FIG. 4 is a schematic diagram of an example of the present invention using an attention mechanism;
FIG. 5 is a diagram illustrating matching results sorted by similarity according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific examples.
As shown in FIG. 1, the pedestrian re-identification method based on deep reloading comprises a training stage and a testing stage. The overall framework is divided into two branches: an original-picture feature extraction branch network E and a deep-reloading feature extraction branch network M. Before training, the pictures in the training data set are reloaded with new clothing using an off-the-shelf deep reloading model, as a form of offline data augmentation, and the results are stored as a supplement to the training data set. In the training stage, both branches participate in training: networks E and M extract the identity features and clothing features of the original pictures and the deeply reloaded pictures respectively, and are trained so that the extracted features achieve a good classification effect while the identity features extracted by E and M are pulled closer together. Taking the deep-reloading branch M as an example, a deeply reloaded pedestrian picture is input into the backbone network to extract features, which are then separated into identity features and clothing features by an attention mechanism. The identity features extracted by the two branches are pulled as close together as possible so as to obtain more robust identity features. After training, in the testing stage, branch network M is not used; only the original-picture feature extraction branch network E is used to extract identity features from the input picture for identity inference. The method specifically comprises the following steps:
1) Re-dress the pedestrians in the training-set pictures using the deep reloading model and preselected clothing templates, and store the results as a supplement to the training set. The deep reloading model can be any currently public model, and the clothing template can be any garment; there is no requirement on its style.
2) As shown in FIG. 2, in the training stage, the original-picture feature extraction branch network E and the deep-reloading feature extraction branch network M extract the identity features and clothing features of the original pictures and the deeply reloaded pictures respectively, and networks E and M are trained so that the extracted features achieve a good classification effect. A pedestrian picture is input into the backbone networks of E and M to extract features, which are then separated into identity features and clothing features by an attention mechanism:

f_s = Backbone(I)
f_clo = Atten(f_s) * f_s
f_ID = (1 - Atten(f_s)) * f_s

where I is the picture input; f_s is the feature extracted by the backbone network Backbone from pedestrian picture I; Atten(f_s) is the attention map obtained by applying the attention mechanism to f_s; f_clo is the clothing information feature; and f_ID is the identity information feature. The backbone networks of E and M can be any available backbone structure, such as ResNet or VGGNet, and the attention mechanism can be any current attention module.
The clothing features and identity features separated by the attention mechanism are supervised with a classification loss:

L_E = CE(f_ID^E) + CE(f_clo^E)
L_M = CE(f_ID^M) + CE(f_clo^M)

where E denotes the original-picture feature extraction branch network and M denotes the deep-reloading feature extraction branch network; CE denotes a classification loss, which may be any loss used for classification.
3) As shown in FIG. 2, in the training stage, networks E and M are trained so that the distance between the identity features extracted by E and M becomes smaller:

L_MSE = MSE(f_ID^E(I), f_ID^M(I_c))

where I_c denotes the deeply reloaded picture and I denotes the original picture.
4) As shown in FIG. 3, in the actual testing stage, only the original-picture feature extraction branch network E is used to extract identity information, which is then used for similarity measurement and identity inference. Specifically, an image is input into network E to extract an identity feature, and this feature is used for similarity measurement and identity inference.
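The patent does not fix a particular similarity measure for this step; assuming cosine similarity, which is common in re-identification, the matching can be sketched as follows (the function name `match_identity` and the toy gallery data are hypothetical, introduced only for illustration):

```python
import numpy as np

def match_identity(query_feat, gallery_feats):
    """Rank gallery identity features by cosine similarity to the query;
    the top-ranked gallery entry is the match."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                # cosine similarity per gallery entry
    order = np.argsort(-sims)   # indices sorted by descending similarity
    return order, sims

# toy gallery of three identity features; the query is closest to entry 1
gallery = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]])
query = np.array([0.5, 0.7])
order, sims = match_identity(query, gallery)
assert order[0] == 1
```

The entry ranked first (highest similarity) is then taken as the final matching result, as the method describes.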
The implementation process of one embodiment of the invention is as follows:
1) Re-dress the pedestrians in the training-set pictures using the deep reloading model and preselected clothing templates, and store the results as a supplement to the training set; the third-party deep reloading model PF-AFN (CVPR 2021) is used to re-dress the pedestrians in the training-set pictures.
2) In the training stage, the original picture feature extraction branch network E and the deep reloading feature extraction branch network M are used for respectively extracting the identity features and clothing features of the original picture and the deep reloading picture, and the networks E and M are trained to enable the extracted features to have better classification effects.
The backbone networks of E and M adopt the ResNet-50 network structure. As shown in FIG. 4, the attention mechanism consists of channel attention and spatial attention, and the final attention map Atten is obtained by multiplying the channel attention map A_cha and the spatial attention map A_spa:

A_cha = sigmoid(Relu(Conv(Relu(Conv(GAP(f_s))))))
A_spa = softmax(Relu(Conv(CGAP(f_s))))
Atten = A_spa * A_cha

where GAP, CGAP, Conv, Relu and sigmoid denote global average pooling, global average pooling along the channel direction, a convolution layer, a Relu activation layer and a sigmoid activation layer, respectively.
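A toy NumPy rendering of these formulas is given below; it is an illustrative assumption, not the patent's module: the 1x1 convolutions are modelled as dense matrices over channels, the spatial convolution as a scalar 1x1 conv, and all weights are random, so it only mirrors the shape of the computation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_map(f_s, w1, w2, w_spa):
    """Toy channel + spatial attention following
    A_cha = sigmoid(Relu(Conv(Relu(Conv(GAP(f_s)))))),
    A_spa = softmax(Relu(Conv(CGAP(f_s)))), Atten = A_spa * A_cha."""
    C, H, W = f_s.shape
    g = f_s.mean(axis=(1, 2))                 # GAP over spatial dims -> (C,)
    a_cha = sigmoid(relu(w2 @ relu(w1 @ g)))  # channel attention -> (C,)
    c = f_s.mean(axis=0)                      # CGAP over channels -> (H, W)
    a_spa = softmax(relu(w_spa * c).ravel()).reshape(H, W)  # spatial map
    return a_cha[:, None, None] * a_spa[None, :, :]         # (C, H, W)

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                 # r: channel reduction ratio
f_s = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C // r, C))       # first 1x1 conv (C -> C/r)
w2 = rng.normal(size=(C, C // r))       # second 1x1 conv (C/r -> C)
atten = attention_map(f_s, w1, w2, 1.0)
assert atten.shape == (C, H, W)
assert atten.min() >= 0.0 and atten.max() <= 1.0
```

Note that the spatial softmax normalizes over all H*W positions, so Atten acts as a joint per-channel, per-location weighting bounded in [0, 1].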
The classification loss adopts the cross-entropy-based classification loss L_CE and the triplet loss L_TL. The clothing class labels of the original pictures use 11 pre-annotated classes, labeled according to the color and style of the clothes; the clothing class labels of the deeply reloaded pictures are assigned according to the clothing templates used:

L_CE = -(1/N) * Σ_i y_i * log(ŷ_i)
L_TL = max(||f_a - f_p||₂ - ||f_a - f_n||₂ + α, 0)
CE = L_CE + L_TL

where CE denotes the classification loss; y_i denotes the ground-truth label of sample i and ŷ_i its predicted label; N denotes the number of samples; f_a denotes the identity or clothing feature of an anchor sample extracted by network E or M; f_p denotes the feature of a positive sample with the same identity as the anchor; f_n denotes the feature of a negative sample with a different identity; and α denotes the margin by which positive and negative samples are expected to be separated.
3) In the training stage, networks E and M are trained so that the identity features extracted by E and M are pulled closer together. The distance between the identity features extracted by E and M is measured with an MSE (Mean Square Error) metric function, and E and M are trained to reduce this distance:

L_MSE = MSE(f_ID^E(I), f_ID^M(I_c))

where I_c denotes the deeply reloaded picture and I denotes the original picture.
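A minimal sketch of this MSE pulling term follows; the function name `mse_pull_loss` is a hypothetical label introduced here for illustration:

```python
import numpy as np

def mse_pull_loss(f_id_e, f_id_m):
    """MSE between the identity feature of the original picture (branch E)
    and that of its deeply reloaded copy (branch M); minimizing it pulls
    the two identity embeddings together."""
    return np.mean((f_id_e - f_id_m) ** 2)

# identical identity features -> zero loss
f = np.array([0.2, -0.5, 1.0])
assert mse_pull_loss(f, f.copy()) == 0.0
# diverging features are penalized
assert mse_pull_loss(f, f + 1.0) == 1.0
```

Since the two inputs depict the same pedestrian in the same pose but different clothing, driving this loss to zero encourages identity features that ignore clothing.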
4) In the actual testing stage, the deep-reloading feature extraction branch network M is not retained; only the original-picture feature extraction branch network E is used to extract identity information, which is used for similarity measurement and identity inference, improving the robustness of the method to clothes changing. The obtained results are sorted by similarity; an example matching result is shown in FIG. 5, and the entry with the highest similarity is the final matching result.
The above description is only exemplary of the present invention and should not be taken as limiting the invention, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A pedestrian re-identification method based on deep reloading, characterized by comprising the following steps:
1) re-dressing the pedestrians in the training-set pictures using a deep reloading model and preselected clothing templates, and storing the results as a supplement to the training set;
2) in the training stage, using an original-picture feature extraction branch network E and a deep-reloading feature extraction branch network M to respectively extract the identity features and clothing features of the original pictures and the deeply reloaded pictures, and training networks E and M so that the extracted features achieve a good classification effect;
3) in the training stage, training networks E and M so that the identity features extracted by E and M are pulled closer together;
4) in the testing stage, using only the original-picture feature extraction branch network E to extract identity information, and using the identity information for similarity measurement and identity inference, the entry with the highest similarity being the final matching result.
2. The pedestrian re-identification method based on deep reloading according to claim 1, wherein in step 2), the original-picture feature extraction branch network E and the deep-reloading feature extraction branch network M respectively extract the identity features and clothing features of the original picture and the deeply reloaded picture as follows: a picture is first input into a backbone network to extract a feature f_s, which is then separated into an identity feature and a clothing feature by an attention mechanism:

f_s = Backbone(I)
f_clo = Atten(f_s) * f_s
f_ID = (1 - Atten(f_s)) * f_s

where f_clo is the clothing information feature, f_ID is the identity information feature, f_s is the feature extracted by the backbone network from the pedestrian picture, I is the picture input, and Atten(f_s) is the attention map obtained by applying the attention mechanism to f_s.
3. The pedestrian re-identification method based on deep reloading according to claim 2, wherein in step 2), the clothing features and identity features separated by the attention mechanism are supervised with a classification loss:

L_E = CE(f_ID^E) + CE(f_clo^E)
L_M = CE(f_ID^M) + CE(f_clo^M)

where CE denotes a classification loss.
4. The pedestrian re-identification method based on deep reloading according to claim 1, wherein in step 3), networks E and M respectively extract the identity features and clothing features of the original picture and the deeply reloaded picture, and E and M are trained to minimize the distance between the identity features of the two branches:

L_MSE = MSE(f_ID^E(I), f_ID^M(I_c))

where I_c denotes the deeply reloaded picture and I denotes the original picture.
5. The pedestrian re-identification method based on deep reloading as claimed in claim 1, wherein the step 4) is specifically as follows: in the testing stage, the deep reloading feature extraction branch network M is not used, and a pedestrian picture is input into the original picture feature extraction branch network E to extract identity features for pedestrian identity inference.
6. The pedestrian re-identification method based on deep reloading according to claim 1, wherein in step 1), an off-the-shelf model such as PF-AFN is adopted as the deep reloading model.
7. The pedestrian re-identification method based on deep reloading according to claim 2, wherein in step 2), the backbone networks of networks E and M adopt the ResNet-50 network structure.
8. The pedestrian re-identification method based on deep reloading according to claim 2, wherein in step 2), the attention mechanism consists of channel attention and spatial attention.
9. The pedestrian re-identification method based on deep reloading according to claim 3, wherein in step 2), the classification loss adopts a cross-entropy-based classification loss together with a triplet loss.
10. The pedestrian re-identification method based on deep reloading according to claim 4, wherein in step 3), the distance between the identity features extracted by networks E and M is measured using an MSE metric function.
CN202111174153.4A 2021-10-09 2021-10-09 Pedestrian re-identification method based on deep reloading Pending CN113887448A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111174153.4A CN113887448A (en) 2021-10-09 2021-10-09 Pedestrian re-identification method based on deep reloading

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111174153.4A CN113887448A (en) 2021-10-09 2021-10-09 Pedestrian re-identification method based on deep reloading

Publications (1)

Publication Number Publication Date
CN113887448A true CN113887448A (en) 2022-01-04

Family

ID=79005618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111174153.4A Pending CN113887448A (en) 2021-10-09 2021-10-09 Pedestrian re-identification method based on deep reloading

Country Status (1)

Country Link
CN (1) CN113887448A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129473A (en) * 2023-04-17 2023-05-16 山东省人工智能研究院 Identity-guide-based combined learning clothing changing pedestrian re-identification method and system



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination