CN114187655B - Unsupervised pedestrian re-recognition method based on joint training strategy - Google Patents

Unsupervised pedestrian re-recognition method based on joint training strategy

Info

Publication number
CN114187655B
CN114187655B (application CN202111430274.0A)
Authority
CN
China
Prior art keywords
pedestrian
camera
features
centroid
feature
Prior art date
Legal status
Active
Application number
CN202111430274.0A
Other languages
Chinese (zh)
Other versions
CN114187655A (en)
Inventor
刘雨轩 (Liu Yuxuan)
葛宏伟 (Ge Hongwei)
孙亮 (Sun Liang)
候亚庆 (Hou Yaqing)
Current Assignee
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date
Filing date
Publication date
Application filed by Dalian University of Technology
Priority claimed from CN202111430274.0A
Publication of CN114187655A
Application granted
Publication of CN114187655B
Legal status: Active


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/22 — Matching criteria, e.g. proximity measures
    • G06F 18/23 — Clustering techniques
    • G06F 18/2321 — Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06N 3/08 — Neural networks; learning methods
    • G06T 7/66 — Analysis of geometric attributes of image moments or centre of gravity
    • Y02T 10/40 — Engine management systems (climate change mitigation technologies related to road transport)


Abstract

The invention belongs to the fields of artificial intelligence and pedestrian re-recognition, and discloses an unsupervised pedestrian re-recognition method based on a joint training strategy. To address the large inter-camera domain gap, a method for learning inter-camera invariance features is provided, so that the model learns to distinguish invariant features across different cameras. The method comprises the following steps: extracting pedestrian image features; clustering and assigning pseudo labels; calculating pedestrian centroids and camera centroids; mining edge features and inter-camera invariance features; updating pedestrian instance features and camera centroids; and updating the model parameters with a contrastive loss. The method effectively reduces label noise, narrows the inter-camera domain gap, and significantly improves pedestrian re-recognition accuracy; it can be widely applied in the field of pedestrian re-recognition.

Description

Unsupervised pedestrian re-recognition method based on joint training strategy
Technical Field
The invention belongs to the fields of artificial intelligence and pedestrian re-recognition, and particularly relates to an unsupervised pedestrian re-recognition method based on a joint training strategy.
Background
Pedestrian re-recognition matches pedestrian images, finding the images that depict the same person as a given query image. The technology plays a vital role in smart cities, intelligent security, and related fields, with applications such as tracking criminal suspects, searching for missing persons, and pedestrian-flow statistics.
In recent years, supervised pedestrian re-recognition has advanced greatly, but large-scale surveillance systems continuously generate monitoring data, and annotation is expensive, so dependence on large numbers of manual labels severely limits practical deployment. Unsupervised pedestrian re-recognition has therefore attracted increasing attention: it learns directly from unlabeled data, scales better, and has great application value in industry.
Current research on unsupervised pedestrian re-recognition generally falls into three categories: (1) unsupervised domain-adaptive methods, which align the feature distributions of a source domain and a target domain; (2) camera-aware methods, which teach the model to distinguish sample features under different cameras; and (3) clustering methods, which generate pseudo labels on the target domain for training, assigning the same pseudo label to similar images. The first category treats unsupervised pedestrian re-recognition as a transfer-learning task, typically using both source- and target-domain datasets and relying on labeled source-domain data to assist training. The latter two categories train the re-recognition model in a fully unsupervised manner. Compared with unsupervised domain-adaptive methods, fully unsupervised methods have greater application value: when the feature distributions of the source and target domains differ substantially, high-quality pseudo labels are hard to obtain, and the resulting label noise degrades performance; moreover, labeled samples are difficult to acquire in practice, which limits the applicability of domain-adaptive methods. Fully unsupervised methods train the deep model using only unlabeled images, so they are more practical and more widely applicable in industry. The present invention targets fully unsupervised pedestrian re-recognition and proposes an unsupervised pedestrian re-recognition method based on a joint training strategy.
Popular unsupervised pedestrian re-recognition methods in recent years mainly assign pseudo labels to unlabeled samples with a clustering algorithm, then update an instance feature memory, compute centroids, and finally optimize the model with a contrastive learning loss. Contrastive learning has shown good performance in this field. Ge et al. proposed a self-paced contrastive learning framework that dynamically updates a hybrid feature memory containing source- and target-domain dataset features and then performs contrastive learning (Yixiao Ge, Feng Zhu, Dapeng Chen, Rui Zhao, et al. Self-paced contrastive learning with hybrid memory for domain adaptive object re-id. In Advances in Neural Information Processing Systems, NeurIPS 2020: 11309–11321). Because a person's appearance varies from camera view to camera view with changes in viewpoint, lighting conditions, and background, pedestrians of the same identity are usually highly similar within one camera view but differ considerably in appearance across cameras; reducing the domain gap caused by cameras is therefore another research hotspot in unsupervised pedestrian re-recognition. Current research usually works at the training level to let the model learn inter-camera invariance features. Yang et al. proposed camera-aware meta learning to mitigate the negative effects of noisy samples and learn inter-camera invariance features (Fengxiang Yang, Zhun Zhong, Zhiming Luo, et al. Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021: 4855–4864). While these existing approaches are effective, they ignore two important factors.
(1) The influence of label noise. In each training iteration the instance features are updated continuously, which inevitably introduces label noise; accurately updating the instance features can therefore optimize their cluster distribution and reduce the influence of label noise. (2) Learning of inter-camera invariance features. Inter-camera invariance is characterized by the hardest-to-distinguish samples of the same identity under each camera, where the inter-camera domain gap is large. A clustering algorithm can rarely group such hard samples of the same identity from all cameras into one cluster, and because unsupervised pedestrian re-recognition lacks true labels, genuinely supervised learning is impossible, so the model cannot effectively learn inter-camera invariance features. The present invention aims to solve these two key problems of unsupervised pedestrian re-recognition.
For the first problem, this patent proposes a centralized momentum update strategy that optimizes the cluster distribution to reduce the influence of label noise.
For the second problem, the invention provides an inter-camera invariance feature learning method that lets the model learn to distinguish invariant feature samples under different cameras, thereby narrowing the inter-camera domain gap and improving the performance of unsupervised pedestrian re-recognition.
Disclosure of Invention
To solve the above technical problems, the invention provides an unsupervised pedestrian re-recognition method based on a joint training strategy. The method optimizes the global cluster distribution with a centralized momentum update strategy and, via inter-camera invariance feature learning, lets the model learn to distinguish invariant feature samples under different cameras, thereby reducing the influence of label noise and the inter-camera domain gap and improving the performance of unsupervised pedestrian re-recognition.
An unsupervised pedestrian re-recognition method based on a joint training strategy comprises the following steps:
Step 1: dividing the pedestrian images into a training set and a test set;
Step 2: extracting the pedestrian features of the training set using a CNN loaded with a pre-trained model;
Step 3: calculating the similarity between pedestrian features, clustering on the feature similarities with a density clustering algorithm, and generating pseudo labels;
Step 4: removing outlier features, and constructing a new pedestrian training set from the pseudo labels and the corresponding pedestrian features;
Step 5: initializing the pedestrian centroids and camera centroids from the pedestrian training set constructed in step 4: the arithmetic mean of the pedestrian features sharing a pseudo label is taken as the pedestrian centroid, and the arithmetic mean of the pedestrian features sharing a camera ID as the camera centroid;
Step 6: extracting the pedestrian features of the training set constructed in step 4 with the ResNet network, then executing the centralized momentum update strategy, whose specific content is: calculating the similarity between the pedestrian features sharing a pseudo label and the corresponding pedestrian centroid obtained in step 5, and taking the feature with the minimum similarity as the edge feature; updating all pedestrian instance features under the same pseudo label with this edge feature. Next, performing inter-camera invariance feature learning, whose specific content is: calculating the similarity between the pedestrian features sharing a camera ID and the corresponding camera centroid obtained in step 5, taking the feature with the maximum similarity as the camera invariance feature, and using it to update all pedestrian instance features under the same pseudo label and the camera centroid of the same camera ID; finally, recomputing the pedestrian centroids from the updated instance features;
Step 7: drawing pedestrian query samples from the training set constructed in step 4, computing the contrastive learning loss between these samples and the pedestrian centroids obtained in step 6, and updating the model parameters;
Step 8: inputting the test-set images into the best CNN model trained in step 7 to extract their pedestrian features, then computing the feature distances between the query-set and test-set pedestrian images to obtain the unsupervised pedestrian re-recognition result.
Further, the specific process of step 2 is as follows: the selected CNN is a ResNet loaded with the ImageNet pre-trained model, with its last classification layer deleted. All images of the training set are input into the ResNet; denoting the pedestrian feature extracted from the i-th image by φ(x_i), a feature space V = {φ(x_1), φ(x_2), ..., φ(x_n)} is formed.
Further, the specific process of step 3 is as follows:
The similarity between pedestrian features is calculated using the Jaccard similarity formula:

J(s_i, s_j) = |s_i ∩ s_j| / |s_i ∪ s_j|

where s_i and s_j represent the i-th and j-th pedestrian features respectively, J(s_i, s_j) represents the similarity between the pedestrian features s_i and s_j, ∩ represents the intersection, and ∪ represents the union.
After the similarity between the pedestrian features is obtained, density clustering assigns a pseudo label Y = {y_1, y_2, ..., y_M, y_{M+1}, ..., y_N} to the pedestrian features X = {x_1, x_2, ..., x_M, x_{M+1}, ..., x_N}.
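A toy sketch of the clustering of step 3, including the outlier removal that step 4 performs, with two assumptions flagged: DBSCAN stands in for the unnamed density clustering algorithm, and a weighted Jaccard over raw non-negative features replaces the k-reciprocal-neighbour Jaccard that re-ID pipelines usually compute; all parameter values (`eps`, `min_samples`) are illustrative:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def jaccard_sim(a, b):
    # J(s_i, s_j) = |s_i ∩ s_j| / |s_i ∪ s_j|, realised here as a
    # weighted Jaccard over non-negative feature vectors (an assumption)
    inter = np.minimum(a, b).sum()
    union = np.maximum(a, b).sum()
    return inter / union if union > 0 else 0.0

rng = np.random.default_rng(0)
feats = np.vstack([
    rng.random((4, 16)) * 0.1 + 5,   # identity A: 4 tight samples
    rng.random((4, 16)) * 0.1 + 1,   # identity B: 4 tight samples
    rng.random((1, 16)) * 0.1 + 20,  # an isolated outlier sample
])

n = len(feats)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        dist[i, j] = 1.0 - jaccard_sim(feats[i], feats[j])

# density clustering on the Jaccard distance; label -1 marks outliers,
# which step 4 removes before building the pseudo-labelled training set
labels = DBSCAN(eps=0.3, min_samples=2, metric="precomputed").fit_predict(dist)
keep = labels != -1
pseudo_labels, train_feats = labels[keep], feats[keep]
```

Samples of the two identities receive pseudo labels 0 and 1, while the isolated sample is labelled -1 and dropped, mirroring the construction of the outlier-free training set.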
Further, the specific process of step 4 is as follows:
Step 3 yields the pseudo labels Y = {y_1, ..., y_M, y_{M+1}, ..., y_N} and the corresponding pedestrian features X = {x_1, ..., x_M, x_{M+1}, ..., x_N} of the training set, where {y_{M+1}, ..., y_N} are the outlier pseudo labels produced by clustering and {x_{M+1}, ..., x_N} the corresponding outlier features. The training set with outliers removed is X = {x_1, x_2, ..., x_M}, with pseudo labels Y = {y_1, y_2, ..., y_M}.
Further, the specific process of step 5 is as follows:
The training set X = {x_1, x_2, ..., x_M} with outliers removed and its pseudo labels Y = {y_1, y_2, ..., y_M} are obtained through step 4. The pedestrian features X_i = {x_1, x_2, ..., x_n} of the i-th pseudo label are extracted, and the mean of this feature set is calculated as the i-th pedestrian centroid:

V_i = (1 / |α_i|) Σ_{x_n^i ∈ α_i} x_n^i

where x_n^i is a d-dimensional vector, the n-th pedestrian instance feature in the i-th cluster set; α_i represents all pedestrian instance features under the i-th cluster set, |·| represents the number of pedestrian instance features under the cluster set, and V_i is the centroid of the i-th pedestrian class.
The pedestrian features Y_k = {y_1, y_2, ..., y_m} under the k-th camera are extracted, and the mean of this feature set is calculated as the k-th camera centroid:

C_k = (1 / |β_k|) Σ_{y_n^k ∈ β_k} y_n^k

where y_n^k is a d-dimensional vector, the n-th instance in the k-th camera set; β_k represents all instance features under the k-th camera set, |·| represents their number, and C_k is the k-th camera centroid.
Further, the specific process of step 6 is as follows:
The pedestrian centroids V and camera centroids C are obtained through step 5, and the outlier-free pedestrian features X through step 4. The pedestrian features X are input into the ResNet network for feature extraction, and the centralized momentum update strategy is executed: first, the similarity between each extracted feature and the pedestrian centroid of its pseudo label is computed, and the edge feature with the minimum similarity is selected to update all pedestrian instance features under the same pseudo label. Then inter-camera invariance feature learning is performed: the similarity between the pedestrian features under each camera ID and the camera centroid is computed, the pedestrian feature with the maximum similarity is selected as the camera invariance feature, and it is used to update all pedestrian instance features under the same pseudo label and the camera centroid of the same camera ID. Finally, the pedestrian centroids are recomputed from the updated instance features. The update formulas are:

x_n^i ← m·x_n^i + (1 - m)·q_i

C_k ← m_c·C_k + (1 - m_c)·p_k (6)

V_i = (1 / |α_i|) Σ_{x_n^i ∈ α_i} x_n^i

where V_i is the i-th pedestrian centroid and C_k is the k-th camera centroid; m is the momentum update parameter and m_c is the camera momentum update parameter; α_i represents all pedestrian instance features under the i-th cluster set and β_k all instance features under the k-th camera set; p_k denotes the invariance feature of the k-th camera set and q_i the edge feature of the i-th cluster set; x_n^i is a d-dimensional vector, the n-th pedestrian instance feature in the i-th cluster set.
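A numpy sketch of the two updates in step 6. The cosine similarity, the momentum values m = 0.2 and m_c = 0.5, and all helper names are assumptions for illustration, and the update of instance features by the camera invariance feature p_k is omitted to keep the sketch short:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def centralized_momentum_update(feats, labels, V, m=0.2):
    # per pseudo label: the instance least similar to the pedestrian
    # centroid is the edge feature q_i; x_n^i <- m*x_n^i + (1-m)*q_i
    for i in np.unique(labels):
        idx = np.where(labels == i)[0]
        sims = [cosine(feats[j], V[i]) for j in idx]
        q_i = feats[idx[int(np.argmin(sims))]].copy()
        for j in idx:
            feats[j] = m * feats[j] + (1 - m) * q_i
    return feats

def camera_invariance_update(feats, cams, C, m_c=0.5):
    # per camera: the instance most similar to the camera centroid is the
    # camera invariance feature p_k; C_k <- m_c*C_k + (1-m_c)*p_k
    for k in np.unique(cams):
        idx = np.where(cams == k)[0]
        sims = [cosine(feats[j], C[k]) for j in idx]
        p_k = feats[idx[int(np.argmax(sims))]]
        C[k] = m_c * C[k] + (1 - m_c) * p_k
    return C

feats = np.array([[2.0, 0.0], [0.6, 0.8]])   # two instances, one pseudo label
labels = np.array([0, 0])
V = {0: feats.mean(axis=0)}                   # pedestrian centroid [1.3, 0.4]
feats = centralized_momentum_update(feats, labels, V)

cams = np.array([0, 0])
C = {0: np.array([1.3, 0.4])}
C = camera_invariance_update(feats, cams, C)
```

After the update every instance under the pseudo label has been pulled toward the edge feature [0.6, 0.8], and the camera centroid has moved toward the most camera-consistent instance.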
Further, the specific process of step 7 is as follows:
The contrastive learning loss is computed between the pedestrian centroids obtained in step 6 and the pedestrian query samples drawn from the training set X constructed in step 4:

L = −log( exp(f·V_+ / τ) / Σ_{i=1}^{K} exp(f·V_i / τ) )

where τ is a temperature hyper-parameter, f is a pedestrian query sample, V_+ is the positive-sample pedestrian centroid, and K is the number of cluster categories. The objective of optimizing the model parameters is to increase the similarity between a pedestrian query instance and the pedestrian centroid of its own pseudo label, while decreasing its similarity to the centroids of other pseudo labels.
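The contrastive loss of step 7 is an InfoNCE-style softmax over the K pedestrian centroids; a minimal numpy sketch with toy centroids and an assumed τ = 0.05:

```python
import numpy as np

def centroid_contrastive_loss(f, centroids, pos, tau=0.05):
    # L = -log( exp(f·V+ / τ) / Σ_i exp(f·V_i / τ) )
    logits = centroids @ f / tau
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[pos])

centroids = np.array([[1.0, 0.0], [0.0, 1.0]])  # K = 2 pedestrian centroids
f = np.array([0.9, 0.1])                        # query with pseudo label 0
loss_own   = centroid_contrastive_loss(f, centroids, pos=0)
loss_other = centroid_contrastive_loss(f, centroids, pos=1)
```

The loss is small when the query is close to its own centroid and large otherwise, which is exactly the stated optimization direction.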
The beneficial effects of the invention are as follows: the invention adopts the pedestrian centroid to mine the edge characteristics and uses the characteristics to update the pedestrian example characteristics, so that the influence of the label noise on the cluster distribution can be reduced. According to the invention, the camera centroid is adopted to mine the invariance characteristics among the cameras, and the characteristics are utilized to update all pedestrian example characteristics under the same pseudo labels, so that the inter-camera domain gaps distributed in a clustering way can be reduced, and the capability of the model for distinguishing the same pedestrian identity under different cameras is improved. Finally, the model parameters are optimized through comparison and learning loss, and the accuracy of pedestrian re-identification is effectively improved.
Drawings
FIG. 1 is a flow chart of an unsupervised pedestrian re-recognition method based on a joint training strategy of the present invention;
FIG. 2 is a flowchart of training steps according to an embodiment of the present invention;
FIG. 3 is a flow chart of testing steps according to an embodiment of the invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific examples, which include, but are not limited to, the following examples.
As shown in fig. 1, the invention provides an unsupervised pedestrian re-recognition method based on a joint training strategy, which comprises the following specific implementation processes:
1. Pedestrian image feature extraction
As shown in fig. 2, the training-set features are extracted with a CNN: the selected CNN is a ResNet loaded with the ImageNet pre-trained model, with the last classification layer deleted; denoting the pedestrian feature of the i-th image by φ(x_i), a feature space V = {φ(x_1), φ(x_2), ..., φ(x_n)} is formed.
2. Cluster allocation pseudo tag
The similarity between the pedestrian features extracted in step 2 is calculated with the Jaccard similarity formula:

J(s_i, s_j) = |s_i ∩ s_j| / |s_i ∪ s_j|

where s_i and s_j represent the i-th and j-th pedestrian features, J(s_i, s_j) represents the similarity between the pedestrian features s_i and s_j, ∩ represents the intersection, and ∪ represents the union.
After the similarity between the pedestrian features is obtained, density clustering assigns a pseudo label Y = {y_1, ..., y_M, y_{M+1}, ..., y_N} to the pedestrian features X = {x_1, ..., x_M, x_{M+1}, ..., x_N}.
3. Generating a training set of pseudo tags
Step 3 yields the pseudo labels Y = {y_1, ..., y_M, y_{M+1}, ..., y_N} and the corresponding pedestrian features X = {x_1, ..., x_M, x_{M+1}, ..., x_N}; the training set with outliers removed is X = {x_1, x_2, ..., x_M}, with pseudo labels Y = {y_1, y_2, ..., y_M}.
4. Centroid initialization and updating of camera centroids and pedestrian instance features
The pedestrian training set X = {x_1, x_2, ..., x_M} constructed in step 4 is taken; the pedestrian features X_i = {x_1, x_2, ..., x_n} under the i-th pseudo label are extracted and the pedestrian centroids V = {V_1, V_2, ..., V_n} are initialized; the pedestrian features Y_k = {y_1, y_2, ..., y_m} under the k-th camera are extracted and the camera centroids C = {C_1, C_2, ..., C_n} are initialized. The centroid formulas are:

V_i = (1 / |α_i|) Σ_{x_n^i ∈ α_i} x_n^i

C_k = (1 / |β_k|) Σ_{y_n^k ∈ β_k} y_n^k

where x_n^i is a d-dimensional vector, the n-th pedestrian instance feature in the i-th cluster set; α_i represents all pedestrian instance features under the i-th cluster set and |·| their number; y_n^k is a d-dimensional vector, the n-th instance in the k-th camera set; β_k represents all instance features under the k-th camera set; V_i is the i-th pedestrian centroid and C_k is the k-th camera centroid.
The pedestrian features obtained in step 4 are input into the ResNet network for feature extraction. The similarity between the pedestrian centroid V obtained in step 5 and the pedestrian features under the same pseudo label is computed; the feature with the minimum similarity is taken as the edge feature and used to update all pedestrian instance features under that pseudo label. Likewise, the similarity between the pedestrian features under each camera ID and the camera centroid is computed; the feature with the maximum similarity is taken as the camera invariance feature and used to update all pedestrian instance features under the same pseudo label and the camera centroid of the same camera ID. Finally, the pedestrian centroids are recomputed from the updated instance features:

x_n^i ← m·x_n^i + (1 - m)·q_i

C_k ← m_c·C_k + (1 - m_c)·p_k (14)

V_i = (1 / |α_i|) Σ_{x_n^i ∈ α_i} x_n^i

where V_i is the i-th pedestrian centroid and C_k is the k-th camera centroid; m is the momentum update parameter and m_c is the camera momentum update parameter; α_i represents all pedestrian instance features under the i-th cluster set and β_k all instance features under the k-th camera set; p_k denotes the invariance feature of the k-th camera set and q_i the edge feature of the i-th cluster set; x_n^i is a d-dimensional vector, the n-th pedestrian instance feature in the i-th cluster set.
5. Contrast loss training network
The contrastive learning loss is computed between the pedestrian centroids updated in step 6 and the pedestrian query samples drawn from the training set X constructed in step 4:

L = −log( exp(f·V_+ / τ) / Σ_{i=1}^{K} exp(f·V_i / τ) )

where τ is the temperature hyper-parameter, f is a pedestrian query sample, V_+ is the positive-sample pedestrian centroid, and K is the number of cluster categories. The loss function optimizes the model parameters so as to increase the similarity between a pedestrian query instance and the pedestrian centroid of its own pseudo label, while decreasing its similarity to the centroids of other pseudo labels.
6. Test set pedestrian retrieval
As shown in fig. 3, the test-set pedestrian images are input into the best ResNet obtained by training in step 7 for feature extraction, and the unsupervised pedestrian re-recognition result is obtained by computing the distances between the pedestrian features of the query set and the test set. For example, the Euclidean distance measures the distance between pedestrian features: the smaller the Euclidean distance between two features, the more similar the two pedestrian images and the higher the probability that they belong to the same pedestrian; the final ranking gives the unsupervised pedestrian re-recognition result.
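The retrieval described above can be sketched as a nearest-neighbour ranking by Euclidean distance (toy 2-D features; names are illustrative):

```python
import numpy as np

def retrieve(query, gallery, top_k=3):
    # rank gallery features by Euclidean distance to the query;
    # smaller distance => higher probability of the same pedestrian identity
    d = np.linalg.norm(gallery - query, axis=1)
    order = np.argsort(d)[:top_k]
    return order, d[order]

gallery = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
query = np.array([0.9, 1.1])
ranks, dists = retrieve(query, gallery)   # gallery index 1 ranks first
```

The returned order is the re-recognition ranking: the first index is the gallery image most likely to show the same pedestrian as the query.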
In summary, the invention discloses an unsupervised pedestrian re-recognition method based on a joint training strategy. The camera centroids and pedestrian centroids jointly mine edge features and inter-camera invariance samples, the mined features update the pedestrian instance features, and the model parameters are then optimized by contrastive learning. This reduces the influence of label noise and inter-camera domain gaps on the cluster distribution, lets the model learn to distinguish invariant feature samples across cameras, and improves the performance of unsupervised pedestrian re-recognition.
First, the pedestrian centroids are used to mine the features at the cluster edges, and these features update the pedestrian instance features, reducing label noise and improving the clustering.
Second, the camera centroids are used to mine inter-camera invariance features, which update all pedestrian instance features under the same pseudo label. The model thus learns the distribution of inter-camera invariance features, narrowing the inter-camera domain gap.
Finally, jointly updating the pedestrian instance features with the edge features and the inter-camera invariance features gradually reaches an optimal global cluster distribution, and contrastive learning lets the model learn more robust features, effectively improving the accuracy of pedestrian re-recognition.

Claims (2)

1. An unsupervised pedestrian re-identification method based on a joint training strategy is characterized by comprising the following steps:
Step 1: dividing the pedestrian image into a training set and a testing set;
Step 2: extracting pedestrian characteristics of the training set by using a CNN network loaded with a pre-training model;
The selected CNN is a ResNet loaded with the ImageNet pre-trained model, with the last classification layer deleted; all images of the training set are input into the ResNet, and, with the pedestrian feature of the i-th image denoted φ(x_i), a feature space V = {φ(x_1), φ(x_2), ..., φ(x_n)} is formed;
Step 3: calculating the similarity between pedestrian features by using a Jaccard similarity formula, clustering the similarity between features by using a density clustering algorithm, and generating a pseudo tag;
the similarity between pedestrian features is calculated using the Jaccard similarity formula:

J(s_i, s_j) = |s_i ∩ s_j| / |s_i ∪ s_j|

wherein s_i and s_j represent the i-th and j-th pedestrian features respectively, J(s_i, s_j) represents the similarity between the pedestrian features s_i and s_j, ∩ represents the intersection, and ∪ represents the union;
after the similarity between the pedestrian features is obtained, density clustering assigns a pseudo label Y = {y_1, ..., y_M, y_{M+1}, ..., y_N} to the pedestrian features X = {x_1, ..., x_M, x_{M+1}, ..., x_N};
step 4: removing outlier features, and constructing a new pedestrian training set by using the pseudo tag and the corresponding pedestrian features;
the pseudo labels Y = {y_1, ..., y_M, y_{M+1}, ..., y_N} and the corresponding pedestrian features X = {x_1, ..., x_M, x_{M+1}, ..., x_N} of the training set are obtained through step 3, wherein {y_{M+1}, ..., y_N} are the outlier pseudo labels generated by clustering and {x_{M+1}, ..., x_N} the corresponding outlier features; the training set with outliers removed is X = {x_1, x_2, ..., x_M}, with pseudo labels Y = {y_1, y_2, ..., y_M};
Step 5: using the pedestrian training set constructed in step 4 to initialize the pedestrian centroids and camera centroids: the arithmetic mean of the pedestrian features sharing the same pseudo label is taken as the pedestrian centroid, and the arithmetic mean of the pedestrian features sharing the same camera ID is taken as the camera centroid;
The training set X = {x_1, x_2, ..., x_M} and pseudo labels Y = {y_1, y_2, ..., y_M} with outliers removed are obtained in step 4; the pedestrian features X_i = {x_1, x_2, ..., x_n} of the i-th pseudo label are extracted, and the mean of this feature set is computed as the i-th pedestrian centroid:

V_i = (1 / |α_i|) Σ_{f_n^i ∈ α_i} f_n^i

where f_n^i is a d-dimensional vector, the n-th pedestrian instance feature in the i-th cluster set; α_i denotes all pedestrian instance features in the i-th cluster set, |·| denotes the number of pedestrian instance features in the cluster set, and V_i is the centroid of the i-th pedestrian class;
The pedestrian features Y_k = {y_1, y_2, ..., y_m} under the k-th camera are extracted, and the mean of this feature set is computed as the k-th camera centroid:

C_k = (1 / |β_k|) Σ_{f_n^k ∈ β_k} f_n^k

where f_n^k is a d-dimensional vector, the n-th instance in the k-th camera set; β_k denotes all camera instance features in the k-th camera set, |·| denotes the number of camera instance features under the camera, and C_k is the k-th camera centroid;
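The two centroid initializations of step 5 are the same arithmetic-mean operation, keyed once by pseudo label and once by camera ID. A minimal sketch with hypothetical features and IDs:

```python
import numpy as np

def init_centroids(features, labels):
    """Arithmetic mean of the features sharing each label (V_i and C_k)."""
    return {l: features[labels == l].mean(axis=0) for l in np.unique(labels)}

feats  = np.array([[1., 0.], [3., 0.], [0., 2.], [0., 4.]])
pseudo = np.array([0, 0, 1, 1])   # pseudo labels -> pedestrian centroids V
cams   = np.array([0, 1, 0, 1])   # camera IDs    -> camera centroids C
V = init_centroids(feats, pseudo)
C = init_centroids(feats, cams)
print(V[0])  # [2. 0.]
print(C[0])  # [0.5 1. ]
```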
Step 6: extracting the pedestrian features of the training set constructed in step 4 with the ResNet network, and then executing the centralized momentum update strategy, whose specific content is as follows: the similarity between each pedestrian feature and the pedestrian centroid of its pseudo label obtained in step 5 is calculated, and the pedestrian feature with the minimum similarity is taken as the edge feature; all pedestrian instance features under the same pseudo label are then updated with this edge feature; next, camera-invariance feature learning is performed, whose specific content is as follows: the similarity between each pedestrian feature and the camera centroid of its camera ID obtained in step 5 is calculated, the pedestrian feature with the maximum similarity is taken as the camera-invariance feature, and this feature is used to update all pedestrian instance features under the same pseudo label as well as the camera centroid of the same camera ID; finally, the pedestrian centroids are recomputed from the updated pedestrian instance features;
Step 7: extracting pedestrian query samples from the pedestrian training set constructed in step 4, computing the contrastive learning loss between these samples and the pedestrian centroids obtained in step 6, and updating the parameters of the model with this loss;
The contrastive learning loss is computed between the pedestrian centroids obtained in step 5 and the pedestrian query samples drawn from the training set X:

L = -log( exp(f · V_+ / τ) / Σ_{k=1}^{K} exp(f · V_k / τ) )

where τ is a temperature hyperparameter, f is a pedestrian query sample, V_+ is the positive-sample pedestrian centroid, and K is the number of cluster classes; optimizing the model parameters with this loss increases the similarity between a pedestrian query instance and the pedestrian centroid of the same pseudo label, and decreases its similarity to the pedestrian centroids of different pseudo labels;
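A minimal numpy sketch of this InfoNCE-style loss over cluster centroids (the query feature, centroids, and temperature below are hypothetical; a real implementation would typically use normalized features and an autograd framework):

```python
import numpy as np

def cluster_contrastive_loss(f, centroids, pos_idx, tau=0.05):
    """L = -log( exp(f·V+/tau) / sum_k exp(f·V_k/tau) )."""
    logits = centroids @ f / tau
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[pos_idx])

f = np.array([1.0, 0.0])                         # query sample feature
centroids = np.array([[1.0, 0.0],                # V_+ (same pseudo label)
                      [0.0, 1.0],
                      [-1.0, 0.0]])
loss = cluster_contrastive_loss(f, centroids, pos_idx=0, tau=0.5)
print(round(float(loss), 4))
```

The loss is minimized when the query is most similar to its own centroid, which is exactly the optimization objective stated above.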
Step 8: inputting the test set images into the optimal CNN model trained in step 7 to extract their pedestrian features; the feature distances between pedestrian images in the query set and the test set are then calculated to obtain the unsupervised pedestrian re-identification result.
2. The unsupervised pedestrian re-recognition method based on the joint training strategy according to claim 1, wherein in step 6, the pedestrian centroids V and camera centroids C are obtained in step 5, and the pedestrian features X with outliers removed are obtained in step 4; the pedestrian features X are input into the ResNet network for feature extraction, after which the centralized momentum update strategy is executed: first, the similarity between each extracted feature and the pedestrian centroid of its pseudo label is calculated, and the minimum-similarity edge feature is selected to update all pedestrian instance features under the same pseudo label; next, inter-camera invariance feature learning is performed: the similarity between the pedestrian features under the same camera ID and the camera centroid is calculated, the maximum-similarity pedestrian feature is selected as the camera-invariance feature, and this feature is used to update all pedestrian instance features under the same pseudo label and the camera centroid of the same camera ID; finally, the pedestrian centroids are recomputed from the updated pedestrian instance features:

V_i ← m·V_i + (1 - m)·f_edge^i  (5)

C_k ← m_c·C_k + (1 - m_c)·p_k  (6)

where V_i is the i-th class pedestrian centroid, C_k is the k-th camera centroid, m is the momentum update parameter, m_c is the camera momentum update parameter, α_i denotes all pedestrian instance features under the i-th cluster set, β_k denotes all camera instance features under the k-th camera set, p_k denotes the invariance feature of the k-th camera set, f_edge^i denotes the edge feature of the i-th cluster set, and f_n^i is a d-dimensional vector, the n-th pedestrian instance feature in the i-th cluster set.
CN202111430274.0A 2021-11-29 2021-11-29 Unsupervised pedestrian re-recognition method based on joint training strategy Active CN114187655B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111430274.0A CN114187655B (en) 2021-11-29 2021-11-29 Unsupervised pedestrian re-recognition method based on joint training strategy

Publications (2)

Publication Number Publication Date
CN114187655A CN114187655A (en) 2022-03-15
CN114187655B true CN114187655B (en) 2024-08-13

Family

ID=80602835

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476168A (en) * 2020-04-08 2020-07-31 山东师范大学 Cross-domain pedestrian re-identification method and system based on three stages
CN112836675A (en) * 2021-03-01 2021-05-25 中山大学 Unsupervised pedestrian re-identification method and system based on clustering-generated pseudo label

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906606B (en) * 2021-03-05 2024-04-02 南京航空航天大学 Domain self-adaptive pedestrian re-identification method based on mutual divergence learning
CN113065409A (en) * 2021-03-09 2021-07-02 北京工业大学 Unsupervised pedestrian re-identification method based on camera distribution difference alignment constraint

Similar Documents

Publication Publication Date Title
CN111126360B (en) Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model
CN108960140B (en) Pedestrian re-identification method based on multi-region feature extraction and fusion
CN111666851B (en) Cross domain self-adaptive pedestrian re-identification method based on multi-granularity label
Lin et al. RSCM: Region selection and concurrency model for multi-class weather recognition
CN109117793B (en) Direct-push type radar high-resolution range profile identification method based on deep migration learning
CN112906606B (en) Domain self-adaptive pedestrian re-identification method based on mutual divergence learning
CN110728694B (en) Long-time visual target tracking method based on continuous learning
CN111274958B (en) Pedestrian re-identification method and system with network parameter self-correction function
CN112434599B (en) Pedestrian re-identification method based on random occlusion recovery of noise channel
CN116910571B (en) Open-domain adaptation method and system based on prototype comparison learning
CN112818175B (en) Factory staff searching method and training method of staff identification model
CN113920472A (en) Unsupervised target re-identification method and system based on attention mechanism
CN112819065A (en) Unsupervised pedestrian sample mining method and unsupervised pedestrian sample mining system based on multi-clustering information
CN108345866B (en) Pedestrian re-identification method based on deep feature learning
CN111967325A (en) Unsupervised cross-domain pedestrian re-identification method based on incremental optimization
Wang et al. Multiple pedestrian tracking with graph attention map on urban road scene
CN114092873B (en) Long-term cross-camera target association method and system based on appearance and morphological decoupling
CN112115780A (en) Semi-supervised pedestrian re-identification method based on deep multi-model cooperation
CN115205570A (en) Unsupervised cross-domain target re-identification method based on comparative learning
CN113657267B (en) Semi-supervised pedestrian re-identification method and device
Wang et al. Online visual place recognition via saliency re-identification
CN111695531B (en) Cross-domain pedestrian re-identification method based on heterogeneous convolution network
CN115527269B (en) Intelligent human body posture image recognition method and system
CN114187655B (en) Unsupervised pedestrian re-recognition method based on joint training strategy
CN112052722A (en) Pedestrian identity re-identification method and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant