CN114898406A - Unsupervised pedestrian re-identification method based on contrast clustering - Google Patents

Unsupervised pedestrian re-identification method based on contrast clustering

Info

Publication number
CN114898406A
CN114898406A
Authority
CN
China
Prior art keywords
clustering
feature
cluster
storage unit
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210664167.2A
Other languages
Chinese (zh)
Inventor
张远辉
冯化涛
刘康
朱俊江
付铎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Jiliang University
Original Assignee
China Jiliang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Jiliang University filed Critical China Jiliang University
Priority to CN202210664167.2A priority Critical patent/CN114898406A/en
Publication of CN114898406A publication Critical patent/CN114898406A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks


Abstract

The invention discloses an unsupervised pedestrian re-identification method based on contrast clustering, which comprises the following steps: perform forward computation on an unlabeled pedestrian image dataset with an initial feature encoder and initialize a feature storage unit with the encoded features; before each training epoch, cluster the features in the feature storage unit and screen the clustering results with a cluster independence criterion and a cluster compactness criterion; encode each mini-batch of training samples and update the network by back-propagating a unified contrastive loss; dynamically update the instance features in the feature storage unit with the encoded features in a momentum manner; and repeat the updates of the feature encoder and the feature storage unit for a preset number of training epochs until the pedestrian re-identification network converges. By means of contrast clustering, the method fully mines the information available in un-clustered outliers and improves the recognition accuracy of the unsupervised pedestrian re-identification model.

Description

Unsupervised pedestrian re-identification method based on contrast clustering
Technical Field
The invention relates to the field of computer vision and pedestrian re-identification, in particular to an unsupervised pedestrian re-identification method based on contrast clustering.
Background
Pedestrian re-identification, also known as person re-identification, is regarded as a sub-problem of image retrieval whose goal is to retrieve a specific pedestrian across multiple non-overlapping surveillance camera views. Pedestrian re-identification compensates for the limited field of view of a single fixed camera and, combined with pedestrian detection and tracking, is widely applied in security fields such as intelligent surveillance and video tracking. With the development of deep learning and the release of large-scale datasets, the performance of supervised pedestrian re-identification methods has improved greatly; however, algorithms based on supervised learning depend heavily on manually annotated ground-truth labels, which hinders the further development of the technology. On the other hand, large amounts of unlabeled pedestrian image data can easily be collected in practice, so studying how to train a more robust pedestrian re-identification model from large-scale unlabeled pedestrian images is of great research value. Therefore, an unsupervised pedestrian re-identification method that requires no annotation is proposed to solve the above problems.
Unsupervised pedestrian re-identification methods mainly include pseudo-label-based methods and image-generation-based methods, among which clustering-based pseudo-label methods have proven more effective and currently hold state-of-the-art accuracy. Most clustering-based pseudo-label methods train in two steps: first, encode the pedestrian images with an initial feature encoder; second, cluster the encoded features to obtain pseudo labels that supervise the training of the network. Although the quality of the pseudo labels can improve gradually as the model is optimized, training is often disturbed by unavoidable pseudo-label noise, and when the initial pseudo labels are very noisy the model runs a high risk of collapse. In addition, clustering-based pseudo-label methods usually do not use all of the unlabeled training data: density-based clustering algorithms inherently produce clustering outliers, which are generally discarded because no pseudo label can be assigned to them, so they are not used for model training. However, such clustering outliers are often exactly the hard training samples worth mining in a pedestrian dataset. Especially in the early stage of training, a large number of clustering outliers exist; simply discarding them greatly reduces the number of training samples and severely harms the performance of the model.
In recent years, contrastive learning has been widely applied to unsupervised representation learning. Under an unsupervised setting it enables a model to fully learn the similarity between samples of the same class and the differences between classes, treats each unlabeled sample as its own class, and learns discriminative instance-level representations by optimizing a contrastive loss. However, most existing contrastive losses operate at the sample-instance level and have difficulty correctly modeling intra-class relationships in a pedestrian image dataset.
Disclosure of Invention
To solve the above problems, the invention provides an unsupervised pedestrian re-identification method based on contrast clustering, which fully mines the hard training samples in the target-domain dataset and effectively models the intra-class relationships of pedestrians by performing joint contrastive learning on cluster centroids and un-clustered outliers.
Specifically, the technical solution of the unsupervised pedestrian re-identification method based on contrast clustering provided by the invention comprises the following steps:
Step 1: perform forward computation on the pedestrian images in the unlabeled training dataset with an initial feature encoder f_θ, and initialize a class-prototype-based feature storage unit with the encoded features;
Step 2: before each training epoch, cluster the encoded features in the feature storage unit with the DBSCAN clustering algorithm, and screen the clustering results according to a clustering reliability evaluation criterion;
Step 3: for each mini-batch of training samples, encode them with the encoder f_θ to obtain the mini-batch features f, compute the unified contrastive loss between f and the features in the feature storage unit, and update the network by back-propagation;
Step 4: during each training iteration, dynamically update the feature storage unit in a momentum manner with the encoded features obtained from the forward computation of the mini-batch training samples;
Step 5: repeat steps 2 to 4 for a preset number of training epochs until the pedestrian re-identification model converges.
Further, the initialization of the feature encoder and the feature storage unit in step 1 is as follows:
a ResNet-50 deep neural network is used as the feature encoder f_θ and is initialized with weights pre-trained on the ImageNet image dataset;
the feature encoder f_θ performs forward computation on the samples in the pedestrian image dataset to extract features, producing a feature set {v_1, …, v_n}, where n denotes the number of samples in the pedestrian image dataset; all features in the feature set are stored in the feature storage unit on a per-instance basis, so that the class prototypes in the feature storage unit can be continuously updated while the clusters and un-clustered outliers keep changing.
Further, the clustering and screening of the features in step 2 proceeds as follows:
first, the DBSCAN clustering algorithm is used to cluster the feature set {v_1, …, v_n} in the feature storage unit of step 1, and the class prototypes in the feature storage unit are further divided into cluster centroids {c_k | k = 1, …, n_c} and un-clustered outlier instances {v_k | k = 1, …, n_o}, where n_c denotes the number of clusters and n_o denotes the number of un-clustered outliers; the clustering results are screened according to the cluster independence and cluster compactness criteria, and the retrieval results are re-ordered with the k-reciprocal nearest-neighbor algorithm.
Both the clusters and the un-clustered instances in the feature storage unit are treated as equal, independent classes, so clustering reliability is crucial to the training; at the beginning of training the network has poor discriminative ability and the clustering noise is large, so a self-paced learning strategy is proposed to improve the clustering. Specifically, clustering is performed again before each training epoch starts; beginning from the most reliable clusters, reliable clusters are retained and the features in unreliable clusters are dissolved back into un-clustered outlier instances, so that the number of clusters grows gradually; the clustering criterion is alternately loosened and tightened by adjusting the ε-neighborhood distance threshold of the samples in the DBSCAN clustering algorithm to obtain more reliable clustering results.
The cluster independence criterion measures the inter-class distance and is expressed as the intersection-over-union between the feature set of a cluster and the corresponding feature set after the clustering criterion is loosened:

R_indep(f_i) = |I(f_i) ∩ I_loose(f_i)| / |I(f_i) ∪ I_loose(f_i)|

where |·| denotes the number of features in a set, I(f_i) denotes the set of samples in the same cluster as f_i, I_loose(f_i) denotes the set of samples in the same cluster as f_i after the clustering criterion is loosened, and R_indep(f_i) is the independence score of the cluster I(f_i);
the cluster compactness criterion measures the intra-class distance and is expressed as the intersection-over-union between the feature set of a cluster and the corresponding feature set after the clustering criterion is tightened:

R_comp(f_i) = |I(f_i) ∩ I_tight(f_i)| / |I(f_i) ∪ I_tight(f_i)|

where I_tight(f_i) denotes the set of samples in the same cluster as f_i after the clustering criterion is tightened, and R_comp(f_i) is the compactness score of the cluster I(f_i);
the clustering reliability evaluation criterion thus measures both the independence between clusters and the compactness between samples; its premise is that a reliable cluster should remain stable in a multi-scale clustering environment. Hyper-parameters α, β ∈ [0, 1] denote the independence and compactness thresholds: samples satisfying inter-class independence R_indep(f_i) > α and intra-class compactness R_comp(f_i) > β are kept in their clusters, and the remaining samples are divided into un-clustered outliers.
Further, the unified contrastive loss function in step 3 is constructed as follows:
given the unlabeled training samples X = {x_1, …, x_n}, all features are stored in the feature storage unit after being encoded by the feature encoder, and the self-paced learning strategy of step 2 divides the feature set into clustered features and un-clustered outlier features; the whole training dataset is thereby divided into a sample set with clustering pseudo labels X_c and a set of outlier instance samples X_o that do not belong to any cluster, with X = X_c ∪ X_o.
Given a training sample x ∈ X, each training sample is forward-computed by the feature encoder to obtain the encoded feature f, and the unified contrastive loss function is constructed as:

L_f = −log( exp(⟨f, z_+⟩/τ) / ( Σ_{k=1}^{n_c} exp(⟨f, c_k⟩/τ) + Σ_{k=1}^{n_o} exp(⟨f, v_k⟩/τ) ) )

where z_+ is the positive class prototype of the feature f, τ is the temperature coefficient, ⟨·,·⟩ denotes the vector inner product, c_k is the centroid of cluster k and represents the class prototype within that cluster, and v_k is the instance feature of the k-th un-clustered outlier and represents the class prototype of that outlier.
If f belongs to cluster k, then z_+ = c_k, the centroid of cluster k; if f belongs to an un-clustered outlier, then z_+ = v_k, the corresponding outlier instance feature. The above contrastive loss pulls the encoded feature toward its true class: after the features of a mini-batch are encoded, they are compared with the two kinds of class prototypes, so that each training sample is drawn close to the class it belongs to and pushed away from the other classes.
Further, the momentum update of the feature storage unit in step 4 proceeds as follows:
first, features are stored for all training samples on a per-instance basis; then, in each mini-batch, the features are accumulated into the corresponding instance features of the feature storage unit by index in a momentum-update manner;
in the feature storage unit, the cluster centroids {c_k | k = 1, …, n_c} are obtained by averaging the features belonging to the same cluster, while the instance features of the un-clustered outliers {v_k | k = 1, …, n_o} are taken directly from the feature storage unit; the centroid of the k-th cluster is expressed as:

c_k = (1 / |I_k|) Σ_{v_i ∈ I_k} v_i

where I_k denotes the set of feature vectors of the k-th cluster and |·| denotes the number of feature vectors in the set. The instance features {v} in the feature storage unit are initialized once by a forward pass of the network at the beginning and are then continuously updated during training so that more robust clustering can be performed;
in each training epoch, the class prototypes in the feature storage unit are updated in a momentum-update manner with the encoded features of the mini-batch samples; the instance feature v_i and the current sample feature f_i are combined by a momentum-weighted sum to obtain the updated instance feature:

v_i ← m·v_i + (1 − m)·f_i

where m ∈ [0, 1] is the momentum factor that controls the relative weight of the instance feature v_i and the sample feature f_i during the momentum update. Given the updated v_i, if f_i belongs to cluster k, the corresponding cluster centroid c_k is updated accordingly.
Further, the overall training process of the method in step 5 is as follows:
the unlabeled training dataset X = {x_1, …, x_n} is feature-encoded by the ResNet-50 feature encoder, and the resulting feature set is used to initialize the feature storage unit;
in each subsequent training epoch, the feature set {v} in the feature storage unit is clustered and screened according to the cluster independence and cluster compactness criteria, the training samples X are divided into clustered samples X_c and un-clustered outlier samples X_o, and the cluster centroids of X_c are computed;
for the training samples in each mini-batch, the feature encoder f_θ performs feature encoding, the contrastive loss is computed, and back-propagation is performed to update the encoder f_θ;
the feature set {v} in the feature storage unit is updated, and the cluster centroids c_k are updated according to the updated feature set {v};
the feature encoder f_θ and the feature storage unit are updated cyclically until the model converges well.
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention provides a feature storage unit that stores all useful information in the unlabeled data domain for learning a fuller feature representation; by dynamically updating the feature storage unit it provides two kinds of supervision, cluster-level and un-clustered-outlier-level, and fully exploits the hard training samples in the unlabeled dataset;
(2) the invention screens the clustering results with a self-paced learning strategy: the network training starts from the most reliable clusters in the feature storage unit, and repeated feature clustering gradually moves more un-clustered outliers into new clusters to obtain more reliable clustering results; this progressively improves the discriminability of the feature representation, effectively alleviates the pseudo-label noise problem, and optimizes the representation learning process;
(3) the invention provides a multi-scale clustering reliability metric composed of the cluster independence and cluster compactness criteria; starting from the most reliable clusters, the number of clusters is increased step by step, and, combined with the easy-to-hard self-paced learning strategy, more reliable clusters are created progressively, realizing the dynamic optimization of the feature storage unit and the pedestrian re-identification network and greatly improving the performance of the unsupervised pedestrian re-identification model.
Drawings
FIG. 1 is a flow chart of an unsupervised pedestrian re-identification method based on contrast clustering according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a system of an unsupervised pedestrian re-identification method based on contrast clustering according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a unified contrast loss calculation according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a feature storage unit momentum update according to an embodiment of the present invention.
Detailed description of the preferred embodiments
The following embodiments and accompanying drawings describe the present invention in further detail; they illustrate the invention but do not limit its scope.
System embodiment
Referring to fig. 1 to 4, the present embodiment provides an unsupervised pedestrian re-identification method based on contrast clustering, which includes the following steps:
Step 1: perform forward computation on the pedestrian images in the unlabeled training dataset with an initial feature encoder f_θ, and initialize a class-prototype-based feature storage unit with the encoded features;
Step 2: before each training epoch, cluster the encoded features in the feature storage unit with the DBSCAN clustering algorithm, and screen the clustering results according to a clustering reliability evaluation criterion;
Step 3: for each mini-batch of training samples, encode them with the encoder f_θ to obtain the mini-batch features f, compute the unified contrastive loss between f and the features in the feature storage unit, and update the network by back-propagation;
Step 4: during each training iteration, dynamically update the feature storage unit in a momentum manner with the encoded features obtained from the forward computation of the mini-batch training samples;
Step 5: repeat steps 2 to 4 for the preset number of training epochs until the pedestrian re-identification model converges.
The above steps can be summarized as the following processes: (1) initialization of the feature encoder and the feature storage unit; (2) clustering and screening of the features; (3) computation of the unified contrastive loss; (4) momentum update of the feature storage unit; (5) overall network training.
The following is a detailed description.
(1) Initialization of the feature encoder and the feature storage unit
A ResNet-50 deep neural network is used as the feature encoder f_θ, its parameters are initialized with weights pre-trained on the ImageNet image dataset, and the last fully connected layer is replaced with a batch normalization layer followed by an L2 normalization layer to suit the needs of the unsupervised task;
during training, each mini-batch contains 64 unlabeled pedestrian images belonging to at least 16 different classes; with clusters and un-clustered outliers both treated as independent classes, 4 pedestrian images are sampled for each selected cluster and a single pedestrian image for each selected un-clustered outlier. Before training, every pedestrian image in the dataset is resized to 256 × 128, and data augmentation is applied with random flipping, random cropping, random erasing, and similar operations.
The feature encoder f_θ performs forward computation on the samples in the pedestrian image dataset to extract features, producing a feature set {v_1, …, v_n}, where n denotes the number of samples in the pedestrian image dataset; all features in the feature set are stored in the feature storage unit on a per-instance basis, so that the class features in the feature storage unit can be continuously updated while the clusters and un-clustered outliers keep changing.
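A minimal sketch of the encoder and the one-off initialization of the feature storage unit described above, assuming PyTorch/torchvision; the class and function names, the torchvision weight identifier, and the use of a non-shuffled loader are assumptions of this sketch, not details from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class Encoder(nn.Module):
    """ResNet-50 backbone; the final FC layer is dropped and replaced by
    batch normalization followed by L2 normalization of the feature."""
    def __init__(self, feat_dim=2048):
        super().__init__()
        backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")  # ImageNet pre-training
        self.backbone = nn.Sequential(*list(backbone.children())[:-1])   # keep up to global pooling
        self.bn = nn.BatchNorm1d(feat_dim)

    def forward(self, x):
        f = self.backbone(x).flatten(1)          # (B, 2048)
        return F.normalize(self.bn(f), dim=1)    # L2-normalized feature

@torch.no_grad()
def init_memory(encoder, loader, device="cuda"):
    """Forward the whole unlabeled set once and store one feature per instance
    (a sequential, non-shuffled loader is assumed so rows align with indices)."""
    encoder.eval()
    feats = [encoder(imgs.to(device)).cpu() for imgs, _ in loader]
    return torch.cat(feats)                      # (n, 2048) feature storage unit
```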
(2) Clustering and screening process of features
First, the DBSCAN clustering algorithm is used to cluster the feature set {v_1, …, v_n} in the feature storage unit from step 1, and the class prototypes in the feature storage unit are further divided into cluster centroids {c_k | k = 1, …, n_c} and un-clustered outlier instances {v_k | k = 1, …, n_o}, where n_c denotes the number of clusters and n_o denotes the number of un-clustered outliers; the clustering results are screened according to the cluster independence and cluster compactness criteria, and the retrieval results are re-ordered with the k-reciprocal nearest-neighbor algorithm.
In the DBSCAN algorithm, the ε-neighborhood distance threshold (the maximum neighborhood radius) is set to d = 0.6 and the minimum number of neighborhood points is set to 4, i.e., a cluster contains at least 4 samples. The clustering criterion is alternately loosened and tightened by adjusting the sample ε-neighborhood distance threshold to obtain more reliable clustering results: the criterion is loosened with a neighborhood distance of d = 0.62 and tightened with d = 0.58.
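A sketch of the per-epoch clustering under the parameters above (base ε = 0.6, loosened ε = 0.62, tightened ε = 0.58, at least 4 samples per cluster), assuming scikit-learn's DBSCAN on a precomputed pairwise distance matrix between the stored features (which may first be k-reciprocal re-ranked, as mentioned earlier); the function name is illustrative.

```python
from sklearn.cluster import DBSCAN

def cluster_three_scales(dist):
    """dist: (n, n) precomputed distance matrix between memory features.
    Returns DBSCAN labels under the tightened, base and loosened criteria;
    label -1 marks an un-clustered outlier."""
    labels = {}
    for name, eps in (("tight", 0.58), ("base", 0.60), ("loose", 0.62)):
        labels[name] = DBSCAN(eps=eps, min_samples=4,
                              metric="precomputed").fit_predict(dist)
    return labels
```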
Both the clusters and the un-clustered instances in the feature storage unit are treated as equal, independent classes, so clustering reliability is crucial to the training; at the beginning of training the network has poor discriminative ability and the clustering noise is large, so a self-paced learning strategy is proposed to improve the clustering. Specifically, clustering is performed again before each training epoch starts; beginning from the most reliable clusters, reliable clusters are retained and unreliable clusters are dissolved back into un-clustered outlier instances, so that the number of clusters grows gradually.
The cluster independence criterion measures the inter-class distance and is expressed as the intersection-over-union between the feature set of a cluster and the corresponding feature set after the clustering criterion is loosened:

R_indep(f_i) = |I(f_i) ∩ I_loose(f_i)| / |I(f_i) ∪ I_loose(f_i)|

where |·| denotes the number of features in a set, I(f_i) denotes the set of samples in the same cluster as f_i, I_loose(f_i) denotes the set of samples in the same cluster as f_i after the clustering criterion is loosened, and R_indep(f_i) is the independence score of the cluster I(f_i);
the cluster compactness criterion measures the intra-class distance and is expressed as the intersection-over-union between the feature set of a cluster and the corresponding feature set after the clustering criterion is tightened:

R_comp(f_i) = |I(f_i) ∩ I_tight(f_i)| / |I(f_i) ∪ I_tight(f_i)|

where I_tight(f_i) denotes the set of samples in the same cluster as f_i after the clustering criterion is tightened, and R_comp(f_i) is the compactness score of the cluster I(f_i);
the clustering reliability evaluation criterion thus measures both the independence between clusters and the compactness between samples; its premise is that a reliable cluster should remain stable in a multi-scale clustering environment. Hyper-parameters α, β ∈ [0, 1] denote the independence and compactness thresholds, where α is initialized to 0.9·R_indep-1st (R_indep-1st being the cluster independence value obtained in the first training epoch) and then kept fixed during subsequent training, and β is set to the maximum cluster compactness value R_comp-max over the whole training process so that the most compact samples in each cluster are fully retained. Samples satisfying inter-class independence R_indep(f_i) > α and intra-class compactness R_comp(f_i) > β are kept in their clusters, and the remaining samples are divided into un-clustered outliers.
(3) Unified contrast loss function calculation process
Given the unlabeled training samples X = {x_1, …, x_n}, all features are stored in the feature storage unit after being encoded by the feature encoder, and the self-paced learning strategy divides the feature set into clustered features and un-clustered outlier features; the whole training dataset is thereby divided into a sample set with clustering pseudo labels X_c and a set of outlier instance samples X_o that do not belong to any cluster, with X = X_c ∪ X_o.
The feature storage unit stores the positive class prototype of each sample feature: for a sample from the unlabeled dataset, if the sample lies in a cluster, its positive prototype is the centroid of that cluster; otherwise the sample is a clustering outlier and its positive prototype is the instance feature corresponding to that outlier.
As shown in the schematic diagram of the unified contrastive loss calculation in FIG. 3, given a training sample x ∈ X, each training sample is forward-computed by the feature encoder to obtain the encoded feature f, the similarity is computed as the vector dot product between the feature f and the corresponding class prototypes, and the unified contrastive loss function is constructed as:

L_f = −log( exp(⟨f, z_+⟩/τ) / ( Σ_{k=1}^{n_c} exp(⟨f, c_k⟩/τ) + Σ_{k=1}^{n_o} exp(⟨f, v_k⟩/τ) ) )

where z_+ denotes the positive class prototype of the feature f, τ denotes the temperature coefficient, empirically set to 0.05, ⟨·,·⟩ denotes the vector inner product, c_k is the centroid of cluster k and represents the class prototype within that cluster, and v_k is the instance feature of the k-th un-clustered outlier and represents the class prototype of that outlier.
If f belongs to cluster k, then z_+ = c_k, the centroid of cluster k; if f belongs to an un-clustered outlier, then z_+ = v_k. The above contrastive loss pulls the encoded feature toward its true class: after the features of a mini-batch of samples are encoded, they are compared with the two kinds of class prototypes, so that each training sample is drawn close to the class it belongs to and pushed away from the other classes.
(4) Feature storage unit momentum update procedure
As shown in the momentum-update schematic diagram of the feature storage unit in FIG. 4, features are first stored for all training samples on a per-instance basis; then, in each mini-batch, the features are accumulated into the corresponding instance features of the feature storage unit by index in a momentum-update manner. Momentum updating is widely used in deep-learning optimization algorithms: it retains part of the previous update direction while fine-tuning the final update direction with the features of the samples in the current mini-batch, i.e., the current feature is updated on top of the accumulated previous momentum.
In the feature storage unit, the cluster centroids {c_k | k = 1, …, n_c} are obtained by averaging the features belonging to the same cluster, while the instance features of the un-clustered outliers {v_k | k = 1, …, n_o} are taken directly from the feature storage unit. Without loss of generality, assume the un-clustered outlier features in the feature set {v} are indexed {1, …, n_o}; then the clustered features are indexed {n_o + 1, …, n}, and the centroid of the k-th cluster is expressed as:

c_k = (1 / |I_k|) Σ_{v_i ∈ I_k} v_i

where I_k denotes the set of feature vectors of the k-th cluster and |·| denotes the number of feature vectors in the set. The instance features {v} in the feature storage unit are initialized once by a forward pass of the network at the beginning and are then continuously updated during training so that more robust clustering can be performed;
in each training epoch, the class prototypes in the feature storage unit are updated in a momentum-update manner with the encoded features of the mini-batch samples; the instance feature v_i and the current sample feature f_i are combined by a momentum-weighted sum to obtain the updated instance feature:

v_i ← m·v_i + (1 − m)·f_i

where m ∈ [0, 1] is the momentum factor controlling the relative weight of the instance feature v_i and the sample feature f_i during the momentum update, empirically set to 0.2. Given the updated v_i, if f_i belongs to cluster k, the corresponding cluster centroid c_k is updated accordingly.
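A sketch of the momentum update above with m = 0.2, followed by the refresh of the cluster centroids; re-normalizing the updated instance feature is an extra assumption common to memory banks rather than something stated in the text, and cluster ids are assumed to be contiguous with no empty clusters.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def update_memory(memory, feats, indices, labels, m=0.2):
    """memory: (n, d) instance features; feats: (B, d) current batch features;
    indices: (B,) instance indices; labels: (n,) cluster id per instance (-1 = outlier)."""
    memory[indices] = m * memory[indices] + (1 - m) * feats     # v_i <- m*v_i + (1-m)*f_i
    memory[indices] = F.normalize(memory[indices], dim=1)       # assumed re-normalization
    n_clusters = int(labels.max().item()) + 1
    if n_clusters == 0:
        return memory.new_zeros((0, memory.size(1)))
    return torch.stack([memory[labels == k].mean(0)             # refreshed centroid c_k
                        for k in range(n_clusters)])
```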
(5) Overall network training process
The unlabeled training dataset X = {x_1, …, x_n} is feature-encoded by the ResNet-50 feature encoder and the resulting feature set is used to initialize the feature storage unit. The model parameters are updated with the Adam optimizer, the weight decay is set to 0.0005, and the number of training epochs is set to 70. Training uses a learning-rate schedule so that the optimization progress is responded to dynamically: the initial learning rate is set to 0.00035 and is divided by 10 every 20 training epochs.
In each subsequent training epoch, the feature set {v} in the feature storage unit is clustered and screened according to the cluster independence and cluster compactness criteria, the training samples X are divided into clustered samples X_c and un-clustered outlier samples X_o, and the cluster centroids of X_c are computed;
for the training samples in each mini-batch, the feature encoder f_θ performs feature encoding, the contrastive loss is computed, and back-propagation is performed to update the encoder f_θ;
the feature set {v} in the feature storage unit is updated, and the cluster centroids c_k are updated according to the updated feature set {v};
the feature encoder f_θ and the feature storage unit are updated cyclically until the model converges well; a sketch of the overall training loop is given at the end of this description.
Finally, it should be noted that this embodiment provides the theoretical premises, implementation steps, and parameter settings of the unsupervised pedestrian re-identification method based on contrast clustering. The embodiment is only a preferred implementation of the invention, and the parameter settings need to be adjusted according to the specific variables and data encountered in practice to achieve better practical results.
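As a closing illustration, a high-level sketch of the overall training loop with the hyper-parameters above (Adam, weight decay 0.0005, 70 epochs, initial learning rate 0.00035 divided by 10 every 20 epochs). The mini-batch loader, the distance computation, and the helper functions from the earlier sketches are assumed to be in scope; all names are illustrative, not the patent's own API.

```python
import torch

def prototype_index(indices, labels, n_clusters, outlier_pos):
    """Row of each batch sample's positive prototype in [centroids; outliers]."""
    rows = []
    for i in indices.tolist():
        k = int(labels[i])
        rows.append(k if k >= 0 else n_clusters + outlier_pos[int(i)])
    return torch.tensor(rows)

def train(encoder, memory, epoch_loader_fn, distance_fn, alpha, beta,
          epochs=70, device="cuda"):
    """memory: (n, d) tensor on `device`; epoch_loader_fn(labels) yields
    class-balanced (images, instance indices) mini-batches; distance_fn(memory)
    returns an (n, n) pairwise distance matrix (optionally re-ranked)."""
    optimizer = torch.optim.Adam(encoder.parameters(), lr=0.00035, weight_decay=0.0005)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)
    for _epoch in range(epochs):
        # step 2: re-cluster the feature storage unit and screen the result
        dist = distance_fn(memory).cpu().numpy()
        raw = screen_clusters(cluster_three_scales(dist), alpha, beta)
        uniq = sorted(set(int(l) for l in raw) - {-1})          # surviving cluster ids
        remap = {old: new for new, old in enumerate(uniq)}      # make ids contiguous
        labels = torch.tensor([remap.get(int(l), -1) for l in raw], device=device)
        n_clusters = len(uniq)
        outlier_ids = torch.where(labels == -1)[0]
        outlier_pos = {int(j): p for p, j in enumerate(outlier_ids.tolist())}
        centroids = (torch.stack([memory[labels == k].mean(0) for k in range(n_clusters)])
                     if n_clusters else memory.new_zeros((0, memory.size(1))))
        # steps 3 and 4: contrastive training and momentum update of the memory
        encoder.train()
        for imgs, indices in epoch_loader_fn(labels):
            feats = encoder(imgs.to(device))
            pos = prototype_index(indices, labels, n_clusters, outlier_pos).to(device)
            loss = unified_contrastive_loss(feats, pos, centroids, memory[outlier_ids])
            optimizer.zero_grad(); loss.backward(); optimizer.step()
            centroids = update_memory(memory, feats.detach(), indices.to(device), labels)
        scheduler.step()
```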

Claims (7)

1. An unsupervised pedestrian re-identification method based on contrast clustering, characterized by comprising the following steps:
step 1: performing forward computation on an unlabeled pedestrian image dataset with an initial feature encoder, and initializing a class-prototype-based feature storage unit with the encoded features;
step 2: clustering the encoded features in the feature storage unit before each training epoch, and screening the clustering results according to a clustering reliability evaluation criterion;
step 3: performing feature encoding on each mini-batch of training samples with the feature encoder, back-propagating a unified contrastive loss through the network, and updating the feature encoder;
step 4: dynamically updating the feature storage unit with the encoded features in a momentum manner;
step 5: repeating steps 2 to 4 according to the number of training epochs until the pedestrian re-identification network converges.
2. The unsupervised pedestrian re-identification method based on contrast clustering as claimed in claim 1, wherein step 1 comprises:
using a ResNet-50 deep neural network as the feature encoder and initializing it with pre-training weights on the ImageNet image dataset;
performing feature extraction on the samples in the pedestrian image dataset with the feature encoder to obtain a feature set {v_1, …, v_n}, and storing all sample features in the feature storage unit on a per-instance basis.
3. The unsupervised pedestrian re-identification method based on contrast clustering as claimed in claim 1, wherein step 2 comprises:
clustering the feature set {v_1, …, v_n} in the feature storage unit of step 1 with the DBSCAN clustering algorithm, and further dividing the class prototypes in the feature storage unit into cluster centroids {c_k | k = 1, …, n_c} and un-clustered outlier instances {v_k | k = 1, …, n_o}, where n_c denotes the number of clusters and n_o denotes the number of un-clustered outliers; and using a self-paced learning strategy combined with the cluster independence and cluster compactness criteria to retain reliable clusters and dissolve the features in unreliable clusters back into un-clustered outlier instances.
4. The method according to claim 3, wherein the self-paced learning strategy performs clustering again before each training epoch, gradually increases the number of clusters starting from the most reliable clusters, and alternately loosens and tightens the clustering criterion by adjusting the sample ε-neighborhood distance threshold of the DBSCAN clustering algorithm;
the cluster independence criterion measures the inter-class distance and is expressed as the intersection-over-union between the feature set of a cluster and the corresponding feature set after the clustering criterion is loosened:

R_indep(f_i) = |I(f_i) ∩ I_loose(f_i)| / |I(f_i) ∪ I_loose(f_i)|

where |·| denotes the number of features in a set, I(f_i) denotes the set of samples in the same cluster as f_i, I_loose(f_i) denotes the set of samples in the same cluster as f_i after the clustering criterion is loosened, and R_indep(f_i) is the independence score of the cluster I(f_i);
the cluster compactness criterion measures the intra-class distance and is expressed as the intersection-over-union between the feature set of a cluster and the corresponding feature set after the clustering criterion is tightened:

R_comp(f_i) = |I(f_i) ∩ I_tight(f_i)| / |I(f_i) ∪ I_tight(f_i)|

where I_tight(f_i) denotes the set of samples in the same cluster as f_i after the clustering criterion is tightened, and R_comp(f_i) is the compactness score of the cluster I(f_i);
the independence between clusters and the compactness of the data within clusters are measured through the clustering reliability evaluation criterion; α, β ∈ [0, 1] denote the independence and compactness thresholds, samples satisfying inter-class independence R_indep(f_i) > α and intra-class compactness R_comp(f_i) > β are kept in their clusters, and the remaining samples are divided into un-clustered outliers.
5. The unsupervised pedestrian re-identification method based on contrast clustering as claimed in claim 1, wherein step 3 comprises:
given the unlabeled training samples X = {x_1, …, x_n}, dividing them with the self-paced learning strategy of step 2 into a sample set with clustering pseudo labels X_c and a set of outlier instance samples X_o that do not belong to any cluster, with X = X_c ∪ X_o;
given a training sample x ∈ X, performing forward computation with the feature encoder to obtain the encoded feature f, and constructing the unified contrastive loss function:

L_f = −log( exp(⟨f, z_+⟩/τ) / ( Σ_{k=1}^{n_c} exp(⟨f, c_k⟩/τ) + Σ_{k=1}^{n_o} exp(⟨f, v_k⟩/τ) ) )

where z_+ is the positive class prototype of the feature f, τ is the temperature coefficient, ⟨·,·⟩ denotes the vector inner product, c_k is the centroid of the current cluster and represents the class prototype within that cluster, and v_k is the instance feature of the current clustering outlier and represents the class prototype of that outlier;
after the features of a mini-batch of samples are encoded, they are compared with the two kinds of class prototypes, so that each training sample is drawn close to the class it belongs to and pushed away from the other classes.
6. The unsupervised pedestrian re-identification method based on contrast clustering as claimed in claim 1, wherein step 4 comprises:
in the feature storage unit, obtaining the cluster centroids {c_k | k = 1, …, n_c} by averaging the features belonging to the same cluster, and taking the instance features of the un-clustered outliers {v_k | k = 1, …, n_o} directly from the feature storage unit, the centroid of the k-th cluster being expressed as:

c_k = (1 / |I_k|) Σ_{v_i ∈ I_k} v_i

where I_k denotes the set of feature vectors of the k-th cluster and |·| denotes the number of feature vectors in the set; the instance features {v} in the feature storage unit are initialized once by a forward pass of the network and are continuously updated during training;
initializing the feature storage unit with all training samples on a per-instance basis, accumulating the features of the current mini-batch into the corresponding instance features of the feature storage unit by index in each training epoch, and dynamically updating the class prototypes in the feature storage unit in a momentum-update manner with the encoded features of the mini-batch:

v_i ← m·v_i + (1 − m)·f_i

where m ∈ [0, 1] is the momentum factor; given the updated instance feature v_i, if f_i belongs to cluster k, the corresponding cluster centroid c_k is updated accordingly.
7. The unsupervised pedestrian re-identification method based on contrast clustering as claimed in claim 1, wherein step 5 comprises:
in each training epoch, clustering the features in the feature storage unit according to the cluster independence and cluster compactness criteria, dividing the training samples X into clustered samples X_c and un-clustered outlier samples X_o, and computing the cluster centroids of X_c;
for the training samples in each mini-batch, performing feature encoding with the feature encoder, computing the unified contrastive loss, and back-propagating to update the encoder;
updating the feature set in the feature storage unit in the class-prototype momentum-update manner, and updating the cluster centroids with the updated feature set;
and cyclically updating the feature encoder and the feature storage unit until the model converges well.
CN202210664167.2A 2022-06-13 2022-06-13 Unsupervised pedestrian re-identification method based on contrast clustering Pending CN114898406A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210664167.2A CN114898406A (en) 2022-06-13 2022-06-13 Unsupervised pedestrian re-identification method based on contrast clustering


Publications (1)

Publication Number Publication Date
CN114898406A true CN114898406A (en) 2022-08-12

Family

ID=82728695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210664167.2A Pending CN114898406A (en) 2022-06-13 2022-06-13 Unsupervised pedestrian re-identification method based on contrast clustering

Country Status (1)

Country Link
CN (1) CN114898406A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116030502A (en) * 2023-03-30 2023-04-28 之江实验室 Pedestrian re-recognition method and device based on unsupervised learning
CN118152826A (en) * 2024-05-09 2024-06-07 深圳市翔飞科技股份有限公司 Intelligent camera alarm system based on behavior analysis
CN118152826B (en) * 2024-05-09 2024-08-02 深圳市翔飞科技股份有限公司 Intelligent camera alarm system based on behavior analysis


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination