CN112381056B - Cross-domain pedestrian re-identification method and system fusing multiple source domains - Google Patents
- Publication number
- CN112381056B (application CN202011399651.4A)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- sample pair
- cross
- domain
- domain data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a cross-domain pedestrian re-identification method and system fusing a plurality of source domains. The method comprises the following steps: when training the cross-domain pedestrian re-identification model, a plurality of groups of source domain datasets are obtained, and the model is trained by using each group of source domain datasets to train one representation learning network together with the metric learning network, yielding the trained cross-domain pedestrian re-identification model; when performing pedestrian re-identification, a pedestrian sample to be identified is input into the trained model to obtain a pedestrian re-identification result. With this method and system, data from a plurality of source domains are fused during training, so that the feature representation of pedestrians can be learned better; compared with a model trained on a single training set, the accuracy of pedestrian re-identification can be improved, and the degradation of model performance caused by inter-domain differences is effectively addressed.
Description
Technical Field
The invention relates to the technical field of pedestrian re-identification, in particular to a cross-domain pedestrian re-identification method and system fusing multiple source domains.
Background
Pedestrian re-identification, also known as person re-identification, is a technique that uses computer vision to determine whether a particular pedestrian is present in an image or video sequence. In recent years, owing to the rapid development of deep learning, the performance of pedestrian re-identification algorithms has improved dramatically. However, when a deep-learning-based pedestrian re-identification model trained on a single training set is applied directly to an actual scene, its performance drops markedly, because there is an inter-domain difference between the pedestrians in the actual scene and those in the training dataset.
Disclosure of Invention
The invention aims to provide a cross-domain pedestrian re-identification method and a cross-domain pedestrian re-identification system fusing a plurality of source domains, which can effectively solve the problem of model performance reduction caused by inter-domain difference and improve the accuracy of pedestrian re-identification.
In order to achieve the purpose, the invention provides the following scheme:
a pedestrian re-identification method, comprising:
acquiring a pedestrian sample pair to be identified; the pedestrian sample pair to be identified comprises two pedestrian samples to be identified;
inputting the pedestrian sample pair to be identified into a trained cross-domain pedestrian re-identification model to obtain a pedestrian re-identification result; the cross-domain pedestrian re-identification model comprises a plurality of representation learning networks and a metric learning network, wherein each representation learning network adopts a ResNet-50 network structure, and the metric learning network adopts a three-layer fully connected network structure;
the training method of the cross-domain pedestrian re-recognition model specifically comprises the following steps:
acquiring a plurality of groups of source domain data sets; each group of source domain data sets comprises a plurality of pedestrian samples to be trained, and each pedestrian sample to be trained corresponds to a pedestrian label; the number of the source domain data sets is the same as that of the representation learning networks;
and training the cross-domain pedestrian re-identification model by using each group of source domain datasets to train one representation learning network together with the metric learning network, to obtain the trained cross-domain pedestrian re-identification model.
Optionally, the step of inputting the to-be-recognized pedestrian sample pair into the trained cross-domain pedestrian re-recognition model to obtain a pedestrian re-recognition result specifically includes:
inputting the to-be-recognized pedestrian sample pair into a trained cross-domain pedestrian re-recognition model to obtain a plurality of to-be-recognized pedestrian feature groups; each pedestrian feature group comprises two pedestrian features;
calculating the distance between two pedestrian features of each to-be-identified pedestrian feature group to obtain a plurality of to-be-identified pedestrian feature distances;
judging whether the number of pedestrian feature distances to be identified that are smaller than a preset distance exceeds half of the total number of pedestrian feature distances to be identified; if so, determining that the two pedestrian samples in the pair to be identified show the same pedestrian; and if not, determining that they show different pedestrians.
Optionally, training the cross-domain pedestrian re-identification model by using each group of source domain datasets to train one representation learning network together with the metric learning network, to obtain the trained cross-domain pedestrian re-identification model, specifically includes:
selecting a group of source domain datasets and a representation learning network; the source domain dataset comprises a positive sample pair and a negative sample pair, the positive sample pair comprising two pedestrian samples of the same pedestrian, the negative sample pair comprising two pedestrian samples of different pedestrians, and one pedestrian sample of the negative sample pair being the same as one pedestrian sample of the positive sample pair;
inputting the selected source domain data set into the selected representation learning network to obtain a plurality of first pedestrian characteristics;
inputting a plurality of first pedestrian features into the metric learning network to obtain a plurality of second pedestrian features;
calculating the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair according to the second pedestrian characteristic;
calculating a loss value according to the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair, and optimizing parameters in the cross-domain pedestrian re-identification model by adopting a gradient descent method with the minimum loss value as an optimization target;
and judging whether all the source domain data sets are completely selected, if so, obtaining a trained cross-domain pedestrian re-recognition model, otherwise, selecting one source domain data set from the unselected source domain data sets, selecting one representation learning network from the unselected representation learning networks, and returning to the step of inputting the selected source domain data set into the selected representation learning network to obtain a plurality of first pedestrian characteristics.
Optionally,
the pedestrian feature distance is calculated between the second pedestrian features of the two samples in a pair: for the i-th pedestrian sample $x_i^t$ and the j-th pedestrian sample $x_j^t$ in the t-th source domain dataset, the pedestrian feature distance between $x_i^t$ and $x_j^t$ is the distance between their corresponding second pedestrian features, which are output by the metric learning network C;
the loss value is calculated as follows:
$L_{triple} = [d_p - d_n + \alpha]_+$
in the formula, $L_{triple}$ is the loss value, $d_p$ is the pedestrian feature distance corresponding to the positive sample pair, $d_n$ is the pedestrian feature distance corresponding to the negative sample pair, $\alpha$ is the preset margin between $d_p$ and $d_n$, and $[d_p - d_n + \alpha]_+$ denotes that the loss value equals $d_p - d_n + \alpha$ when $d_n < d_p + \alpha$, and equals 0 when $d_n \ge d_p + \alpha$.
The present invention also provides a pedestrian re-identification system, comprising:
the pedestrian re-identification module is used for acquiring a pedestrian sample pair to be identified, the pair comprising two pedestrian samples to be identified, and is further used for inputting the pedestrian sample pair to be identified into a trained cross-domain pedestrian re-identification model to obtain a pedestrian re-identification result; the cross-domain pedestrian re-identification model comprises a plurality of representation learning networks and a metric learning network, wherein each representation learning network adopts a ResNet-50 network structure, and the metric learning network adopts a three-layer fully connected network structure;
the training module is used for training the cross-domain pedestrian re-recognition model to obtain a trained cross-domain pedestrian re-recognition model; the training method specifically comprises the following steps:
acquiring a plurality of groups of source domain data sets; each group of source domain data sets comprises a plurality of pedestrian samples to be trained, and each pedestrian sample to be trained corresponds to a pedestrian label; the number of the source domain data sets is the same as that of the representation learning networks;
and training the cross-domain pedestrian re-identification model by using each group of source domain datasets to train one representation learning network together with the metric learning network, to obtain the trained cross-domain pedestrian re-identification model.
Optionally, the pedestrian re-identification module specifically includes:
the pedestrian feature group generation unit is used for inputting the to-be-identified pedestrian sample pair into a trained cross-domain pedestrian re-identification model to obtain a plurality of to-be-identified pedestrian feature groups; each pedestrian feature group comprises two pedestrian features;
the pedestrian feature distance calculation unit is used for calculating the distance between two pedestrian features of each pedestrian feature group to be recognized to obtain a plurality of pedestrian feature distances to be recognized;
the first judging unit is used for judging whether the number of pedestrian feature distances to be identified that are smaller than a preset distance exceeds half of the total number of pedestrian feature distances to be identified; if so, determining that the two pedestrian samples in the pair to be identified show the same pedestrian; and if not, determining that they show different pedestrians.
Optionally, the training module specifically includes:
a selecting unit for selecting a group of source domain datasets and a representation learning network; the source domain dataset comprises a positive sample pair and a negative sample pair, the positive sample pair comprising two pedestrian samples of the same pedestrian, the negative sample pair comprising two pedestrian samples of different pedestrians, and one pedestrian sample of the negative sample pair being the same as one pedestrian sample of the positive sample pair;
the first pedestrian characteristic generation unit is used for inputting the selected source domain data set into the selected representation learning network to obtain a plurality of first pedestrian characteristics;
the second pedestrian characteristic generating unit is used for inputting a plurality of first pedestrian characteristics into the metric learning network to obtain a plurality of second pedestrian characteristics;
the pedestrian characteristic distance calculation unit of the sample pair is used for calculating the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair according to the second pedestrian characteristic;
the model optimization unit is used for calculating a loss value according to the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair, and optimizing parameters in the cross-domain pedestrian re-identification model by adopting a gradient descent method with the minimum loss value as an optimization target;
and the second judgment unit is used for judging whether all the source domain data sets are completely selected, if so, obtaining a trained cross-domain pedestrian re-recognition model, otherwise, selecting one source domain data set from the unselected source domain data sets, selecting one representation learning network from the unselected representation learning networks, and then executing the first pedestrian feature generation unit.
Optionally,
the pedestrian feature distance is calculated between the second pedestrian features of the two samples in a pair: for the i-th pedestrian sample $x_i^t$ and the j-th pedestrian sample $x_j^t$ in the t-th source domain dataset, the pedestrian feature distance between $x_i^t$ and $x_j^t$ is the distance between their corresponding second pedestrian features, which are output by the metric learning network C;
the loss value is calculated as follows:
$L_{triple} = [d_p - d_n + \alpha]_+$
in the formula, $L_{triple}$ is the loss value, $d_p$ is the pedestrian feature distance corresponding to the positive sample pair, $d_n$ is the pedestrian feature distance corresponding to the negative sample pair, $\alpha$ is the preset margin between $d_p$ and $d_n$, and $[d_p - d_n + \alpha]_+$ denotes that the loss value equals $d_p - d_n + \alpha$ when $d_n < d_p + \alpha$, and equals 0 when $d_n \ge d_p + \alpha$.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a cross-domain pedestrian re-recognition method and system fusing multiple source domains. The method integrates data of a plurality of source domains in the training process, can better learn the characteristic representation of the pedestrian, can improve the accuracy of pedestrian re-identification compared with a single training model, and effectively solves the problem of model performance reduction caused by inter-domain difference.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a cross-domain pedestrian re-identification method fusing multiple source domains according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a training phase according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a testing phase according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a cross-domain pedestrian re-identification method and a cross-domain pedestrian re-identification system fusing a plurality of source domains, which can effectively solve the problem of model performance reduction caused by inter-domain difference and improve the accuracy of pedestrian re-identification.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Examples
Fig. 1 is a flowchart of a cross-domain pedestrian re-identification method fusing multiple source domains in an embodiment of the present invention, and as shown in fig. 1, a cross-domain pedestrian re-identification method fusing multiple source domains includes:
Step 101: acquiring a pedestrian sample pair to be identified; the pedestrian sample pair to be identified includes two pedestrian samples to be identified. Here, a pedestrian sample is a pedestrian image.
Step 102: inputting the pedestrian sample pair to be identified into the trained cross-domain pedestrian re-identification model to obtain a pedestrian re-identification result; the cross-domain pedestrian re-identification model comprises a plurality of representation learning networks and a metric learning network, wherein each representation learning network adopts a ResNet-50 network structure, and the metric learning network adopts a three-layer fully connected network structure. Step 102 specifically includes:
inputting a pedestrian sample pair to be recognized into the trained cross-domain pedestrian re-recognition model to obtain a plurality of pedestrian feature groups to be recognized; each pedestrian feature group comprises two pedestrian features;
calculating the distance between two pedestrian features of each to-be-identified pedestrian feature group to obtain a plurality of to-be-identified pedestrian feature distances;
judging whether the number of pedestrian feature distances to be identified that are smaller than a preset distance exceeds half of the total number of pedestrian feature distances to be identified; if so, determining that the two pedestrian samples in the pair to be identified show the same pedestrian; and if not, determining that they show different pedestrians.
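The decision rule above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes the Euclidean distance between the two features of each group (the text does not fix the distance measure), it assumes one feature group per representation learning network, and the names `euclidean` and `same_pedestrian` as well as the toy features and threshold are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def same_pedestrian(feature_groups, preset_distance):
    """Majority vote over the feature groups.

    feature_groups: list of (feature_a, feature_b) pairs, one pair per group.
    Returns True when more than half of the per-group distances are
    smaller than preset_distance.
    """
    distances = [euclidean(a, b) for a, b in feature_groups]
    votes = sum(1 for d in distances if d < preset_distance)
    return votes > len(distances) / 2


# Three hypothetical feature groups built from toy 2-D features.
groups = [
    ([0.0, 0.0], [0.3, 0.4]),  # distance 0.5
    ([1.0, 1.0], [1.0, 1.2]),  # distance about 0.2
    ([0.0, 0.0], [3.0, 4.0]),  # distance 5.0
]
# Two of the three distances are below 1.0, so the vote succeeds.
print(same_pedestrian(groups, preset_distance=1.0))
```

Note that the comparison is strict: when exactly half of the distances fall below the preset distance, the pair is judged to show different pedestrians, matching the "exceeds half" wording.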
Wherein,
the training method of the cross-domain pedestrian re-recognition model specifically comprises the following steps:
acquiring a plurality of groups of source domain data sets; each group of source domain data sets comprises a plurality of pedestrian samples to be trained, and each pedestrian sample to be trained corresponds to a pedestrian label; the number of the source domain data sets is the same as the number of the representation learning networks;
training the cross-domain pedestrian re-identification model by using each group of source domain datasets to train one representation learning network together with the metric learning network, to obtain the trained cross-domain pedestrian re-identification model, which specifically includes:
selecting a group of source domain datasets and a representation learning network; the source domain dataset comprises a positive sample pair and a negative sample pair, the positive sample pair comprises two pedestrian samples of the same pedestrian, the negative sample pair comprises two pedestrian samples of different pedestrians, and one pedestrian sample in the negative sample pair is the same as one pedestrian sample in the positive sample pair (that is, two different images of the same pedestrian are selected as the positive sample pair, and one image of another pedestrian together with one image from the positive sample pair is selected as the negative sample pair).
And inputting the selected source domain data set into the selected representation learning network to obtain a plurality of first pedestrian characteristics.
And inputting the plurality of first pedestrian features into the metric learning network to obtain a plurality of second pedestrian features.
And calculating the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair according to the second pedestrian characteristic.
And calculating a loss value according to the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair, and optimizing parameters in the cross-domain pedestrian re-identification model by adopting a gradient descent method with the minimum loss value as an optimization target.
And judging whether all the source domain data sets are completely selected, if so, obtaining a trained cross-domain pedestrian re-recognition model, otherwise, selecting one source domain data set from the unselected source domain data sets, selecting one representation learning network from the unselected representation learning networks, and returning to the step of inputting the selected source domain data set into the selected representation learning network to obtain a plurality of first pedestrian characteristics.
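The selection of sample pairs and the one-dataset-per-network schedule described above can be sketched as follows. This is only a control-flow illustration under stated assumptions: samples are represented as `(sample, label)` tuples, `build_triplets` and `train_model` are hypothetical names, and `train_step` is a placeholder standing in for the forward pass, loss computation and gradient-descent update, none of which is implemented here.

```python
def build_triplets(dataset):
    """Build (anchor, positive, negative) triplets from labeled samples.

    dataset: list of (sample, label) tuples.
    The positive pair is (anchor, positive): two different samples with the
    same label. The negative pair is (anchor, negative): it shares the anchor
    with the positive pair, but the negative carries a different label.
    """
    triplets = []
    for i, (anchor, a_label) in enumerate(dataset):
        for j, (positive, p_label) in enumerate(dataset):
            if i >= j or a_label != p_label:
                continue
            for negative, n_label in dataset:
                if n_label != a_label:
                    triplets.append((anchor, positive, negative))
    return triplets


def train_model(source_datasets, rep_networks, metric_network, train_step):
    """Pair each source dataset with its own representation network.

    The single metric network is shared across all source datasets.
    """
    if len(source_datasets) != len(rep_networks):
        raise ValueError("need exactly one representation network per source dataset")
    for dataset, rep_network in zip(source_datasets, rep_networks):
        for triplet in build_triplets(dataset):
            # Placeholder for: forward pass, triplet loss, gradient-descent update.
            train_step(rep_network, metric_network, triplet)


# Toy usage: record which network sees which triplet instead of really training.
log = []
datasets = [
    [("a1", 0), ("a2", 0), ("b1", 1)],
    [("c1", 7), ("d1", 8), ("d2", 8)],
]
train_model(datasets, ["H1", "H2"], "C",
            lambda h, c, triplet: log.append((h, c, triplet)))
print(log)
```

In the toy run, each dataset yields one triplet, and each triplet is routed to the representation network paired with its dataset while the same metric network `"C"` is used throughout.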
Wherein,
the pedestrian feature distance is calculated between the second pedestrian features of the two samples in a pair: for the i-th pedestrian sample $x_i^t$ and the j-th pedestrian sample $x_j^t$ in the t-th source domain dataset, the pedestrian feature distance between $x_i^t$ and $x_j^t$ is the distance between their corresponding second pedestrian features, which are output by the metric learning network C;
the loss value is calculated as follows:
$L_{triple} = [d_p - d_n + \alpha]_+$
in the formula, $L_{triple}$ is the loss value, $d_p$ is the pedestrian feature distance corresponding to the positive sample pair, $d_n$ is the pedestrian feature distance corresponding to the negative sample pair, $\alpha$ is the preset margin between $d_p$ and $d_n$, and $[d_p - d_n + \alpha]_+$ denotes that the loss value equals $d_p - d_n + \alpha$ when $d_n < d_p + \alpha$, and equals 0 when $d_n \ge d_p + \alpha$. The purpose of setting $\alpha$ is to keep a small margin between $d_p$ and $d_n$, such that $d_n$ is always greater than $d_p$.
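As a worked illustration of the loss, the sketch below computes the two pair distances and the hinge. It assumes the Euclidean distance between second pedestrian features, which the text does not specify, and it takes $d_p$ as the positive-pair distance and $d_n$ as the negative-pair distance, consistent with the requirement that $d_n$ exceed $d_p$. The names `feature_distance` and `triplet_loss` and all numeric values are hypothetical.

```python
import math


def feature_distance(f_a, f_b):
    """Euclidean distance between two second pedestrian features (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_a, f_b)))


def triplet_loss(d_p, d_n, alpha):
    """Hinge form [d_p - d_n + alpha]_+ of the triplet loss."""
    return max(d_p - d_n + alpha, 0.0)


# Toy second features for an anchor, a positive and a negative sample.
anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
d_p = feature_distance(anchor, positive)  # positive-pair distance: 1.0
d_n = feature_distance(anchor, negative)  # negative-pair distance: 5.0

# d_n >= d_p + alpha, so the margin is already satisfied and the loss is zero.
print(triplet_loss(d_p, d_n, alpha=0.5))
# d_n < d_p + alpha, so the loss is positive and gradient descent keeps pushing.
print(triplet_loss(d_p, 1.2, alpha=0.5))
```

The second call shows the case that drives training: the negative sample is not yet far enough from the anchor, so minimizing the loss both shrinks $d_p$ and enlarges $d_n$.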
The invention also provides a cross-domain pedestrian re-identification system fusing a plurality of source domains, which comprises: the device comprises a pedestrian re-identification module and a training module.
The pedestrian re-identification module is used for acquiring a pedestrian sample pair to be identified, the pair comprising two pedestrian samples to be identified, and is further used for inputting the pedestrian sample pair to be identified into the trained cross-domain pedestrian re-identification model to obtain a pedestrian re-identification result; the cross-domain pedestrian re-identification model comprises a plurality of representation learning networks and a metric learning network, wherein each representation learning network adopts a ResNet-50 network structure, and the metric learning network adopts a three-layer fully connected network structure.
The pedestrian re-identification module specifically comprises:
the pedestrian feature group generation unit is used for inputting a pedestrian sample pair to be identified into the trained cross-domain pedestrian re-identification model to obtain a plurality of pedestrian feature groups to be identified; each pedestrian feature group comprises two pedestrian features;
the pedestrian feature distance calculation unit is used for calculating the distance between two pedestrian features of each pedestrian feature group to be recognized to obtain a plurality of pedestrian feature distances to be recognized;
the first judging unit is used for judging whether the number of pedestrian feature distances to be identified that are smaller than a preset distance exceeds half of the total number of pedestrian feature distances to be identified; if so, determining that the two pedestrian samples in the pair to be identified show the same pedestrian; and if not, determining that they show different pedestrians.
The training module is used for training the cross-domain pedestrian re-recognition model to obtain a trained cross-domain pedestrian re-recognition model; the training method specifically comprises the following steps:
acquiring a plurality of groups of source domain data sets; each group of source domain data sets comprises a plurality of pedestrian samples to be trained, and each pedestrian sample to be trained corresponds to a pedestrian label; the number of the source domain data sets is the same as the number of the representation learning networks;
and training the cross-domain pedestrian re-identification model by using each group of source domain datasets to train one representation learning network together with the metric learning network, to obtain the trained cross-domain pedestrian re-identification model.
The training module specifically comprises:
a selecting unit for selecting a group of source domain datasets and a representation learning network; the source domain dataset comprises a positive sample pair and a negative sample pair, the positive sample pair comprises two pedestrian samples of the same pedestrian, the negative sample pair comprises two pedestrian samples of different pedestrians, and one pedestrian sample in the negative sample pair is the same as one pedestrian sample in the positive sample pair;
the first pedestrian characteristic generation unit is used for inputting the selected source domain data set into the selected representation learning network to obtain a plurality of first pedestrian characteristics;
the second pedestrian characteristic generating unit is used for inputting the first pedestrian characteristics into the metric learning network to obtain second pedestrian characteristics;
the pedestrian characteristic distance calculation unit of the sample pair is used for calculating the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair according to the second pedestrian characteristic;
the model optimization unit is used for calculating a loss value according to the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair, and optimizing parameters in the cross-domain pedestrian re-identification model by adopting a gradient descent method with the minimum loss value as an optimization target;
and the second judgment unit is used for judging whether all the source domain data sets are completely selected, if so, obtaining a trained cross-domain pedestrian re-recognition model, otherwise, selecting one source domain data set from the unselected source domain data sets, selecting one representation learning network from the unselected representation learning networks, and then executing the first pedestrian feature generation unit.
Wherein,
the pedestrian feature distance is calculated between the second pedestrian features of the two samples in a pair: for the i-th pedestrian sample $x_i^t$ and the j-th pedestrian sample $x_j^t$ in the t-th source domain dataset, the pedestrian feature distance between $x_i^t$ and $x_j^t$ is the distance between their corresponding second pedestrian features, which are output by the metric learning network C;
the loss value is calculated as follows:
$L_{triple} = [d_p - d_n + \alpha]_+$
in the formula, $L_{triple}$ is the loss value, $d_p$ is the pedestrian feature distance corresponding to the positive sample pair, $d_n$ is the pedestrian feature distance corresponding to the negative sample pair, $\alpha$ is the preset margin between $d_p$ and $d_n$, and $[d_p - d_n + \alpha]_+$ denotes that the loss value equals $d_p - d_n + \alpha$ when $d_n < d_p + \alpha$, and equals 0 when $d_n \ge d_p + \alpha$.
The cross-domain pedestrian re-identification method fusing multiple source domains is further described below.
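Acquiring m labeled source domain data sets:
$$S_t = \{(x_i^t, y_i^t)\}, \quad t = 1, 2, \dots, m$$
where $S_t$ represents the t-th source domain data set, and $(x_i^t, y_i^t)$ represents a pedestrian sample $x_i^t$ in the t-th source domain data set together with its corresponding pedestrian label $y_i^t$.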
Acquiring the test data (target domain):
$$T = \{x_i \mid i = 1, 2, \dots, n\}$$
where $x_i$ is a pedestrian sample in the target domain data set.
The goal of the cross-domain pedestrian re-identification model fusing multiple source domains is to take any pedestrian sample pair $x_i$ and $x_j$ from the target domain T, input the pair into the model to obtain the corresponding pedestrian features, and determine whether the two samples show the same pedestrian by judging whether the distance between the pedestrian features is smaller than a threshold.
The cross-domain pedestrian re-identification model fusing a plurality of source domains comprises two core modules: the representation learning networks and a fusion feature network (i.e., the metric learning network).
1. m representation learning networks, implemented using ResNet-50
For a source domain data set $S_t$, let the t-th representation learning network be denoted $H_t(S_t, \omega_t)$, where $H_t$ is the part of the model trained on the t-th source domain data set $S_t$, and $\omega_t$ denotes the parameters of $H_t$ learned once training on $S_t$ is complete.
The pedestrian feature obtained by inputting a pedestrian sample $x_i^t$ of the t-th source domain into the representation learning network $H_t(S_t, \omega_t)$ is:
$$f_i^t = H_t(x_i^t, \omega_t)$$
2. A fusion feature network, implemented using 3 fully connected layers
Let the fusion feature network be denoted $C(f_i^t, \theta)$, where $f_i^t$ is the pedestrian feature obtained by inputting the pedestrian sample $x_i^t$ of the t-th source domain into the representation learning network $H_t(S_t, \omega_t)$, and $\theta$ denotes the parameters of the metric learning network after the model is trained.
The pedestrian feature obtained after a pedestrian sample $x_i^t$ of the t-th source domain is input into the cross-domain pedestrian re-identification model fusing a plurality of source domains is:
$$C(H_t(x_i^t, \omega_t), \theta)$$
For any pedestrian sample pair $x_i$ and $x_j$ in the target domain T, the distance between the pedestrian features obtained after the pair is input into the cross-domain pedestrian re-identification model fusing a plurality of source domains is:
$$d_t(x_i, x_j) = \lVert C(H_t(x_i, \omega_t), \theta) - C(H_t(x_j, \omega_t), \theta) \rVert, \quad t = 1, 2, \dots, m$$
The cross-domain pedestrian re-identification model fusing a plurality of source domains is designed as follows:
The model mainly comprises m representation learning networks and 1 metric learning network. The representation learning networks are mainly responsible for extracting pedestrian features from the samples input into the model, and the metric learning network learns the distance metric over those pedestrian features.
Training stage:
As shown in FIG. 2, the m representation learning networks all adopt the ResNet-50 network structure, and the metric learning network adopts a 3-layer fully connected network structure. When the t-th source domain data set is used to train the model, a pedestrian sample triplet $(x_a^t, x_p^t, x_n^t)$, in which $(x_a^t, x_p^t)$ is the positive sample pair and $(x_a^t, x_n^t)$ is the negative sample pair, is input into the corresponding representation learning network $H_t(S_t, \omega_t)$ to obtain the corresponding 3 pedestrian features $f_a^t$, $f_p^t$ and $f_n^t$. These pedestrian features are input into the metric learning network, which outputs 3 new pedestrian features $C(f_a^t)$, $C(f_p^t)$ and $C(f_n^t)$. The loss function is the triplet loss $L_{triple} = [d_p - d_n + \alpha]_+$ (where $d_n$ is the distance between the negative sample pair and $d_p$ is the distance between the positive sample pair), and the parameters of the cross-domain pedestrian re-identification model fusing a plurality of source domains are learned by minimizing this loss with a gradient descent strategy. Training is finished after all m source domain data sets have been input into the model.
For example, the training steps for the 1st source domain data set (of m in total) are:
1) inputting a pedestrian sample triplet from the first source domain into the model;
2) obtaining a group of pedestrian features after the representation learning network;
3) obtaining a group of new pedestrian features after the fusion feature network;
4) obtaining the triplet loss by calculating the distances of the positive sample pair and the negative sample pair;
5) updating the network weights through a gradient descent strategy to optimize the model being trained;
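The training schedule above (one representation network per source domain, one shared metric network, triplet loss, gradient descent) can be sketched in plain Python. This is only a toy illustration under several assumptions that are not in the patent: element-wise scaling stands in for both ResNet-50 and the 3-layer fully connected network, squared Euclidean distance is used, gradients are estimated by finite differences rather than backpropagation, and all names (`scale`, `numeric_grad`, `train`) and data values are hypothetical.

```python
def distance(u, v):
    """Squared Euclidean distance (an assumed choice of metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def scale(weights, x):
    """Toy network: element-wise scaling, standing in for H_t and for C."""
    return [w * xi for w, xi in zip(weights, x)]


def triplet_loss(w_t, theta, triplet, alpha):
    """[d_p - d_n + alpha]_+ on features after H_t and then C."""
    anchor, positive, negative = (scale(theta, scale(w_t, x)) for x in triplet)
    return max(distance(anchor, positive) - distance(anchor, negative) + alpha, 0.0)


def numeric_grad(loss_fn, params, eps=1e-6):
    """Central finite-difference gradient of loss_fn at params."""
    grad = []
    for k in range(len(params)):
        up = params[:k] + [params[k] + eps] + params[k + 1:]
        down = params[:k] + [params[k] - eps] + params[k + 1:]
        grad.append((loss_fn(up) - loss_fn(down)) / (2 * eps))
    return grad


def train(source_domains, dim, alpha=1.0, lr=0.01, epochs=20):
    theta = [1.0] * dim                           # shared metric network C
    nets = [[1.0] * dim for _ in source_domains]  # one H_t per source domain
    for t, triplets in enumerate(source_domains): # domain t trains H_t and C
        for _ in range(epochs):
            for triplet in triplets:
                g_w = numeric_grad(
                    lambda w: triplet_loss(w, theta, triplet, alpha), nets[t])
                g_theta = numeric_grad(
                    lambda th: triplet_loss(nets[t], th, triplet, alpha), theta)
                nets[t] = [p - lr * g for p, g in zip(nets[t], g_w)]
                theta = [p - lr * g for p, g in zip(theta, g_theta)]
    return nets, theta


# Two toy source domains, each with one (anchor, positive, negative) triplet.
domains = [
    [([0.0, 0.0], [0.1, 1.0], [1.0, 0.1])],
    [([0.0, 0.0], [0.2, 0.8], [0.9, 0.1])],
]
nets, theta = train(domains, dim=2)
```

In this sketch the shared parameters `theta` keep being updated as each new source domain is visited, while each `nets[t]` is only updated by its own domain, which mirrors the "select one unselected data set and one unselected representation network" loop.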
Testing stage:
As shown in FIG. 3, two pedestrian samples $(x_i, x_j)$ of the target domain are input and pass through the m representation learning networks, which output m groups of pedestrian features. Each group then passes through the metric learning network, giving m new groups of pedestrian features, and the distance between the two pedestrian features in each group is calculated. A threshold is set manually: if the calculated distance is less than the threshold, that group judges the two samples to be the same pedestrian; otherwise it judges them to be different pedestrians. Finally, whether the input pictures show the same person is determined by a voting mechanism, i.e. the minority yields to the majority.
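The voting step can be sketched as follows, assuming the m per-network distances have already been computed. The function name `same_pedestrian`, the example distances and the threshold are illustrative assumptions; the strict "more than half" rule follows the wording of claim 2.

```python
def same_pedestrian(distances, threshold):
    """Majority vote: True if more than half of the distances are below threshold."""
    votes = sum(1 for d in distances if d < threshold)
    return votes > len(distances) / 2


# Three representation networks; two of the three distances fall below 0.5.
decision = same_pedestrian([0.3, 0.7, 0.4], threshold=0.5)
```

With an even number of networks, a tie does not exceed half, so under this reading the pair is treated as different pedestrians.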
The method improves the robustness of the trained model, and the trained model can be used directly in practical work. The performance gain comes mainly from two aspects: data from a plurality of source domains are fused during training and a metric network is introduced, so that the feature representation of pedestrians can be learned better; and the voting mechanism in the testing stage can further improve the performance of the model.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, a person skilled in the art may, according to the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (6)
1. A pedestrian re-identification method is characterized by comprising the following steps:
acquiring a pedestrian sample pair to be identified; the pedestrian sample pair to be identified comprises two pedestrian samples to be identified;
inputting the to-be-recognized pedestrian sample pair into a trained cross-domain pedestrian re-recognition model to obtain a pedestrian re-recognition result; the cross-domain pedestrian re-identification model comprises a plurality of representation learning networks and a measurement learning network, wherein the representation learning networks adopt a ResNet-50 network structure, and the measurement learning networks adopt a three-layer full-connection network structure;
the training method of the cross-domain pedestrian re-recognition model specifically comprises the following steps:
acquiring a plurality of groups of source domain data sets; each group of source domain data sets comprises a plurality of pedestrian samples to be trained, and each pedestrian sample to be trained corresponds to a pedestrian label; the number of the source domain data sets is the same as that of the representation learning networks;
training the cross-domain pedestrian re-identification model by using each group of source domain data sets to train one representation learning network together with the metric learning network, to obtain a trained cross-domain pedestrian re-identification model;
wherein training the cross-domain pedestrian re-identification model in this way specifically comprises the following steps:
selecting a group of source domain data sets and a representation learning network; the source domain data set comprises a positive sample pair and a negative sample pair, the positive sample pair comprising two pedestrian samples of the same pedestrian, the negative sample pair comprising two pedestrian samples of different pedestrians, and one pedestrian sample of the negative sample pair being the same as one pedestrian sample of the positive sample pair;
inputting the selected source domain data set into the selected representation learning network to obtain a plurality of first pedestrian characteristics;
inputting a plurality of first pedestrian features into the metric learning network to obtain a plurality of second pedestrian features;
calculating the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair according to the second pedestrian characteristic;
calculating a loss value according to the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair, and optimizing parameters in the cross-domain pedestrian re-identification model by adopting a gradient descent method with the minimum loss value as an optimization target;
and judging whether all the source domain data sets are completely selected, if so, obtaining a trained cross-domain pedestrian re-recognition model, otherwise, selecting one source domain data set from the unselected source domain data sets, selecting one representation learning network from the unselected representation learning networks, and returning to the step of inputting the selected source domain data set into the selected representation learning network to obtain a plurality of first pedestrian characteristics.
2. The method according to claim 1, wherein the step of inputting the to-be-recognized pedestrian sample pair into the trained cross-domain pedestrian re-recognition model to obtain a pedestrian re-recognition result specifically comprises:
inputting the to-be-recognized pedestrian sample pair into a trained cross-domain pedestrian re-recognition model to obtain a plurality of to-be-recognized pedestrian feature groups; each pedestrian feature group comprises two pedestrian features;
calculating the distance between two pedestrian features of each to-be-identified pedestrian feature group to obtain a plurality of to-be-identified pedestrian feature distances;
judging whether the number of the characteristic distances of the pedestrians to be identified, which is smaller than the preset distance, exceeds half of the total number of the characteristic distances of the pedestrians to be identified; if so, determining that the pedestrian samples in the to-be-identified pedestrian sample pair are the same; and if not, determining that the pedestrian samples in the to-be-identified pedestrian sample pair are different.
3. The pedestrian re-identification method according to claim 1,
the calculation formula of the pedestrian characteristic distance is as follows:
$$d(x_i^t, x_j^t) = \lVert C(f_i^t) - C(f_j^t) \rVert$$
in the formula, $d(x_i^t, x_j^t)$ is the pedestrian characteristic distance between $x_i^t$ and $x_j^t$, $x_i^t$ is the i-th pedestrian sample in the t-th source domain data set, $x_j^t$ is the j-th pedestrian sample in the t-th source domain data set, $C(f_i^t)$ is the second pedestrian characteristic corresponding to $x_i^t$ (obtained from its first pedestrian characteristic $f_i^t$), $C(f_j^t)$ is the second pedestrian characteristic corresponding to $x_j^t$, and $C$ is the metric learning network;
the loss value is calculated as follows:
$$L_{triple} = [d_p - d_n + \alpha]_+$$
in the formula, $L_{triple}$ is the loss value, $d_p$ is the pedestrian characteristic distance corresponding to the positive sample pair, $d_n$ is the pedestrian characteristic distance corresponding to the negative sample pair, $\alpha$ is the preset margin between $d_p$ and $d_n$, and $[d_p - d_n + \alpha]_+$ means that when $d_n < d_p + \alpha$ the loss value is $d_p - d_n + \alpha$, and when $d_n \geq d_p + \alpha$ the loss value is 0.
4. A pedestrian re-identification system, comprising:
the pedestrian re-identification module is used for acquiring a pedestrian sample pair to be identified; the pedestrian sample pair to be identified comprises two pedestrian samples to be identified; the pedestrian re-identification module is also used for inputting the to-be-identified pedestrian sample pair into a trained cross-domain pedestrian re-identification model to obtain a pedestrian re-identification result; the cross-domain pedestrian re-identification model comprises a plurality of representation learning networks and a metric learning network, wherein the representation learning networks adopt a ResNet-50 network structure, and the metric learning network adopts a three-layer fully connected network structure;
the training module is used for training the cross-domain pedestrian re-recognition model to obtain a trained cross-domain pedestrian re-recognition model; the training method specifically comprises the following steps:
acquiring a plurality of groups of source domain data sets; each group of source domain data sets comprises a plurality of pedestrian samples to be trained, and each pedestrian sample to be trained corresponds to a pedestrian label; the number of the source domain data sets is the same as that of the representation learning networks;
training the cross-domain pedestrian re-identification model by using each group of source domain data sets to train one representation learning network together with the metric learning network, to obtain a trained cross-domain pedestrian re-identification model;
the training module specifically comprises:
a selecting unit for selecting a group of source domain data sets and a representation learning network; the source domain data set comprises a positive sample pair and a negative sample pair, the positive sample pair comprising two pedestrian samples of the same pedestrian, the negative sample pair comprising two pedestrian samples of different pedestrians, and one pedestrian sample of the negative sample pair being the same as one pedestrian sample of the positive sample pair;
the first pedestrian characteristic generation unit is used for inputting the selected source domain data set into the selected representation learning network to obtain a plurality of first pedestrian characteristics;
the second pedestrian characteristic generating unit is used for inputting a plurality of first pedestrian characteristics into the metric learning network to obtain a plurality of second pedestrian characteristics;
the pedestrian characteristic distance calculation unit of the sample pair is used for calculating the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair according to the second pedestrian characteristic;
the model optimization unit is used for calculating a loss value according to the pedestrian characteristic distance corresponding to the positive sample pair and the pedestrian characteristic distance corresponding to the negative sample pair, and optimizing parameters in the cross-domain pedestrian re-identification model by adopting a gradient descent method with the minimum loss value as an optimization target;
and the second judgment unit is used for judging whether all the source domain data sets are completely selected, if so, obtaining a trained cross-domain pedestrian re-recognition model, otherwise, selecting one source domain data set from the unselected source domain data sets, selecting one representation learning network from the unselected representation learning networks, and then executing the first pedestrian feature generation unit.
5. The pedestrian re-identification system according to claim 4, wherein the pedestrian re-identification module specifically comprises:
the pedestrian feature group generation unit is used for inputting the to-be-identified pedestrian sample pair into a trained cross-domain pedestrian re-identification model to obtain a plurality of to-be-identified pedestrian feature groups; each pedestrian feature group comprises two pedestrian features;
the pedestrian feature distance calculation unit is used for calculating the distance between two pedestrian features of each pedestrian feature group to be recognized to obtain a plurality of pedestrian feature distances to be recognized;
the first judging unit is used for judging whether the number of the characteristic distances of the pedestrians to be identified, which is smaller than the preset distance, exceeds half of the total number of the characteristic distances of the pedestrians to be identified; if so, determining that the pedestrian samples in the to-be-identified pedestrian sample pair are the same; and if not, determining that the pedestrian samples in the to-be-identified pedestrian sample pair are different.
6. The pedestrian re-identification system according to claim 4,
the calculation formula of the pedestrian characteristic distance is as follows:
$$d(x_i^t, x_j^t) = \lVert C(f_i^t) - C(f_j^t) \rVert$$
in the formula, $d(x_i^t, x_j^t)$ is the pedestrian characteristic distance between $x_i^t$ and $x_j^t$, $x_i^t$ is the i-th pedestrian sample in the t-th source domain data set, $x_j^t$ is the j-th pedestrian sample in the t-th source domain data set, $C(f_i^t)$ is the second pedestrian characteristic corresponding to $x_i^t$ (obtained from its first pedestrian characteristic $f_i^t$), $C(f_j^t)$ is the second pedestrian characteristic corresponding to $x_j^t$, and $C$ is the metric learning network;
the loss value is calculated as follows:
$$L_{triple} = [d_p - d_n + \alpha]_+$$
in the formula, $L_{triple}$ is the loss value, $d_p$ is the pedestrian characteristic distance corresponding to the positive sample pair, $d_n$ is the pedestrian characteristic distance corresponding to the negative sample pair, $\alpha$ is the preset margin between $d_p$ and $d_n$, and $[d_p - d_n + \alpha]_+$ means that when $d_n < d_p + \alpha$ the loss value is $d_p - d_n + \alpha$, and when $d_n \geq d_p + \alpha$ the loss value is 0.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011399651.4A CN112381056B (en) | 2020-12-02 | 2020-12-02 | Cross-domain pedestrian re-identification method and system fusing multiple source domains |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112381056A (en) | 2021-02-19 |
CN112381056B (en) | 2022-04-01 |
Family
ID=74590207
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011399651.4A Active CN112381056B (en) | 2020-12-02 | 2020-12-02 | Cross-domain pedestrian re-identification method and system fusing multiple source domains |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112381056B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113221770B (en) * | 2021-05-18 | 2024-06-04 | 青岛根尖智能科技有限公司 | Cross-domain pedestrian re-recognition method and system based on multi-feature hybrid learning |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109977893A (en) * | 2019-04-01 | 2019-07-05 | 厦门大学 | Depth multitask pedestrian recognition methods again based on the study of level conspicuousness channel |
CN110110642A (en) * | 2019-04-29 | 2019-08-09 | 华南理工大学 | A kind of pedestrian's recognition methods again based on multichannel attention feature |
CN111126360A (en) * | 2019-11-15 | 2020-05-08 | 西安电子科技大学 | Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model |
CN111476168A (en) * | 2020-04-08 | 2020-07-31 | 山东师范大学 | Cross-domain pedestrian re-identification method and system based on three stages |
CN111695531A (en) * | 2020-06-16 | 2020-09-22 | 天津师范大学 | Cross-domain pedestrian re-identification method based on heterogeneous convolutional network |
CN111738039A (en) * | 2019-05-10 | 2020-10-02 | 北京京东尚科信息技术有限公司 | Pedestrian re-identification method, terminal and storage medium |
CN111860678A (en) * | 2020-07-29 | 2020-10-30 | 中国矿业大学 | Unsupervised cross-domain pedestrian re-identification method based on clustering |
CN111967294A (en) * | 2020-06-23 | 2020-11-20 | 南昌大学 | Unsupervised domain self-adaptive pedestrian re-identification method |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105989369B (en) * | 2015-02-15 | 2020-07-31 | 中国科学院西安光学精密机械研究所 | Pedestrian re-identification method based on metric learning |
WO2018119411A1 (en) * | 2016-12-23 | 2018-06-28 | Trustees Of Boston University | Classification of diffuse large b-cell lymphoma |
CN107145827A (en) * | 2017-04-01 | 2017-09-08 | 浙江大学 | Across the video camera pedestrian recognition methods again learnt based on adaptive distance metric |
US10938839B2 (en) * | 2018-08-31 | 2021-03-02 | Sophos Limited | Threat detection with business impact scoring |
CN110796057A (en) * | 2019-10-22 | 2020-02-14 | 上海交通大学 | Pedestrian re-identification method and device and computer equipment |
Non-Patent Citations (2)
Title |
---|
Unsupervised Cross-Dataset Person Re-identification by Transfer Learning of Spatial-Temporal Patterns; Jianming Lv et al.; 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2018-06-23; pp. 7948-7956 * |
A survey of pedestrian re-identification research in weakly supervised scenarios; Qi Lei et al.; Journal of Software (软件学报); 2020-09-15; pp. 2883-2902 * |
Also Published As
Publication number | Publication date |
---|---|
CN112381056A (en) | 2021-02-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | SaliencyGAN: Deep learning semisupervised salient object detection in the fog of IoT | |
CN110569886B (en) | Image classification method for bidirectional channel attention element learning | |
CN110852267B (en) | Crowd density estimation method and device based on optical flow fusion type deep neural network | |
CN108427921A (en) | A kind of face identification method based on convolutional neural networks | |
CN111611847A (en) | Video motion detection method based on scale attention hole convolution network | |
CN107247952B (en) | Deep supervision-based visual saliency detection method for cyclic convolution neural network | |
CN108197669B (en) | Feature training method and device of convolutional neural network | |
CN114462489A (en) | Training method of character recognition model, character recognition method and equipment, electronic equipment and medium | |
CN115146761B (en) | Training method and related device for defect detection model | |
CN112085055A (en) | Black box attack method based on migration model Jacobian array feature vector disturbance | |
CN111507184B (en) | Human body posture detection method based on parallel cavity convolution and body structure constraint | |
CN111476307A (en) | Lithium battery surface defect detection method based on depth field adaptation | |
CN105678381A (en) | Gender classification network training method, gender classification method and related device | |
CN114528762B (en) | Model training method, device, equipment and storage medium | |
CN113807214A (en) | Small target face recognition method based on deit attached network knowledge distillation | |
CN112381056B (en) | Cross-domain pedestrian re-identification method and system fusing multiple source domains | |
CN115457365A (en) | Model interpretation method and device, electronic equipment and storage medium | |
CN115797808A (en) | Unmanned aerial vehicle inspection defect image identification method, system, device and medium | |
CN114596609A (en) | Audio-visual counterfeit detection method and device | |
CN114676637A (en) | Fiber channel modeling method and system for generating countermeasure network based on conditions | |
CN113411566A (en) | No-reference video quality evaluation method based on deep learning | |
CN113159071A (en) | Cross-modal image-text association anomaly detection method | |
CN117058716A (en) | Cross-domain behavior recognition method and device based on image pre-fusion | |
CN111508024A (en) | Method for estimating pose of robot based on deep learning | |
CN115797646A (en) | Multi-scale feature fusion video denoising method, system, device and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2023-03-16. Address after: 030000 Communication Industry Building, No. 273, Changzhi Road, Taiyuan Xuefu Park, Shanxi Comprehensive Reform Demonstration Zone, Taiyuan City, Shanxi Province. Patentee after: Taiyuan Communication Industry Co.,Ltd. Address before: 030006 No. 92, Hollywood Road, Taiyuan, Shanxi. Patentee before: SHANXI University |