CN113869193A - Training method of pedestrian re-identification model, and pedestrian re-identification method and system - Google Patents
- Publication number: CN113869193A
- Application number: CN202111131114.6A
- Authority: CN (China)
- Prior art keywords: domain, pedestrian, training, recognition, image
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to the technical field of image recognition and provides a training method of a pedestrian re-recognition model, a pedestrian re-recognition method and a system. Original feature vectors of the source-domain and target-domain person images of a training sample are respectively extracted, and each original feature vector is decomposed by the pedestrian re-recognition model into a domain-invariant identity feature and a domain-specific enhancement feature; the original feature vectors, the domain-invariant identity features and the domain-specific enhancement features are recombined to obtain a reconstructed feature vector group; the reconstructed feature vector group is input into a cross-domain face recognition loss function and a domain classification loss function; these steps are iterated until all training samples have been used for training, and the model with the minimum sum of cross-domain face recognition loss and domain classification loss is selected as the trained pedestrian re-recognition model. The reconstructed feature vector group increases the diversity of the samples used in training, inherits the reliable identity labels of the source domain, and represents the data distribution of both the source domain and the target domain well, so that a pedestrian re-recognition model with high recognition efficiency can be trained with fewer samples.
Description
Technical Field
The invention relates to the technical field of image recognition, in particular to a training method of a pedestrian re-recognition model, a pedestrian re-recognition method and a system.
Background
Pedestrian re-identification (ReID) is a technique that uses computer vision to determine whether a particular pedestrian is present in an image or video sequence, for example retrieving, given an image of a pedestrian, the images of that pedestrian from the surveillance videos of multiple cameras.
Unsupervised domain adaptation (UDA) transfers knowledge from a labeled source domain to an unlabeled target domain so that a model performs better in the new environment, and it is widely applied in pedestrian re-recognition scenarios.
Disclosure of Invention
In view of the above drawbacks of the prior art, an object of the present invention is to provide a training method for a pedestrian re-recognition model, a pedestrian re-recognition method and a system thereof, which are used to solve the problems of high training complexity and poor recognition effect in the prior art.
A first aspect of the present invention provides a training method of a pedestrian re-recognition model, including:
respectively extracting original feature vectors of character images of a source domain and a target domain of a training sample, and decomposing the original feature vectors through a pedestrian re-recognition model to obtain domain invariant identity features and domain specific enhancement features;
reconstructing the original characteristic vector, the domain invariant identity characteristic and the domain specific enhancement characteristic to obtain a reconstructed characteristic vector group;
inputting the reconstructed feature vector group into a cross-domain face recognition loss function and a domain classification loss function to calculate corresponding cross-domain face recognition loss and domain classification loss;
and circularly iterating the steps until the training of all the training samples is completed, and selecting the model with the minimum sum of cross-domain face recognition loss and domain classification loss as the trained pedestrian re-recognition model.
In an embodiment of the present invention, the step of respectively extracting original feature vectors of the human images of a source domain and a target domain of the training sample, and obtaining the domain-invariant identity features and the domain-specific enhanced features from the original feature vectors through a pedestrian re-recognition model includes:
respectively extracting original feature vectors of the character images of the source domain and the target domain in a training sample;
obtaining the domain-invariant identity features and the domain-specific enhancement features through full-scale network OSNet decomposition:
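$$B = (1 - O(F)) \odot F,$$
$$E = O(F) \odot F,$$
wherein $F$ is the feature vector; $B$ is the domain-invariant identity feature; $E$ is the domain-specific enhancement feature; $\odot$ denotes element-wise multiplication; and $O(\cdot)$ is the response of the OSNet network, which, consistent with the unified aggregation gate of OSNet, can be written as
$$O(F) = \sum_{t=1}^{T} G(F^{t}) \odot F^{t},$$
wherein $T = 4$, and $G(F^{t})$ is a vector whose length spans the entire channel dimension of the input $F^{t}$.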
In an embodiment of the present invention, the step of reconstructing the original feature vector, the domain-invariant identity feature, and the domain-specific enhancement feature to obtain a reconstructed feature vector set includes:
recombining the domain-invariant identity features and the domain-specific enhancement features of the person images of the source domain and the target domain to obtain a first reconstructed feature vector and a second reconstructed feature vector;
and rearranging and combining the original characteristic vector of the character image of the source domain, the original characteristic vector of the character image of the target domain, the first reconstruction characteristic vector and the second reconstruction characteristic vector according to different orders to obtain a reconstruction characteristic vector group.
In an embodiment of the present invention, the step of recombining the domain-invariant identity features and the domain-specific enhanced features of the human images of the source domain and the target domain to obtain the first reconstructed feature vector and the second reconstructed feature vector includes:
recombining the domain-invariant identity features of the person image of the source domain and the domain-specific enhancement features of the person image of the target domain to obtain the first reconstructed feature vector;
recombining the domain-specific enhanced features of the person image of the source domain and the domain-invariant identity features of the person image of the target domain to obtain the second reconstructed feature vector.
In an embodiment of the present invention, the cross-domain face recognition loss function is:
wherein m isThe index number of the element in (1);representing the (cosine) similarity of the corresponding alignment;representing the corresponding positive pair of the nth pair of the negative pairs; τ represents a trainable temperature value initialized to 1.
In an embodiment of the present invention, the domain classification loss function is:
where $p(\cdot)$ denotes the probability, output by the trained domain classifier, that its input belongs to the source domain.
The second aspect of the present invention also provides a pedestrian re-identification method, including:
acquiring a figure image to be identified;
inputting the person image to be recognized into the pedestrian re-recognition model trained according to any one of the first aspect, extracting the feature vector of the person image to be recognized, calculating the similarity between this feature vector and the feature vector of each person image in the sample library, and comparing the similarity with a set threshold; if the similarity is greater than the threshold, the two images are judged to show the same person, otherwise they are judged not to show the same person, thereby obtaining the recognition result of the pedestrian re-recognition model.
The third aspect of the present invention also provides a pedestrian re-recognition system including:
the processing module is used for respectively extracting original feature vectors of character images of a source domain and a target domain of a training sample, and decomposing the original feature vectors through a pedestrian re-recognition model to obtain domain invariant identity features and domain specific enhancement features;
the reconstruction module is used for reconstructing the original characteristic vector, the domain invariant identity characteristic and the domain specific enhancement characteristic to obtain a reconstructed characteristic vector group;
the calculation module is used for inputting the reconstructed feature vector group into a cross-domain face recognition loss function and a domain classification loss function to calculate corresponding cross-domain face recognition loss and domain classification loss;
and the control training module is used for controlling all training samples to carry out circular iterative training, and selecting the model with the minimum sum of cross-domain face recognition loss and domain classification loss as the trained pedestrian re-recognition model.
The fourth aspect of the present invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the training method of the pedestrian re-recognition model according to any one of the first aspect when executing the computer program, or implements the pedestrian re-recognition method according to the second aspect when executing the computer program.
The fifth aspect of the present invention also provides a computer-readable storage medium storing a computer program, wherein the computer program is configured to implement a training method of a pedestrian re-recognition model according to any one of the first aspect when executed by a processor, or to implement a pedestrian re-recognition method according to the second aspect when executed by a processor.
As described above, the training method, the pedestrian re-recognition method and the system of the pedestrian re-recognition model according to the present invention have the following advantages:
According to the method, feature vectors of the person images from the source domain and the target domain are extracted, each feature vector is decomposed into a domain-invariant identity feature and a domain-specific enhancement feature, and cross-domain feature recombination is performed. The resulting reconstructed feature vector group not only increases the diversity of the samples used in training but also inherits the reliable identity labels of the source domain, and it represents the data distribution of the source domain and the target domain well. Combined with the supervision of the target loss functions (the cross-domain face recognition loss function and the domain classification loss function), a pedestrian re-recognition model with high recognition efficiency can be trained with fewer samples.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart illustrating a training method of a pedestrian re-identification model according to an embodiment of the present invention;
FIG. 2 is a schematic sub-flow chart of a training method for a pedestrian re-identification model according to an embodiment of the present invention;
FIG. 3 is a schematic sub-flow chart of a training method for a pedestrian re-identification model according to an embodiment of the present invention;
FIG. 4 is a schematic sub-flow chart of a training method for a pedestrian re-identification model according to an embodiment of the present invention;
FIG. 5 is a flow chart illustrating a method for pedestrian re-identification according to an embodiment of the present invention;
FIG. 6 is a schematic block diagram of a training system for a pedestrian re-identification model according to an embodiment of the present invention;
FIG. 7 is a schematic block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the drawings only show the components related to the present invention rather than the number, shape and size of the components in practical implementation, and the type, quantity and proportion of the components in practical implementation can be changed freely, and the layout of the components can be more complicated.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Referring to FIG. 1, a first embodiment of the present invention relates to a training method of a pedestrian re-recognition model. The pedestrian re-recognition model is used to extract the feature vector of a person image to be recognized, calculate the similarity between this feature vector and the feature vector of a person image in a sample library, and compare the similarity with a set threshold; if the similarity is greater than the threshold, the two images are determined to show the same person.
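The comparison described above can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure and a threshold of 0.8; neither is fixed by this passage, and the names `cosine_similarity`, `match_gallery` and the toy sample library are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, treat it as dissimilar
    return dot / (norm_a * norm_b)

def match_gallery(query, gallery, threshold=0.8):
    """Return the ids in the sample library whose similarity exceeds the threshold."""
    return [pid for pid, feat in gallery.items()
            if cosine_similarity(query, feat) > threshold]

# Toy sample library: person id -> feature vector (hypothetical values)
gallery = {"person_a": [1.0, 0.0, 1.0], "person_b": [0.0, 1.0, 0.0]}
print(match_gallery([0.9, 0.1, 1.1], gallery))
```

In practice the feature vectors would come from the trained model, and the threshold would be tuned on validation data.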
As shown in fig. 1, the training method of the pedestrian re-identification model of the embodiment includes:
Step 100: respectively extracting original feature vectors of the person images of the source domain and the target domain of a training sample, and decomposing the original feature vectors through the pedestrian re-recognition model to obtain domain-invariant identity features and domain-specific enhancement features.
Specifically, as shown in FIG. 2:
Step 110: preprocessing the person images of the source domain and the target domain of a training sample, and extracting the original feature vectors.
The source domain data carries labels, which are annotated in advance with the classification result of the source domain data; the target domain data carries no labels. The source domain data and the target domain data have certain commonalities and certain differences. In this embodiment, both the source domain data and the target domain data are person images.
The person images are a group of images continuously captured by a camera device. Before use, they need to be preprocessed, wherein the preprocessing comprises illumination adjustment, histogram equalization and normalization. The illumination adjustment should reduce the brightness of highlight regions, raise the brightness of shadow regions, and keep the brightness of transition regions unchanged; histogram equalization applies a gray-level transformation to the person image to facilitate smooth operation of the system; and normalization is applied to the pixel values of the person image, finally yielding standard images of a uniform form. In addition, the person image may be obtained through a camera or the internet. For example, it may be an image captured by an electronic device through the camera of a smart phone, a tablet computer, an electronic eye or the like; alternatively, it may be an image acquired by the electronic device through the internet, for example an image captured randomly from the internet or an image sent by another device and received through a social application installed on the electronic device. The source of the person image is not limited here.
It should be understood that the labeled person images in the source domain may be annotated by the user as needed, or may be obtained from an existing person image library.
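The three preprocessing operations can be sketched as follows for an 8-bit grayscale image stored as a nested list. This is only an illustrative sketch: the thresholds `low` and `high`, the `strength` factor and all function names are assumptions, not values given in this description.

```python
def adjust_illumination(img, low=64, high=192, strength=0.5):
    """Brighten shadows (< low), darken highlights (> high), keep the transition range."""
    out = []
    for row in img:
        new_row = []
        for v in row:
            if v < low:
                v = v + strength * (low - v)    # lift shadow pixels towards `low`
            elif v > high:
                v = v - strength * (v - high)   # pull highlight pixels towards `high`
            new_row.append(int(round(v)))
        out.append(new_row)
    return out

def equalize_histogram(img):
    """Gray-level transformation by histogram equalization (non-empty 8-bit image)."""
    flat = [v for row in img for v in row]
    hist = [0] * 256
    for v in flat:
        hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:                        # constant image: nothing to spread
        return [row[:] for row in img]
    scale = 255.0 / (total - cdf_min)
    return [[int(round((cdf[v] - cdf_min) * scale)) for v in row] for row in img]

def normalize(img):
    """Scale pixel values from [0, 255] to [0, 1]."""
    return [[v / 255.0 for v in row] for row in img]

def preprocess(img):
    """Apply the three steps in the order listed in the text."""
    return normalize(equalize_histogram(adjust_illumination(img)))

print(preprocess([[0, 64], [128, 255]]))
```

A production pipeline would more likely rely on an image library for these operations; the plain-list version is used here only to keep the example self-contained.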
Continuing, after the person images are preprocessed, the feature vectors of the source domain and the target domain, namely the original feature vectors, are extracted. The feature vectors can be extracted in various ways: for example, with a residual network (ResNet) of some depth, such as ResNet-50, ResNet-34 or ResNet-152; with a deep convolutional neural network such as VGG; or with a dense convolutional network (DenseNet) or a neural architecture search network (NASNet). In addition, the full-scale network of this embodiment may also be used to extract the original feature vectors. Since extracting feature vectors from an image is a conventional technical means in the art, it is not described in detail here.
It should be understood that the domain-invariant identity feature is a feature that is independent of the domain to which the training data belongs and does not vary with domain differences. Taking the application scenario of pedestrian re-identification as an example, the identification information of a pedestrian does not change with changes in the external environment, such as the clothing, posture and hairstyle of the pedestrian; meanwhile, the target object of the target detection task is the pedestrian in the person image, so the identification information of the pedestrian is the domain-invariant identity feature to be extracted. In the learning scenario of target detection, the purpose is to accurately extract the identification information of pedestrians from the acquired person images to complete the target detection task.
The domain-specific enhancement feature characterizes the domain to which the training data belongs; it is specific to that domain and changes with domain differences. For example, in the application scenario of pedestrian re-recognition, the background behind a pedestrian is irrelevant to the identification information of the pedestrian: pedestrian recognition does not need such features, and they vary from domain to domain.
The domain-invariant features and the domain-specific enhancement features together characterize the data distribution of the source domain and the target domain, and the domain-invariant identity features of different domains are exchangeable between domains without disrupting the distribution of each domain.
Further, in this embodiment, the pedestrian re-identification model adopts a full-scale network (Omni-Scale Network, OSNet) to extract the pedestrian features.
It should be appreciated that pedestrian re-identification (ReID) relies on discriminative features that not only capture different spatial scales but also encapsulate arbitrary combinations of multiple scales; these homogeneous-scale and heterogeneous-scale features are referred to as full-scale features. The OSNet network can be used for full-scale feature learning in ReID. It is built from residual blocks composed of multiple convolutional feature streams, each of which detects features at a certain scale. Importantly, the OSNet network also introduces a unified aggregation gate that dynamically fuses the multi-scale features using input-dependent channel-wise weights. To learn spatial-channel correlations efficiently and avoid overfitting, the building blocks use both pointwise and depthwise convolutions. By stacking these blocks layer by layer, the OSNet network remains very lightweight and can be trained from scratch on existing ReID benchmarks.
Specifically, the process of extracting the pedestrian features through the OSNet network is as follows: given person images of the source domain and the target domain, an original feature vector $F \in \mathbb{R}^{C \times H \times W}$ with $C$ channels and spatial resolution $H \times W$ is extracted from each person image, and the original feature vector $F$ is decomposed into a domain-invariant identity feature $B$ and a domain-specific enhancement feature $E$, which satisfy:
F=B+E;
the domain invariant identity feature B is a basic feature of the identity of a person and is dominant in the process of identifying the identity of the person; the domain-specific enhancement feature E is complementary to the former.
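The decomposition is computed as follows:
$$B = (1 - O(F)) \odot F,$$
$$E = O(F) \odot F,$$
wherein $F$ is the original feature vector; $B$ is the domain-invariant identity feature; $E$ is the domain-specific enhancement feature; $\odot$ denotes element-wise multiplication; and $O(\cdot)$ is the response of the OSNet network, which, consistent with the unified aggregation gate of OSNet, can be written as
$$O(F) = \sum_{t=1}^{T} G(F^{t}) \odot F^{t},$$
wherein $G(F^{t})$ is a vector whose length spans the entire channel dimension of the input $F^{t}$, and the index $t = 1, \dots, T$ denotes the feature scale.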
G is implemented as a mini-network consisting of a non-parametric global average pooling layer and a multi-layer perceptron (MLP) with one ReLU-activated hidden layer, followed by a Sigmoid activation.
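The split into B and E can be sketched in a few lines. As a simplifying assumption, the sketch uses a single channel-wise gate in the style of G (global average pooling, one ReLU hidden layer, Sigmoid) as a stand-in for the full OSNet response O(F); the weights `W1` and `W2` are hypothetical fixed numbers without biases, and the feature map is a toy example with each channel flattened over its spatial positions.

```python
import math

def gate(feature, w1, w2):
    """Mini network in the style of G: global average pooling -> ReLU layer -> Sigmoid."""
    pooled = [sum(ch) / len(ch) for ch in feature]             # one value per channel
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w2]                                      # one weight in (0, 1) per channel

def decompose(feature, response):
    """Split F into B = (1 - O(F)) * F and E = O(F) * F, channel by channel."""
    identity = [[(1.0 - g) * v for v in ch] for ch, g in zip(feature, response)]
    enhancement = [[g * v for v in ch] for ch, g in zip(feature, response)]
    return identity, enhancement

# Toy feature map: C = 2 channels, each flattened over H x W = 2 positions
F = [[1.0, 3.0], [2.0, 4.0]]
W1 = [[0.5, -0.25]]       # hidden layer: 2 channels -> 1 unit (hypothetical weights)
W2 = [[1.0], [-1.0]]      # output layer: 1 unit -> 2 channels (hypothetical weights)
O = gate(F, W1, W2)       # stand-in for the OSNet response O(F)
B, E = decompose(F, O)
print(O, B, E)
```

Because the two masks `1 - O(F)` and `O(F)` sum to one, the sketch preserves F = B + E by construction.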
For a person image $i$ from the source domain $D_s$, its original feature vector $F_i^s$ is extracted and decomposed by the OSNet network into a domain-invariant identity feature $B_i$ and a domain-specific enhancement feature $E_i^s$, so that $F_i^s = B_i + E_i^s$.
For a person image $j$ from the target domain $D_t$, its original feature vector $F_j^t$ is extracted and decomposed by the OSNet network into a domain-invariant identity feature $B_j$ and a domain-specific enhancement feature $E_j^t$, so that $F_j^t = B_j + E_j^t$.
Step 200: reconstructing the original feature vectors, the domain-invariant identity features and the domain-specific enhancement features to obtain a reconstructed feature vector group.
In particular, since domain-invariant identity signatures of different domains are exchangeable between domains, without disrupting the distribution of each domain, cross-domain signature reorganization may be performed in order to increase the diversity of training samples.
As shown in fig. 3, the step 200 includes:
recombining the domain-invariant identity feature and the domain-specific enhancement feature of the source-domain person image with those of the target-domain person image to obtain a first reconstructed feature vector and a second reconstructed feature vector. Note that each of the two reconstructed feature vectors comprises one domain-invariant identity feature and one domain-specific enhancement feature.
As shown in fig. 4, specifically:
The domain-invariant identity feature $B_i$ of the source domain is combined with the domain-specific enhancement feature $E_j^t$ of the target domain to obtain the first reconstructed feature vector, which, consistent with $F = B + E$, equals $B_i + E_j^t$.
The domain-specific enhancement feature $E_i^s$ of the source domain is combined with the domain-invariant identity feature $B_j$ of the target domain to obtain the second reconstructed feature vector, which equals $B_j + E_i^s$.
For the first or the second reconstructed feature vector, the identification information is inherited from the domain-invariant identity feature $B$, while the domain information is inherited from the domain-specific enhancement feature $E$.
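The cross-domain recombination can be sketched as below. The sketch assumes that "combining" means element-wise addition, in line with F = B + E; the function names and the toy vectors are hypothetical.

```python
def add(u, v):
    """Element-wise sum of two equal-length feature vectors."""
    return [a + b for a, b in zip(u, v)]

def recombine(b_src, e_src, b_tgt, e_tgt):
    """Swap the domain-specific parts between a source image and a target image."""
    first = add(b_src, e_tgt)    # source identity + target domain information
    second = add(b_tgt, e_src)   # target identity + source domain information
    return first, second

# Toy decomposed features (hypothetical values)
b_i, e_i = [1.0, 2.0], [0.1, 0.2]   # source-domain image i
b_j, e_j = [3.0, 4.0], [0.3, 0.4]   # target-domain image j
first, second = recombine(b_i, e_i, b_j, e_j)
print(first, second)
```

Under this reading, the first reconstructed vector keeps the identity of the labeled source image, which is why it can inherit the source identity label.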
Through step S210, for two person images of two persons, one from the source domain and one from the target domain, four feature vectors are obtained: the original feature vector $F_i^s$ of the source domain, the original feature vector $F_j^t$ of the target domain, the first reconstructed feature vector and the second reconstructed feature vector. These four feature vectors are arranged and combined in different orders to obtain the reconstructed feature vector groups. The reconstructed feature vector groups increase the diversity of the training samples, inherit reliable identity labels, and represent the data distribution of the target and source domains well; after training under the loss functions, the recombined features support reliable identity inheritance and approximate the actual distribution.
It should be noted that in this embodiment, the number of reconstructed feature vector groups obtained by recombination in different permutation and combination manners is 24. However, in actual use, the permutation and combination method may be set according to actual needs, and the number of reconstructed feature vector groups varies accordingly. For example: one reconstructed feature vector set in this embodiment isAnother reconstructed set of feature vectors is
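The count of 24 corresponds to the 4! orderings of four feature vectors, which can be checked with a short sketch. The placeholder names for the four vectors and the helper `sample_groups` are hypothetical; drawing a subset is shown only as one way the number of groups could be varied.

```python
from itertools import permutations
import random

# Placeholder labels for the four feature vectors (hypothetical names)
vectors = ["F_src", "F_tgt", "F_rec1", "F_rec2"]

# Every ordering of the four vectors forms one reconstructed feature vector group
groups = list(permutations(vectors))
print(len(groups))

def sample_groups(all_groups, k, seed=None):
    """Draw k distinct groups at random, for setups that do not use every ordering."""
    return random.Random(seed).sample(all_groups, k)

subset = sample_groups(groups, 6, seed=0)
```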
Step 300: inputting the reconstructed feature vector group into a cross-domain face recognition loss function and a domain classification loss function to calculate the corresponding cross-domain face recognition loss and domain classification loss.
Specifically, in the process of training the deep neural network, because the output of the deep neural network is expected to be as close as possible to the value really expected to be predicted, the weight vector of each layer of the neural network can be updated according to the difference between the predicted value of the current network and the target value really expected (of course, an initialization process is usually performed before the first update, that is, parameters are configured in advance for each layer in the deep neural network). For example, if the predicted value of the network is high, the weight vector is adjusted to make the predicted value lower, and the adjustment is continued until the deep neural network can predict the real desired target value or a value very close to the real desired target value. Therefore, it is necessary to define in advance "how to compare the difference between the predicted value and the target value", which are loss functions (loss functions), which are important equations for measuring the difference between the predicted value and the target value. Taking the loss function as an example, if the higher the output value (loss) of the loss function indicates the larger the difference, the training of the deep neural network becomes the process of reducing the loss as much as possible.
In this embodiment, the objective loss function used includes: cross-domain face recognition loss function LCIDSum domain classification loss function LDomainAnd respectively supervising the training of the pedestrian re-recognition model from the character similarity degree and the prediction probability by simultaneously using the cross-domain face recognition loss function and the domain classification loss function.
The number of the reconstructed feature vector groups obtained for one training sample is 24, and the reconstructed feature vector group input to the target loss function may be all 24 reconstructed feature vector groups or may be some of the 24 reconstructed feature vector groups extracted at random. Further, since the objective function includes a cross-domain face recognition loss function and a domain classification loss function, the reconstructed feature vector groups input to the cross-domain face recognition loss function and the domain classification loss function may be the same or different.
Further, the cross-domain face recognition loss function is used for measuring the similarity of the people in the two images, and the expression is as follows:
where m is the index of an element in the reconstructed feature vector group input into the cross-domain face recognition loss function, the similarity term represents the (cosine) similarity of the corresponding pair, and n indexes the corresponding positive and negative pairs. τ represents a trainable temperature value initialized to 1.
The reconstructed feature vector group is input into the cross-domain face recognition loss function. Taking each element in the group as an anchor point in turn, features of the same identity are pulled closer together and features of different identities are pushed apart; a first loss value is calculated, and a first parameter of the pedestrian re-recognition model to be trained is adjusted according to the first loss value by a back propagation algorithm.
It should be noted that the anchor point image is a person image carrying a label in the source domain, and the positive sample image corresponding to the anchor point image is a training image having the same pedestrian identification information as that in the anchor point image; the negative sample image corresponding to the anchor point image is a training image different from the pedestrian identification information in the anchor point image. Thus, there are one pair of positive samples and two pairs of negative samples in the combined set of reconstructed feature vectors that are related to the identity of the person.
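The anchor-based pulling and pushing can be sketched as a temperature-scaled contrastive loss over cosine similarities. The exact expression of the loss is not reproduced in this text, so the softmax-style form below is an assumption chosen only to illustrate the idea; the function names, the identity labels and the feature values are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_identity_loss(features, identities, tau=1.0):
    """Assumed form: each element acts as anchor; same-identity pairs are positives."""
    total = 0.0
    for m, anchor in enumerate(features):
        positives, negatives = 0.0, 0.0
        for k, other in enumerate(features):
            if k == m:
                continue
            score = math.exp(cosine(anchor, other) / tau)
            if identities[k] == identities[m]:
                positives += score
            else:
                negatives += score
        total += -math.log(positives / (positives + negatives))
    return total / len(features)


# Group of four vectors: two carry identity 0, two carry identity 1,
# so every anchor has one positive pair and two negative pairs.
group = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]]
aligned_loss = contrastive_identity_loss(group, [0, 1, 0, 1])
misaligned_loss = contrastive_identity_loss(group, [0, 1, 1, 0])
```

With these values the loss is lower when vectors sharing an identity label are also close in direction, which is the behaviour the anchor-based training is meant to encourage. In a real model τ would be a trainable parameter rather than a fixed argument.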
The domain classification loss function $L_{Domain}$ is a cross-entropy-based loss function for calculating the probability of predicting a target domain sample as a source domain sample, and the expression is:
where $p(\cdot)$ represents the probability that the trained domain classifier classifies its input as belonging to the source domain.
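A cross-entropy domain loss of this kind can be sketched as follows. The exact expression is not reproduced in this text, so the standard binary cross-entropy form is assumed here; the function name and the label convention (1 for source domain, 0 for target domain) are hypothetical.

```python
import math


def domain_classification_loss(p_source, is_source):
    """Assumed binary cross-entropy over the classifier's source-domain probabilities."""
    eps = 1e-12  # keeps log() finite when a probability is exactly 0 or 1
    total = 0.0
    for p, y in zip(p_source, is_source):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(p_source)


# One source sample and one target sample, scored by a hypothetical domain classifier.
confident_correct = domain_classification_loss([0.9, 0.1], [1, 0])
confident_wrong = domain_classification_loss([0.1, 0.9], [1, 0])
```

The loss is small when the classifier assigns a high source probability to source samples and a low one to target samples, and large when it does the opposite.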
In this embodiment, another reconstructed feature vector group is input into the domain classification loss function, a second loss value is calculated, and a second parameter of the pedestrian re-recognition model to be trained is adjusted according to the second loss value by a back propagation algorithm.
Loop iterative training is performed on all training samples according to steps 100-300 until all training samples have been iterated, and the weight values corresponding to the minimum sum of the first loss value and the second loss value over the multiple rounds of training are taken as the weight values of the trained pedestrian re-recognition model.
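The selection rule at the end of training can be sketched as keeping the weights from the iteration whose summed loss is smallest. The function name `select_best_weights`, the history format and the numbers below are hypothetical and serve only to illustrate the rule.

```python
def select_best_weights(history):
    """history: iterable of (first_loss, second_loss, weights), one entry per iteration."""
    best_total = float("inf")
    best_weights = None
    for first_loss, second_loss, weights in history:
        total = first_loss + second_loss
        if total < best_total:  # keep the weights with the smallest summed loss so far
            best_total, best_weights = total, weights
    return best_total, best_weights


# Hypothetical record of three training iterations.
history = [
    (0.90, 0.70, "weights_iter_1"),
    (0.40, 0.35, "weights_iter_2"),
    (0.45, 0.50, "weights_iter_3"),
]
best_total, best_weights = select_best_weights(history)
```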
In the optimization process, the cross-domain face recognition loss function $L_{CID}$ and the domain classification loss function $L_{Domain}$ constrain each other to suppress trivial solutions of re-identification and domain classification. Meanwhile, joint learning under $L_{CID}$ and $L_{Domain}$ drives $B_i$ and $B_j$ to learn domain-shared base features, and drives $E_i^s$ and $E_j^t$ to learn domain-specific enhancement features.
It can be seen that, in the embodiment, by extracting feature vectors of character images from a source domain and a target domain, decomposing each feature vector into a domain-invariant identity feature and a domain-specific enhancement feature, and performing cross-domain feature recombination, an obtained reconstructed feature vector group not only increases the diversity of samples used in training, but also inherits reliable identity labels in the source domain, and can well represent the data distribution of the source domain and the target domain; through the decomposition and combination of the target loss function, a pedestrian re-recognition model with high recognition efficiency can be trained under the condition of fewer samples.
Referring to fig. 5, a second embodiment of the present invention relates to a pedestrian re-identification method, including:
Specifically, the person image is one of a group of person images continuously acquired by the camera device. Before use, the person image must be preprocessed to obtain a preprocessed person image, where the preprocessing comprises illumination adjustment, histogram equalization, and normalization. Illumination adjustment should reduce the brightness of highlight areas, raise the brightness of shadow areas, and keep the brightness of transition areas unchanged; histogram equalization applies a gray-level transformation to the person image to facilitate subsequent processing by the system; and normalization is applied to the pixel values of the person image so that standard images of a uniform form are finally obtained. In addition, the person image may be obtained through a camera or the Internet. For example, the image to be detected may be an image acquired by an electronic device through the camera of a smart phone, a tablet computer, an electronic eye, or the like; alternatively, it may be an image acquired by the electronic device through the Internet, for example an image captured at random from the Internet, or an image sent by another device and received by the electronic device through a social application installed on it. The source of the person image is not limited here.
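Two of the preprocessing steps, histogram equalization and normalization, can be sketched in plain Python for an 8-bit grayscale image stored as a list of rows. This is an illustrative sketch only: illumination adjustment is omitted, the function names are hypothetical, and a production system would normally rely on an image processing library.

```python
def equalize_histogram(image):
    """Histogram-equalize an 8-bit grayscale image given as a list of pixel rows."""
    pixels = [v for row in image for v in row]
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [0] * 256, 0
    for level in range(256):
        running += hist[level]
        cdf[level] = running
    cdf_min = next(c for c in cdf if c > 0)
    span = len(pixels) - cdf_min
    if span == 0:  # constant image: nothing to equalize
        return [row[:] for row in image]
    return [[round((cdf[v] - cdf_min) / span * 255) for v in row] for row in image]


def normalize(image):
    """Scale 8-bit pixel values into the range [0, 1]."""
    return [[v / 255.0 for v in row] for row in image]


gray = [[50, 50], [100, 200]]          # hypothetical 2x2 grayscale image
equalized = equalize_histogram(gray)   # darkest level maps to 0, brightest to 255
standard = normalize(equalized)
```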
Specifically, the pedestrian re-identification model is obtained by pre-training, wherein the training step comprises the following steps:
For a person image i from the source domain $D_s$, the original feature vector $F_i^s$ is extracted and decomposed through the OSNet network into a domain-invariant identity feature $B_i$ and a domain-specific enhancement feature $E_i^s$, the expression after decomposition being $F_i^s = B_i + E_i^s$.
For a person image j from the target domain $D_t$, the original feature vector $F_j^t$ is extracted and decomposed through the OSNet network into a domain-invariant identity feature $B_j$ and a domain-specific enhancement feature $E_j^t$, the expression after decomposition being $F_j^t = B_j + E_j^t$.
The domain-invariant identity feature $B_i$ and domain-specific enhancement feature $E_i^s$ of the source domain are combined with the domain-invariant identity feature $B_j$ and domain-specific enhancement feature $E_j^t$ of the target domain to obtain the first and second reconstructed feature vectors, the expressions of which are as follows:
The original feature vectors $F_i^s$ and $F_j^t$ and the reconstructed feature vectors are arranged and combined with one another in different orders to obtain reconstructed feature vector groups.
One reconstructed feature vector group is input into the cross-domain face recognition loss function, a first loss value is calculated, and a first parameter of the pedestrian re-recognition model to be trained is adjusted according to the first loss value by a back propagation algorithm; the expression of the cross-domain face recognition loss function is as follows:
where m is the index of an element in the reconstructed feature vector group, the similarity term represents the (cosine) similarity of the corresponding pair, and n indexes the corresponding positive and negative pairs. τ represents a trainable temperature value initialized to 1.
Another reconstructed feature vector group is input into the domain classification loss function, a second loss value is calculated, and a second parameter of the pedestrian re-recognition model to be trained is adjusted according to the second loss value by a back propagation algorithm; the expression of the domain classification loss function is:
where $p(\cdot)$ represents the probability that the trained domain classifier classifies its input as belonging to the source domain.
The above steps are repeated in loop iterative training until the set number of iterations is completed, and the model corresponding to the minimum sum of the first loss value and the second loss value over the multiple rounds of training is taken as the trained pedestrian re-recognition model.
The preprocessed person image to be recognized is input into the pedestrian re-recognition model, and the similarity between the feature vector of the person image to be recognized and the feature vector of each person image in the sample library is calculated and compared with a set threshold. If the similarity is greater than the threshold, the two images are judged to show the same person; otherwise, they are judged not to show the same person. This yields the recognition result of the pedestrian re-recognition model. In this embodiment, the sample library may be the source domain, and the person image to be identified may come from the target domain. In the testing process, the target image j is input into the trained pedestrian re-recognition model, and the feature composed of domain-shared base information and domain-specific enhancement information is used for person matching to obtain a matching result.
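The threshold comparison can be sketched as follows, assuming cosine similarity as the similarity measure and a sample library stored as a mapping from person identifier to feature vector. The function names, the identifiers and the threshold value are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def match_identities(query, library, threshold):
    """Return the library identifiers whose similarity to the query exceeds the threshold."""
    return [
        person_id
        for person_id, feature in library.items()
        if cosine_similarity(query, feature) > threshold
    ]


# Hypothetical query feature and sample library.
query_feature = [1.0, 0.0]
sample_library = {"person_a": [0.9, 0.1], "person_b": [0.0, 1.0]}
matches = match_identities(query_feature, sample_library, threshold=0.8)
```

Here only `person_a` exceeds the threshold, so the query image is judged to show that person and not `person_b`.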
Therefore, in the embodiment, the pedestrian re-recognition result is obtained by inputting the acquired figure image to be recognized into the trained pedestrian re-recognition model; the pedestrian re-recognition model decomposes each feature vector into a domain-invariant identity feature and a domain-specific enhancement feature by extracting the feature vectors of the figure images from the source domain and the target domain, and performs cross-domain feature recombination, so that the obtained reconstructed feature vector group not only increases the diversity of samples used in training, but also inherits reliable identity labels in the source domain, and can well represent the data distribution of the source domain and the target domain; through the decomposition and combination of the objective loss function, fewer samples are required and the efficiency of identification is high.
Referring to fig. 6, a third embodiment of the present invention relates to a training system of a pedestrian re-identification model, including:
the processing module 601 is configured to extract original feature vectors of character images of a source domain and a target domain of a training sample, and decompose the original feature vectors through a pedestrian re-recognition model to obtain a domain invariant identity feature and a domain specific enhancement feature;
the character images comprise character images with labels in the source domain and character images without labels in the target domain, and the character images with the labels in the source domain can be marked by a user according to needs and can also be obtained from an existing character image library.
Further, the person image is one of a group of person images continuously acquired by the camera device. Before use, the person image must be preprocessed to obtain a preprocessed person image, where the preprocessing comprises illumination adjustment, histogram equalization, and normalization. Illumination adjustment should reduce the brightness of highlight areas, raise the brightness of shadow areas, and keep the brightness of transition areas unchanged; histogram equalization applies a gray-level transformation to the person image to facilitate subsequent processing by the system; and normalization is applied to the pixel values of the person image so that standard images of a uniform form are finally obtained. In addition, the person image may be obtained through a camera or the Internet. For example, the image to be detected may be an image acquired by an electronic device through the camera of a smart phone, a tablet computer, an electronic eye, or the like; alternatively, it may be an image acquired by the electronic device through the Internet, for example an image captured at random from the Internet, or an image sent by another device and received by the electronic device through a social application installed on it. The source of the person image is not limited here.
Further, for a person image i from the source domain $D_s$, the original feature vector $F_i^s$ is extracted and decomposed through the OSNet network into a domain-invariant identity feature $B_i$ and a domain-specific enhancement feature $E_i^s$, the expression after decomposition being $F_i^s = B_i + E_i^s$.
For a person image j from the target domain $D_t$, the original feature vector $F_j^t$ is extracted and decomposed through the OSNet network into a domain-invariant identity feature $B_j$ and a domain-specific enhancement feature $E_j^t$, the expression after decomposition being $F_j^t = B_j + E_j^t$.
A reconstructing module 602, configured to reconstruct the original feature vector, the domain-invariant identity feature, and the domain-specific enhancement feature to obtain a reconstructed feature vector group;
combining the domain invariant identity feature and the domain specific enhancement feature of the source domain with the domain invariant identity feature and the domain specific enhancement feature of the target domain to obtain a plurality of reconstructed feature vectors; wherein each reconstructed feature vector comprises a domain-invariant identity feature and a domain-specific enhancement feature.
Further, the domain-invariant identity feature $B_i$ and domain-specific enhancement feature $E_i^s$ of the source domain are combined with the domain-invariant identity feature $B_j$ and domain-specific enhancement feature $E_j^t$ of the target domain to obtain the first and second reconstructed feature vectors, the expressions of which are as follows:
further arranging and combining the original characteristic vectors and the reconstructed characteristic vectors in different orders to obtain a reconstructed characteristic vector group;
Further, the original feature vectors $F_i^s$ and $F_j^t$ and the reconstructed feature vectors are arranged and combined with one another in different orders to obtain a plurality of different reconstructed feature vector groups, for example:
a calculating module 603, configured to input the reconstructed feature group into a cross-domain face recognition loss function and a domain classification loss function to calculate a corresponding cross-domain face recognition loss and a corresponding domain classification loss;
In this embodiment, one of the reconstructed feature vector groups is input into the cross-domain face recognition loss function, a first loss value is calculated, and a first parameter of the pedestrian re-recognition model to be trained is adjusted according to the first loss value by a back propagation algorithm; the expression of the cross-domain face recognition loss function is as follows:
where m is the index of an element in the reconstructed feature vector group, the similarity term represents the (cosine) similarity of the corresponding pair, and n indexes the corresponding positive and negative pairs. τ represents a trainable temperature value initialized to 1.
Another reconstructed feature vector group is input into the domain classification loss function, a second loss value is calculated, and a second parameter of the pedestrian re-recognition model to be trained is adjusted according to the second loss value by a back propagation algorithm; the expression of the domain classification loss function is:
where $p(\cdot)$ represents the probability that the trained domain classifier classifies its input as belonging to the source domain.
And the control training module 604 is used for controlling all training samples to carry out the training of loop iteration, and selecting the model with the minimum sum of cross-domain face recognition loss and domain classification loss as the trained pedestrian re-recognition model.
And repeating the steps, carrying out the cycle iteration training until all the training samples finish the cycle iteration, and taking the model corresponding to the minimum sum of the first loss value and the second loss value in the multiple times of training as the trained pedestrian re-identification model.
As can be seen, in the present embodiment, the person image is acquired by the acquisition module; extracting feature vectors in the character image through a processing module, and decomposing the original feature vectors through a pedestrian re-identification model to obtain domain invariant identity features and domain specific enhancement features; performing feature recombination through a reconstruction module to obtain a reconstructed feature vector group; and finally, training to obtain a pedestrian re-recognition model through a calculation module and a control training module. The pedestrian re-recognition model decomposes each feature vector into a domain-invariant identity feature and a domain-specific enhancement feature by extracting the feature vectors of the figure images from the source domain and the target domain, and performs cross-domain feature recombination, so that the obtained reconstructed feature vector group not only increases the diversity of samples used in training, but also inherits reliable identity labels in the source domain, and can well represent the data distribution of the source domain and the target domain; through the decomposition and combination of the objective loss function, fewer samples are required and the efficiency of identification is high.
Referring to fig. 7, a fourth embodiment of the present invention relates to a computer device, which includes a memory 701, a processor 702, and a computer program stored in the memory 701 and executable on the processor 702, wherein the processor 702 implements the method for training the pedestrian re-recognition model according to any one of the first embodiment when executing the computer program, or the processor 702 implements the method for recognizing the pedestrian according to the second embodiment when executing the computer program.
The memory 701 and the processor 702 are coupled by a bus, which may comprise any number of interconnecting buses and bridges that couple one or more of the various circuits of the processor 702 and the memory 701 together. The bus may also connect various other circuits such as peripheral devices 703, voltage regulators 704, and power management circuits, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. Data processed by the processor 702 may be transmitted over a wireless medium through an antenna, which may receive the data and transmit the data to the processor 702.
The processor 702 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And memory 701 may be used for storing data used by processor 702 in performing operations.
Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
A fifth embodiment of the present invention relates to a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a training method for a pedestrian re-recognition model as described in any one of the first embodiments above, or which, when executed by a processor, implements a pedestrian re-recognition method as described in the second embodiments above.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In summary, the training method, the pedestrian re-recognition method and the system for the pedestrian re-recognition model of the present invention extract the feature vectors of the character images from the source domain and the target domain, decompose each feature vector into a domain-invariant identity feature and a domain-specific enhancement feature, and perform cross-domain feature reorganization to increase the diversity of the samples used in the training; the recombined characteristics inherit the reliable identity label and can well represent the data distribution of the source domain and the target domain; in addition, the reorganized features are decomposed and combined under the constraint of cross-domain face recognition loss and domain classification loss, and the recognition efficiency is further improved. Therefore, the invention effectively overcomes various defects in the prior art and has high industrial utilization value.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.
Claims (10)
1. A training method of a pedestrian re-identification model is characterized by comprising the following steps:
respectively extracting original feature vectors of character images of a source domain and a target domain of a training sample, and decomposing the original feature vectors through a pedestrian re-recognition model to obtain domain invariant identity features and domain specific enhancement features;
reconstructing the original characteristic vector, the domain invariant identity characteristic and the domain specific enhancement characteristic to obtain a reconstructed characteristic vector group;
inputting the reconstructed feature vector group into a cross-domain face recognition loss function and a domain classification loss function to calculate corresponding cross-domain face recognition loss and domain classification loss;
and circularly iterating the steps until the training of all the training samples is completed, and selecting the model with the minimum sum of cross-domain face recognition loss and domain classification loss as the trained pedestrian re-recognition model.
2. The training method of the pedestrian re-recognition model according to claim 1, characterized in that: the step of respectively extracting the original feature vectors of the character images of one source domain and one target domain of the training sample, and obtaining the domain-invariant identity features and the domain-specific enhancement features from the original feature vectors through a pedestrian re-recognition model comprises the following steps:
respectively extracting original feature vectors of the character images of the source domain and the target domain in a training sample;
obtaining the domain-invariant identity features and the domain-specific enhancement features through full-scale network OSNet decomposition:
B=(1-O(F))⊙F,
E=O(F)⊙F,
wherein F is a feature vector; B is the domain-invariant identity feature; E is the domain-specific enhancement feature; ⊙ denotes element-by-element multiplication; O(·) is a response of the OSNet network, and
wherein T = 4; $G(F^t)$ is a vector whose length spans the entire channel dimension of the input $F^t$.
3. The training method of the pedestrian re-identification model according to claim 1, wherein the step of reconstructing the original feature vectors, the domain-invariant identity features and the domain-specific enhancement features to obtain the reconstructed feature vector set comprises:
recombining the domain-invariant identity features and the domain-specific enhancement features of the person images of the source domain and the target domain to obtain a first reconstructed feature vector and a second reconstructed feature vector;
and rearranging and combining the original characteristic vector of the character image of the source domain, the original characteristic vector of the character image of the target domain, the first reconstruction characteristic vector and the second reconstruction characteristic vector according to different orders to obtain a reconstruction characteristic vector group.
4. The method for training a pedestrian re-recognition model according to claim 3, wherein the step of recombining the domain-invariant identity features and the domain-specific enhancement features of the human images of the source domain and the target domain to obtain the first reconstructed feature vector and the second reconstructed feature vector comprises:
recombining the domain-invariant identity features of the person image of the source domain and the domain-specific enhancement features of the person image of the target domain to obtain the first reconstructed feature vector;
recombining the domain-specific enhanced features of the person image of the source domain and the domain-invariant identity features of the person image of the target domain to obtain the second reconstructed feature vector.
5. The training method of the pedestrian re-recognition model according to claim 1, wherein the cross-domain face recognition loss function is:
7. A pedestrian re-identification method is characterized in that: the method comprises the following steps:
acquiring a figure image to be identified;
inputting a character image to be recognized into the pedestrian re-recognition model according to any one of claims 1 to 6, extracting a feature vector in the character image to be recognized, calculating the similarity between the feature vector in the character image to be recognized and the feature vector of the character image in the sample library, comparing the similarity with a set threshold, if the similarity is larger than the threshold, judging that the face image is the same person, otherwise, judging that the face image is not the same person, and obtaining the recognition result of the pedestrian re-recognition model.
8. A training system for a pedestrian re-recognition model, comprising:
the processing module is used for respectively extracting original feature vectors of character images of a source domain and a target domain of a training sample, and decomposing the original feature vectors through a pedestrian re-recognition model to obtain domain invariant identity features and domain specific enhancement features;
the reconstruction module is used for reconstructing the original characteristic vector, the domain invariant identity characteristic and the domain specific enhancement characteristic to obtain a reconstructed characteristic vector group;
the calculation module is used for inputting the reconstruction characteristic vector group into a cross-domain face recognition loss function and a domain classification loss function to calculate corresponding cross-domain face recognition loss and domain classification loss;
and the control training module is used for controlling all training samples to carry out circular iterative training, and selecting the model with the minimum sum of cross-domain face recognition loss and domain classification loss as the trained pedestrian re-recognition model.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements a training method of a pedestrian re-recognition model according to any one of claims 1 to 6 when executing the computer program or implements a pedestrian re-recognition method according to claim 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out a method for training a pedestrian re-recognition model according to any one of claims 1 to 6, or which, when being executed by a processor, carries out a method for pedestrian re-recognition according to claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111131114.6A CN113869193B (en) | 2021-09-26 | 2021-09-26 | Training method of pedestrian re-recognition model, pedestrian re-recognition method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111131114.6A CN113869193B (en) | 2021-09-26 | 2021-09-26 | Training method of pedestrian re-recognition model, pedestrian re-recognition method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113869193A (en) | 2021-12-31 |
CN113869193B (en) | 2024-06-28 |
Family
ID=78990805
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111131114.6A Active CN113869193B (en) | 2021-09-26 | 2021-09-26 | Training method of pedestrian re-recognition model, pedestrian re-recognition method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113869193B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114596581A (en) * | 2022-02-17 | 2022-06-07 | 复旦大学 | Method for confirming human body identity of intelligent unmanned supermarket |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110414462A (en) * | 2019-08-02 | 2019-11-05 | 中科人工智能创新技术研究院(青岛)有限公司 | Unsupervised cross-domain pedestrian re-identification method and system |
WO2019228358A1 (en) * | 2018-05-31 | 2019-12-05 | 华为技术有限公司 | Deep neural network training method and apparatus |
CN111476168A (en) * | 2020-04-08 | 2020-07-31 | 山东师范大学 | Cross-domain pedestrian re-identification method and system based on three stages |
WO2020186914A1 (en) * | 2019-03-20 | 2020-09-24 | 北京沃东天骏信息技术有限公司 | Person re-identification method and apparatus, and storage medium |
CN111881714A (en) * | 2020-05-22 | 2020-11-03 | 北京交通大学 | Unsupervised cross-domain pedestrian re-identification method |
US20210064907A1 (en) * | 2019-08-27 | 2021-03-04 | Nvidia Corporation | Cross-domain image processing for object re-identification |
DE102020122345A1 (en) * | 2019-08-27 | 2021-03-04 | Nvidia Corporation | CROSS-DOMAIN IMAGE PROCESSING FOR OBJECT REALIZATION |
CN112990120A (en) * | 2021-04-25 | 2021-06-18 | 昆明理工大学 | Cross-domain pedestrian re-identification method using camera style separation domain information |
CN113065516A (en) * | 2021-04-22 | 2021-07-02 | 中国矿业大学 | Unsupervised pedestrian re-identification system and method based on sample separation |
Also Published As
Publication number | Publication date |
---|---|
CN113869193B (en) | 2024-06-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Misra et al. | Shuffle and learn: unsupervised learning using temporal order verification | |
Wang et al. | Unsupervised learning of visual representations using videos | |
CN111783831B (en) | Complex image accurate classification method based on multi-source multi-label shared subspace learning | |
CN108197326B (en) | Vehicle retrieval method and device, electronic equipment and storage medium | |
Boussaad et al. | Deep-learning based descriptors in application to aging problem in face recognition | |
CN114419671B (en) | Super-graph neural network-based pedestrian shielding re-identification method | |
CN113516227B (en) | Neural network training method and device based on federal learning | |
CN110647938B (en) | Image processing method and related device | |
Zhang et al. | IL-GAN: Illumination-invariant representation learning for single sample face recognition | |
CN113361549A (en) | Model updating method and related device | |
Barra et al. | Gait analysis for gender classification in forensics | |
CN116434347B (en) | Skeleton sequence identification method and system based on mask pattern self-encoder | |
Abdelrazik et al. | Efficient hybrid algorithm for human action recognition | |
CN112528788A (en) | Re-recognition method based on domain invariant features and space-time features | |
Shafiee et al. | Real-time embedded motion detection via neural response mixture modeling | |
Liu | Human face expression recognition based on deep learning-deep convolutional neural network | |
CN118135660A (en) | Cross-view gait recognition method for joint multi-view information bottleneck under view-angle deficiency condition | |
Nooruddin et al. | A multi-resolution fusion approach for human activity recognition from video data in tiny edge devices | |
Aufar et al. | Face recognition based on Siamese convolutional neural network using Kivy framework | |
CN113869193B (en) | Training method of pedestrian re-recognition model, pedestrian re-recognition method and system | |
CN113762331A (en) | Relational self-distillation method, apparatus and system, and storage medium | |
Zhao et al. | Research on human behavior recognition in video based on 3DCCA | |
Liang | Unrestricted Face Recognition Algorithm Based on Transfer Learning on Self‐Pickup Cabinet | |
Latha et al. | Human action recognition using deep learning methods (CNN-LSTM) without sensors | |
Ahsan et al. | Hrneto: Human action recognition using unified deep features optimization framework |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |