CN115205903A - Pedestrian re-identification method for generating confrontation network based on identity migration - Google Patents

Pedestrian re-identification method for generating confrontation network based on identity migration

Info

Publication number
CN115205903A
Authority
CN
China
Prior art keywords
pedestrian
image
identity
training
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210890765.1A
Other languages
Chinese (zh)
Other versions
CN115205903B (en)
Inventor
朱容波
吴天
张�浩
李松泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong Agricultural University
Original Assignee
Huazhong Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong Agricultural University
Priority to CN202210890765.1A
Publication of CN115205903A
Application granted
Publication of CN115205903B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a pedestrian re-identification method based on an identity-migration generative adversarial network, comprising the following steps: acquiring a pedestrian image data set and generating the semantic map corresponding to each pedestrian image with a human semantic parsing model; constructing an overall pedestrian re-identification model comprising a generator, a discriminator and a pedestrian re-identification network, where the generator and the discriminator form a generative adversarial network based on semantic-map identity migration and are trained adversarially; constructing a gradient enhancement method based on a local quality attention mechanism to improve the generative adversarial network; establishing a joint training scheme for the generative adversarial network and the pedestrian re-identification network; and inputting a pedestrian image to be recognized and outputting the re-identification result through the trained pedestrian re-identification network. The invention increases the diversity of the pedestrian re-identification data set, effectively improves the quality of the generated images, and improves the recognition accuracy of the pedestrian re-identification model.

Description

Pedestrian re-identification method based on an identity-migration generative adversarial network
Technical Field
The invention relates to the technical field of computer vision, and in particular to a pedestrian re-identification method based on an identity-migration generative adversarial network.
Background
Pedestrian re-identification is an important task in computer vision that aims to associate pedestrian identities across cameras. It is widely used in video surveillance and security: given a query image, it retrieves images of the person of interest from non-overlapping cameras. However, images captured by different cameras differ greatly in background, viewing angle and pose, which makes finding a target pedestrian across cameras challenging. To cope with these differences, a model must learn feature representations from the training data that are as discriminative as possible. With the development of deep learning, many works exploit the strong representational power of convolutional neural networks and train them with deep metric learning or classification learning, which has greatly improved recognition accuracy. To further learn local features, many works align pedestrian features using local information such as horizontal partitions or pose skeletons, strengthening the representational power of the model.
Improving the model structure is one way to raise re-identification accuracy. Another reason re-identification models struggle to learn representations robust to differences in background, viewing angle and pose is that data sets lack diversity and are small. Pedestrian poses vary and backgrounds are cluttered as pedestrians move, and collecting images of pedestrians under every condition in real scenes is impractical, so data sets rarely cover all these variations and the diversity of pedestrian image data is insufficient. In addition, more data means higher labeling cost, which makes large-scale data sets difficult to build. With the development of generative models, in particular generative adversarial networks, more and more studies augment training data sets with generative models. Some researchers expand pedestrian re-identification data sets by synthesizing new pedestrian images from random noise or pose key points, increasing the diversity of pedestrian poses in the data set. However, random noise and pose key points carry too little prior information to guide the generation of pedestrian features accurately, which leads to blur and artifacts in the generated images and to inaccurate identity features. Such poor-quality generated images mislead feature learning when the pedestrian re-identification network is trained, hindering improvement of recognition accuracy.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the defects in the prior art, a pedestrian re-identification method based on an identity-migration generative adversarial network.
The technical scheme adopted by the invention to solve this technical problem is as follows:
the invention provides a pedestrian re-identification method based on an identity-migration generative adversarial network, comprising the following steps:
step 1, acquiring a pedestrian image data set, generating the semantic map corresponding to each pedestrian image through a human semantic parsing model, which assigns a semantic category to each pixel in the pedestrian image, and dividing the pedestrian images together with their semantic maps into a training set and a test set;
step 2, constructing an overall model for pedestrian re-identification, comprising a generator G, a discriminator D and a pedestrian re-identification network R; the generator G comprises a structure encoder E_s, an identity information extractor E_id and a decoder G_dec; the generator G and the discriminator D form a generative adversarial network based on semantic-map identity migration and are trained adversarially;
step 3, constructing a gradient enhancement method based on a local quality attention mechanism to improve the generative adversarial network;
step 4, establishing a joint training scheme for the generative adversarial network and the pedestrian re-identification network: inputting the training set, outputting new generated images through the generative adversarial network, training the pedestrian re-identification network with the generated images and the pedestrian images in the training set to obtain a trained overall model, and testing with the test set;
step 5, inputting a pedestrian image to be recognized, and outputting the pedestrian re-identification result through the trained pedestrian re-identification network.
Further, the method in step 1 of the present invention comprises:
acquiring a pedestrian image data set in which each pedestrian has an identity label, and dividing it into a training set and a test set that share no pedestrian labels; generating the semantic map corresponding to each pedestrian image through a human semantic parsing model, which assigns a semantic category to each pixel in the image; the generated semantic map contains 20 semantic categories, namely background, hat, hair, gloves, sunglasses, jacket, one-piece dress, coat, socks, trousers, jumpsuit, scarf, skirt, face, left arm, right arm, left leg, right leg, left shoe and right shoe; dividing all semantic categories into 5 parts according to their spatial positions, namely head, upper body, lower body, shoes and background; using the semantic map to extract the features of each part separately, thereby achieving fine-grained feature extraction; and uniformly scaling all images to a fixed pixel size before training.
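The grouping of the 20 semantic categories into 5 parts can be sketched as follows. This is a minimal illustration, not the patented implementation: the text names the 5 parts but does not say which category belongs to which part, so the `PARTS` assignment and the class index order are assumptions made for this example.

```python
import numpy as np

# The 20 category names in the order listed in the text.
# Using this order as the class index is an assumption.
CLASSES = ["background", "hat", "hair", "gloves", "sunglasses", "jacket",
           "one-piece dress", "coat", "socks", "trousers", "jumpsuit", "scarf",
           "skirt", "face", "left arm", "right arm", "left leg", "right leg",
           "left shoe", "right shoe"]

# Hypothetical category-to-part assignment (not specified in the source).
PARTS = {
    "head": ["hat", "hair", "sunglasses", "face"],
    "upper body": ["gloves", "jacket", "one-piece dress", "coat", "jumpsuit",
                   "scarf", "left arm", "right arm"],
    "lower body": ["socks", "trousers", "skirt", "left leg", "right leg"],
    "shoes": ["left shoe", "right shoe"],
    "background": ["background"],
}

def part_masks(semantic_map):
    """Convert an (H, W) map of class indices into a (5, H, W) stack of binary part masks."""
    masks = []
    for names in PARTS.values():
        ids = [CLASSES.index(name) for name in names]
        masks.append(np.isin(semantic_map, ids))
    return np.stack(masks).astype(np.float32)
```

Each pixel falls into exactly one of the 5 masks, so the masks can be used to pool or extract features for each part separately.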
Further, the method in step 2 of the present invention comprises:
the generative adversarial network based on semantic-map identity migration consists of a structure encoder E_s, an identity information extractor E_id, a decoder G_dec and a discriminator D, where E_s, E_id and G_dec together form the generator G, which forms a generative adversarial network with the discriminator D and is trained with an adversarial loss;
the training set is defined as
Figure BDA0003767448010000031
where each training sample consists of a pedestrian image
Figure BDA0003767448010000041
the identity label y_n ∈ [1, K] of the image, and the semantic map of the pedestrian
Figure BDA0003767448010000042
where N denotes the number of images in the data set, K the number of identities in the data set, C the number of semantic categories, and H and W the height and width of the images, respectively;
when training the generative adversarial network, two real samples are randomly drawn from the training set,
Figure BDA0003767448010000043
and
Figure BDA0003767448010000044
where a ∈ [1, N] and b ∈ [1, N]; to migrate the identity features of image x_a to image x_b, the generator G first uses the identity information extractor E_id to extract the identity information I_a of image x_a, then uses the structure encoder E_s to encode image x_b and its corresponding semantic map s_b into structural features F_b, and finally uses the decoder G_dec to decode I_a and F_b into a new pedestrian image
Figure BDA0003767448010000045
that is, the generated image
Figure BDA0003767448010000046
has the structural features of pedestrian y_b and the identity features of pedestrian y_a.
Further, the method for performing identity feature migration in step 2 specifically includes:
in the process of migrating the identity features of image x_a to image x_b, the semantic map s_a corresponding to image x_a is first preprocessed; the semantic map s_a contains the semantic information of pedestrian y_a, and all of this semantic information is divided, according to spatial position, into 5 parts (head, upper body, lower body, shoes and background), represented by
Figure BDA0003767448010000047
then the identity feature extraction network E_id extracts the identity features of each part of the pedestrian, computed as follows:
Figure BDA0003767448010000048
Figure BDA0003767448010000049
during the computation,
Figure BDA00037674480100000410
is automatically expanded to 3 dimensions, and the multiplication is element-wise; here
Figure BDA00037674480100000411
and
Figure BDA00037674480100000412
are affine parameters containing the identity information of each semantic part; identity information is injected into the pedestrian image through the adaptive instance normalization operation, which is defined as follows:
Figure BDA00037674480100000413
where μ(·) is the mean operation and σ(·) is the standard deviation operation; adaptive instance normalization builds on instance normalization by replacing the affine parameters with conditional style information, thereby achieving style transformation;
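The normalization described above is commonly known as adaptive instance normalization (AdaIN). The sketch below shows its widely used general form: each feature channel is normalized by its own spatial mean and standard deviation, then rescaled and shifted by externally supplied affine parameters. It assumes one scalar `gamma` and `beta` per channel; the per-part, mask-expanded parameters of the method are not reproduced because the corresponding formula images are unavailable. The names `adain`, `gamma`, `beta` and `eps` are chosen for this example.

```python
import numpy as np

def adain(features, gamma, beta, eps=1e-5):
    """Adaptive instance normalization in its general per-channel form.

    features: (C, H, W) structural feature map.
    gamma, beta: (C,) affine parameters carrying the conditional (identity) information.
    """
    mu = features.mean(axis=(1, 2), keepdims=True)    # per-channel mean over H and W
    sigma = features.std(axis=(1, 2), keepdims=True)  # per-channel standard deviation
    normalized = (features - mu) / (sigma + eps)       # plain instance normalization
    # The affine parameters come from the condition rather than being learned constants.
    return gamma[:, None, None] * normalized + beta[:, None, None]
```

After the operation, each output channel has approximately mean `beta` and standard deviation `gamma`, which is how the conditional information overrides the statistics of the structural features.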
there are two cases of identity migration:
when the identity labels satisfy y_a ≠ y_b, the generation is cross-identity; otherwise it is same-identity; in the same-identity case, each generated image has a corresponding real image in the training set; so that the generated image
Figure BDA0003767448010000051
not only obtains the identity features of pedestrian y_a but also keeps clear structural features, the
Figure BDA0003767448010000056
loss is used to supervise the training of the generated images:
Figure BDA0003767448010000052
when the identity labels satisfy y_a = y_b, image x_a and image x_b share the same identity, so the generated image can be reconstructed under supervision, and the generator thereby learns complete structural information.
Further, the specific method of adversarial training in step 2 of the present invention includes:
the generator G and the discriminator D are trained adversarially so that the generated image
Figure BDA0003767448010000053
is more visually realistic; the adversarial losses of the generator G and the discriminator D are defined as follows:
Figure BDA0003767448010000054
Figure BDA0003767448010000055
WGAN-GP is used to optimize the adversarial loss during training, which makes the training process more stable.
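For reference, the standard WGAN-GP objective from the general literature is written out below. It is given only as background on what WGAN-GP optimizes; the exact losses of this method are in the formula images above and may differ in notation or conditioning. The symbols P_r (real distribution), P_g (generated distribution), the interpolate \bar{x} and the penalty weight \lambda_{gp} are notation assumed for this illustration.

```latex
\begin{aligned}
\mathcal{L}_D &= \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
               - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
               + \lambda_{gp}\, \mathbb{E}_{\bar{x} \sim P_{\bar{x}}}
                 \Big[\big(\lVert \nabla_{\bar{x}} D(\bar{x}) \rVert_2 - 1\big)^2\Big], \\
\mathcal{L}_G &= -\,\mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big], \\
\bar{x} &= \epsilon\, x + (1 - \epsilon)\, \tilde{x}, \qquad \epsilon \sim U[0, 1].
\end{aligned}
```

The gradient penalty term keeps the norm of the discriminator's gradient close to 1 on points interpolated between real and generated samples, which is the source of the improved training stability mentioned in the text.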
Further, the method for constructing the gradient enhancement based on a local quality attention mechanism in step 3 specifically includes:
in the local quality attention mechanism, the no-reference image quality assessment model BIECON scores non-overlapping patches of a generated image; after assessment, each non-overlapping patch region of the generated image has a score in [0, 1], where a score closer to 0 means worse quality and a score closer to 1 means better quality; the quality score of each patch is assigned to every pixel in the patch, giving a quality score matrix Q of the same size as the input; finally, the local quality attention mechanism is realized by:
M = 1 - Q
the larger a value in the attention matrix M, the worse the quality of the corresponding pixel, and the more the generator should focus on that region;
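The construction of the attention matrix M = 1 - Q from per-patch scores can be sketched as follows. The sketch assumes the patch scores have already been produced by a quality model such as BIECON and are supplied as a small 2-D array; the function name `attention_from_patch_scores` and the `patch_size` parameter are hypothetical.

```python
import numpy as np

def attention_from_patch_scores(patch_scores, patch_size):
    """Build the pixel-level attention matrix M = 1 - Q.

    patch_scores: (H // patch_size, W // patch_size) quality scores in [0, 1],
                  one per non-overlapping patch (assumed to be given).
    Returns M with shape (H, W); larger values mark lower-quality pixels.
    """
    # Every pixel inherits the score of the patch it belongs to, giving Q.
    q = np.repeat(np.repeat(patch_scores, patch_size, axis=0), patch_size, axis=1)
    return 1.0 - q
```

For example, a 2 x 2 grid of patch scores with `patch_size=2` yields a 4 x 4 matrix in which a patch scored 0.25 contributes attention 0.75 at each of its four pixels.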
in the gradient back-propagation stage, the loss
Figure BDA0003767448010000061
and the parameters of the discriminator are used to compute the discriminator gradient Δ_D; then, from Δ_D, for the generated sample
Figure BDA0003767448010000062
the gradient
Figure BDA0003767448010000063
is computed. In a standard generative adversarial network, the gradient of the generated sample is used directly to update the parameters of the generator, whereas the gradient enhancement method based on local quality attention uses the attention matrix M to modify the gradient of the generated sample
Figure BDA0003767448010000064
through an element-wise product:
Figure BDA0003767448010000065
where α is a hyperparameter that adjusts the weight; the generator updates the model parameters using the modified gradient.
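The sketch below illustrates one plausible form of this modification. The exact formula is contained in an image that is not reproduced here, so the form used, g' = g + α · (M ⊙ g), is an assumption chosen to match the description (an element-wise product with M, weighted by α); the function name `enhance_gradient` is likewise hypothetical.

```python
import numpy as np

def enhance_gradient(grad, attention, alpha=1.0):
    """Amplify the generated-sample gradient where local quality is poor.

    grad: (C, H, W) gradient of the loss with respect to the generated image.
    attention: (H, W) attention matrix M = 1 - Q.
    ASSUMED form: grad + alpha * (M * grad); the source formula is unavailable.
    """
    # Broadcast M over the channel axis and apply it element-wise.
    return grad + alpha * attention[None, :, :] * grad
```

Under this assumed form, pixels with M = 0 (best quality) keep their original gradient, while pixels with larger M receive a proportionally stronger update signal.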
Further, the method for performing joint training in step 4 of the present invention includes:
different loss functions are used for generated images and real images; the triplet loss function is applied to training on generated images and is defined as follows:
Figure BDA0003767448010000066
where B and E denote the number of identities and the number of instances in the mini-batch, respectively; f_a, f_p and f_n denote the feature vectors of the anchor, positive and negative samples extracted by the pedestrian re-identification network, and γ is the margin hyperparameter between the intra-class and inter-class distances; the triplet loss pulls the anchor sample and the positive sample closer together and pushes the negative sample away from the anchor sample, so that discriminative feature representations are learned; for real images, the ID loss is used for learning:
Figure BDA0003767448010000071
where x denotes a real image in the training data set and p(y | x) denotes the probability that x is predicted as its true identity label y;
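The two losses can be sketched in their generic textbook forms as follows. This is an illustration under assumptions: the triplet loss is written as a plain margin loss over given (anchor, positive, negative) triplets with Euclidean distance, and any sample-selection strategy over the B identities and E instances of the mini-batch is not reproduced. The function names and argument layouts are chosen for this example.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, margin):
    """Generic margin-based triplet loss averaged over given triplets.

    f_a, f_p, f_n: (T, D) feature vectors of anchor, positive and negative samples.
    margin: the margin hyperparameter (gamma in the text).
    """
    d_ap = np.linalg.norm(f_a - f_p, axis=1)  # anchor-to-positive distance
    d_an = np.linalg.norm(f_a - f_n, axis=1)  # anchor-to-negative distance
    return np.maximum(d_ap - d_an + margin, 0.0).mean()

def id_loss(probs, labels):
    """Cross-entropy identity loss: mean of -log p(y | x).

    probs: (N, K) predicted identity probabilities (rows sum to 1).
    labels: (N,) integer identity labels in [0, K).
    """
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked).mean()
```

A triplet contributes zero loss once the negative is farther from the anchor than the positive by at least the margin, which is what drives the intra-class distance below the inter-class distance.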
the generative adversarial network and the pedestrian re-identification network are trained jointly by optimizing an overall objective that is a weighted sum of the losses:
Figure BDA0003767448010000072
where
Figure BDA0003767448010000073
is the adversarial loss, which ensures that the generator produces visually realistic images, and λ_id, λ_rec and λ_tri are hyperparameters that balance the corresponding loss terms.
Further, the method in step 4 of the present invention further includes:
because the generative adversarial network cannot produce new identities when generating images, a two-stage training scheme is adopted for the pedestrian re-identification model to prevent it from overfitting; in the first stage, joint training is performed with the overall objective, and in the second stage the LSRO method is introduced to fine-tune the model further; the LSRO method reduces the likelihood of overfitting by assigning a uniformly distributed label to each generated image, defined as follows:
Figure BDA0003767448010000074
where
Figure BDA0003767448010000075
denotes a generated image and k ∈ [1, K]; therefore
Figure BDA0003767448010000076
means that the generated image
Figure BDA0003767448010000077
belongs to each identity class with probability 1/K; both real images and generated images are trained with the ID loss, and the losses for real and generated images are unified as follows:
Figure BDA0003767448010000078
for real images, Z = 0; for generated images, Z = 1.
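A minimal sketch of the unified loss follows. It uses the commonly cited LSRO formulation, in which a real image contributes the usual cross-entropy term and a generated image contributes the cross-entropy against a uniform 1/K label. The unified formula in the source is an image that is not reproduced here, so treating it as this standard form is an assumption; the function name `lsro_loss` is hypothetical.

```python
import numpy as np

def lsro_loss(probs, label, z):
    """Unified ID loss for one image under the (assumed) standard LSRO form.

    probs: (K,) predicted identity probabilities.
    label: true identity index; only used for real images (z = 0).
    z: 0 for a real image, 1 for a generated image.
    """
    log_p = np.log(probs)
    real_term = -(1 - z) * log_p[label]               # ordinary cross-entropy
    generated_term = -(z / len(probs)) * log_p.sum()  # uniform 1/K target over all K classes
    return real_term + generated_term
```

With z = 1 the loss is minimized when the network predicts a uniform distribution for the generated image, which discourages it from fitting generated images to any single identity too confidently.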
The invention has the following beneficial effects:
(1) To address the problem that random noise and pose key points cannot accurately guide the generation of pedestrian features, a semantic map is introduced into the pedestrian image generation process, and a semantic-map-guided identity-migration generative adversarial network is proposed. Because the semantic map accurately delineates the different regions of a pedestrian, the pedestrian image can be edited precisely, which improves the quality of the generated pedestrian images. The identity-migration generative adversarial network transfers the pedestrian identity in one pedestrian image to other pedestrian images, increasing the diversity of the pedestrian re-identification data set and thereby improving the robustness of the model to differences in background, viewing angle and pose.
(2) To address the unbalanced generation quality across local regions in generative adversarial networks, a gradient enhancement method based on a local quality attention mechanism is proposed, so that the generative adversarial network can adjust image quality not only globally but also locally.
(3) To let the pedestrian re-identification network make better use of the generated images, a joint training scheme for the generative adversarial network and the pedestrian re-identification network is proposed. On the one hand, the pedestrian re-identification network classifies the images produced by the generative adversarial network, which strengthens the network's identity migration capability; on the other hand, the pedestrian re-identification network learns more discriminative feature representations with the help of the generated images.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is an overall structure of a model of an embodiment of the invention;
FIG. 2 is the same-identity migration of an embodiment of the present invention;
FIG. 3 is a two-stage pedestrian re-identification network training of an embodiment of the present invention;
FIG. 4 is a gradient enhancement method based on a local quality attention mechanism according to an embodiment of the present invention;
FIG. 5 shows the identity migration results of the model on the Market-1501 data set in accordance with an embodiment of the present invention;
FIG. 6 is a flowchart of the overall training of the model according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
The first embodiment is as follows:
the pedestrian re-identification method based on an identity-migration generative adversarial network comprises the following steps:
(1) Constructing the semantic-map-based identity-migration generative adversarial network model.
The generative adversarial network based on semantic-map identity migration consists of a structure encoder E_s, an identity information extractor E_id, a decoder G_dec and a discriminator D, where E_s, E_id and G_dec together form the generator G, which forms a generative adversarial network with the discriminator D and is trained with an adversarial loss. The training data set is defined as
Figure BDA0003767448010000091
where each training sample consists of a pedestrian image
Figure BDA0003767448010000092
the identity label y_n ∈ [1, K] of the image, and the semantic map of the pedestrian
Figure BDA0003767448010000093
where N denotes the number of images in the data set, K the number of identities in the data set, C the number of semantic categories, and H and W the height and width of the images, respectively. When training the generative adversarial network, two real samples are randomly drawn from the training data set,
Figure BDA0003767448010000094
and
Figure BDA0003767448010000095
where a ∈ [1, N] and b ∈ [1, N]. To migrate the identity features of image x_a to image x_b, the generator G first uses the identity information extractor E_id to extract the identity information I_a of image x_a, then uses the structure encoder E_s to encode image x_b and its corresponding semantic map s_b into structural features F_b. Finally, the decoder G_dec decodes I_a and F_b into a new pedestrian image
Figure BDA0003767448010000096
which should have the structural features of pedestrian y_b and the identity features of pedestrian y_a.
Specifically, to migrate the identity features of image x_a to image x_b, the semantic map s_a corresponding to image x_a is first preprocessed. The semantic map s_a contains the semantic information of pedestrian y_a; all of this semantic information is roughly divided, according to spatial position, into 5 parts (head, upper body, lower body, shoes and background), denoted
Figure BDA0003767448010000101
Then the identity feature extraction network E_id extracts the identity features of each part of the pedestrian, computed as follows:
Figure BDA0003767448010000102
Figure BDA0003767448010000103
During the computation,
Figure BDA0003767448010000104
is automatically expanded to 3 dimensions, and the multiplication is element-wise. Here
Figure BDA0003767448010000105
and
Figure BDA0003767448010000106
are affine parameters containing the identity information of each semantic part. Identity information is injected into the pedestrian image through the adaptive instance normalization operation, which is defined as follows:
Figure BDA0003767448010000107
where μ(·) is the mean operation and σ(·) is the standard deviation operation. Adaptive instance normalization builds on instance normalization by replacing the affine parameters with conditional style information, thereby achieving style transformation.
By using the semantic labels, the identity features capture accurate feature information for each semantic part of the pedestrian image, and the style transfer capability of adaptive instance normalization migrates this identity information accurately to the target image, giving the generator G a more accurate identity feature migration capability.
There are two cases of identity migration: when the identity labels satisfy y_a ≠ y_b, the generation is cross-identity; otherwise it is same-identity. In the same-identity case, each generated image has a corresponding real image in the training data set. So that the generated image
Figure BDA0003767448010000108
not only obtains the identity features of pedestrian y_a but also keeps clear structural features, the
Figure BDA00037674480100001010
loss is used to supervise the training of the generated images:
Figure BDA0003767448010000109
When the identity labels satisfy y_a = y_b, image x_a and image x_b share the same identity, so the generated image can be reconstructed under supervision, and the generator thereby learns complete structural information.
The generated image
Figure BDA0003767448010000111
should correctly obtain the identity features of pedestrian y_a; to this end, the pedestrian re-identification network is used to constrain the identity of the generated image
Figure BDA0003767448010000112
Specifically, the pedestrian re-identification network discriminates the generated image
Figure BDA0003767448010000113
and an identity loss function constrains the generated image
Figure BDA0003767448010000114
as follows:
Figure BDA0003767448010000115
where
Figure BDA0003767448010000116
denotes the probability that
Figure BDA0003767448010000117
is predicted as the class label y_a of image x_a. By minimizing the identity loss of the generator,
Figure BDA0003767448010000118
the generator learns the identity feature knowledge of the pedestrian re-identification network.
The generator and the discriminator are trained adversarially so that the generated image
Figure BDA0003767448010000119
is more visually realistic. The adversarial losses of the generator and the discriminator are defined as follows:
Figure BDA00037674480100001110
Figure BDA00037674480100001111
WGAN-GP is used to optimize the adversarial loss during training, which makes the training process more stable.
(2) A gradient enhancement method based on a local quality attention mechanism is constructed.
The generator and the discriminator are trained adversarially: the generator should produce images realistic enough to confuse the discriminator, while the discriminator must distinguish generated images from real ones. In the generator's training phase, the discriminator takes the generated image as input and predicts its authenticity. A loss value is computed from this prediction, and the discriminator uses it to provide feedback to the generator. The generator updates its parameters with this feedback, which improves its generation capability and makes the generated images more visually realistic. This analysis shows that the discriminator's feedback is computed from a single value representing the authenticity of the whole image, and so ignores the fact that generation quality is unbalanced across local regions. The imbalance appears as artifacts, blurring and similar defects in local regions of the generated image, and these defects in turn impair the re-identification network's identity discrimination on generated images.
The proposed method consists of two parts: a local quality attention mechanism and gradient enhancement. The local quality attention mechanism finds the poorly generated local regions of the generated image so that the generator pays more attention to them. The no-reference image quality assessment model BIECON scores the non-overlapping patches of the generated image; after evaluation, each patch has a score in [0, 1], where a score closer to 0 means worse quality and a score closer to 1 means better quality. The quality score of each patch is assigned to every pixel in that patch, which yields a quality score matrix Q of the same size as the input. Finally, the local quality attention mechanism is realized by:
$$M = 1 - Q \tag{8}$$
The larger a value in the attention matrix $M$, the worse the quality of the corresponding pixels, and the more the generator should focus on that region. In the back-propagation stage, the gradient $\Delta_D$ of the discriminator is computed from the adversarial loss with respect to the discriminator's parameters, and the gradient $\nabla \hat{x}$ of the generated sample $\hat{x}$ is then computed from $\Delta_D$. In a standard generative adversarial network, the gradient of the generated sample is used directly to update the generator's parameters, whereas the gradient enhancement method based on local quality attention modifies it with the attention matrix $M$ through an element-wise product:
$$\nabla \hat{x}' = \nabla \hat{x} + \alpha \, M \odot \nabla \hat{x} \tag{9}$$
where $\alpha$ is a hyperparameter that adjusts the weight; following XAI-GAN, $\alpha = 0.2$. The generator updates its parameters with the modified gradient. Intuitively, by enlarging the gradient in poor-quality regions, the attention matrix guides the generator to pay more attention to local generation quality, so the model improves the overall image quality and also further optimizes image quality locally.
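As a rough illustration of the local quality attention and gradient enhancement steps, the sketch below expands per-patch quality scores into a per-pixel matrix Q, forms M = 1 - Q, and rescales a gradient. The patch scores are hypothetical stand-ins for BIECON output, and the additive update `grad + alpha * M * grad` is an assumption modelled on XAI-GAN, since the patent's own formula is only available as an image.

```python
def quality_matrix(patch_scores, patch_h, patch_w):
    """Expand a grid of per-patch scores in [0, 1] to a per-pixel matrix Q."""
    q = []
    for row in patch_scores:
        pixel_row = [s for s in row for _ in range(patch_w)]
        q.extend([list(pixel_row) for _ in range(patch_h)])
    return q

def attention_matrix(q):
    """M = 1 - Q: low-quality pixels receive high attention."""
    return [[1.0 - v for v in row] for row in q]

def enhance_gradient(grad, m, alpha=0.2):
    """Assumed XAI-GAN-style update: grad + alpha * (M * grad), element-wise."""
    return [[g + alpha * a * g for g, a in zip(g_row, m_row)]
            for g_row, m_row in zip(grad, m)]

# Hypothetical scores for a 2x2 grid of patches, each patch 2x2 pixels.
scores = [[1.0, 0.5], [0.0, 1.0]]
Q = quality_matrix(scores, patch_h=2, patch_w=2)
M = attention_matrix(Q)
grad = [[1.0] * 4 for _ in range(4)]
enhanced = enhance_gradient(grad, M)
```

Pixels in the worst patch (score 0) have their gradient scaled by 1 + α, while pixels in a perfect patch (score 1) keep their gradient unchanged.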
(3) A joint training scheme for the generative adversarial network and the pedestrian re-identification network is established.
Training of the pedestrian re-identification network is combined with training of the generative adversarial network: the new pedestrian images produced by the generative adversarial network are used, together with the real images in the training data set, to train the re-identification network. The identity information of a generated image comes from the image that provides the identity features, so ideally the identity label of the generated image should be the same as that of the identity-providing image. However, training the generative adversarial network is a gradual process; early in training, the quality of generated images is poor and accurate identity migration cannot be achieved. Directly assigning identity labels to generated images would therefore mislead the re-identification network's learning of identity features, further harm the accuracy of identity migration, and make training unstable or even collapse. To avoid this, different loss functions are used for generated images and real images. The triplet loss with hard sample mining is applied to training on generated images and is defined as follows:
$$\mathcal{L}_{tri} = \sum_{i=1}^{B} \sum_{a=1}^{E} \left[\gamma + \max_{p} \left\lVert f_a^{i} - f_p^{i} \right\rVert_2 - \min_{n,\, j \neq i} \left\lVert f_a^{i} - f_n^{j} \right\rVert_2\right]_+ \tag{10}$$
where $B$ and $E$ are the number of identities and the number of instances per identity in the mini-batch, respectively. $f_a$, $f_p$ and $f_n$ are the feature vectors of the anchor, positive and negative samples extracted by the pedestrian re-identification network, and $\gamma$ is the margin hyperparameter between the intra-class and inter-class distances, set to 0.3 in the experiments. The triplet loss learns a discriminative feature representation by shrinking the distance between the anchor and the positive sample and enlarging the distance between the anchor and the negative sample. For real images, learning uses the ID loss:
$$\mathcal{L}_{id} = \mathbb{E}\left[-\log p(y \mid x)\right] \tag{11}$$
where $x$ is a real image in the training data set and $p(y \mid x)$ is the probability that $x$ is predicted as its true identity label $y$.
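The hard-sample-mining triplet loss can be sketched in plain Python as follows. For each anchor, the farthest same-identity sample and the nearest different-identity sample in the batch are selected. The function names are hypothetical, and averaging over anchors (rather than summing) is an assumption made here for readability.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def batch_hard_triplet_loss(features, labels, margin=0.3):
    """Average hinge loss using the hardest positive and negative per anchor."""
    losses = []
    for i, (f_a, y_a) in enumerate(zip(features, labels)):
        pos = [euclidean(f_a, f)
               for j, (f, y) in enumerate(zip(features, labels))
               if y == y_a and j != i]
        neg = [euclidean(f_a, f) for f, y in zip(features, labels) if y != y_a]
        if not pos or not neg:
            continue  # this anchor has no valid triplet in the batch
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0
```

The loss is zero once every anchor is closer to all of its positives than to any negative by at least the margin of 0.3, which is the separation the triplet term encourages.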
The generative adversarial network and the pedestrian re-identification network are trained jointly by optimizing an overall objective that is the weighted sum of losses (4), (5), (6), (7), (10) and (11):
[Overall objective: equation image BDA0003767448010000133 in the original publication]
where $\mathcal{L}_{adv}$ is the adversarial loss, which ensures that the generator produces visually realistic images, and $\lambda_{id}$, $\lambda_{rec}$ and $\lambda_{tri}$ are hyperparameters that balance the corresponding loss terms.
Because the generative adversarial network does not create new identities when it generates images, a two-stage training scheme, shown in fig. 3, is adopted for the pedestrian re-identification model to prevent over-fitting. The first stage performs joint training with the overall objective above, and the second stage introduces the LSRO method to fine-tune the model further. LSRO reduces the likelihood of over-fitting by assigning a uniformly distributed label to each generated image, defined as follows:
$$q_{LSRO}(k \mid \hat{x}) = \frac{1}{K}$$
where $\hat{x}$ denotes a generated image and $k \in [1, K]$, so $q_{LSRO}(k \mid \hat{x})$ states that the probability of the generated image $\hat{x}$ belonging to each identity class is $1/K$. Both real and generated images are trained with the ID loss; combining formula (5), the losses of real and generated images are unified as:
$$\mathcal{L}_{LSRO} = -(1 - Z) \log p(y) - \frac{Z}{K} \sum_{k=1}^{K} \log p(k)$$
For real images, $Z = 0$; for generated images, $Z = 1$.
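The unified loss with the flag Z can be sketched as below. It assumes `probs` is the re-identification network's softmax output over the K identities; the function name and argument order are illustrative. For Z = 0 it reduces to the ordinary ID loss on the true label, and for Z = 1 it is the cross-entropy against the uniform LSRO label.

```python
import math

def lsro_loss(probs, z, label=None):
    """ID loss for a real image (z = 0) or a generated image (z = 1)."""
    k = len(probs)
    if z == 0:
        # Real image: negative log-probability of the true identity label.
        return -math.log(probs[label])
    # Generated image: uniform target 1/K over all K identities.
    return -sum(math.log(p) for p in probs) / k
```

For a generated image the loss is smallest when the predicted distribution is itself uniform, so generated images regularize the classifier instead of pushing it toward any single identity.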
Example two:
The pedestrian re-identification method based on an identity-migration generative adversarial network comprises the following steps:
(1) Training data set preparation
The Market-1501 data set is obtained. It was collected by 6 cameras on the Tsinghua University campus and contains 1501 labelled pedestrians in total. 751 pedestrians are used for the training set and 750 for the test set, and no pedestrian label appears in both sets. The semantic map corresponding to each pedestrian image is generated by a human parsing model (Self-Correction for Human Parsing), which assigns a semantic category to every pixel in the image. The generated semantic map contains 20 semantic categories: background, hat, hair, gloves, sunglasses, upper clothes, dress, coat, socks, trousers, jumpsuit, scarf, skirt, face, left arm, right arm, left leg, right leg, left shoe and right shoe. According to their spatial positions, all semantic categories are roughly grouped into 5 parts: head, upper body, lower body, shoes and background. During identity migration, the semantic map is used to extract the features of each part separately, which gives fine-grained feature extraction; the features are then injected separately into the generative adversarial network to generate pedestrian images with more accurate features. All input images are uniformly scaled to 256 × 128 pixels before training.
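The grouping of 20 semantic categories into 5 parts can be sketched as a lookup that turns a semantic map into one binary mask per part. The text names only the five parts, so the assignment of individual categories below (for example, placing gloves and scarf in the upper body) and the class index order are assumptions made for illustration.

```python
CLASSES = ["background", "hat", "hair", "gloves", "sunglasses", "upper clothes",
           "dress", "coat", "socks", "trousers", "jumpsuit", "scarf", "skirt",
           "face", "left arm", "right arm", "left leg", "right leg",
           "left shoe", "right shoe"]

# Assumed grouping; the patent only states that there are five parts.
PARTS = {
    "head": {"hat", "hair", "sunglasses", "face"},
    "upper body": {"gloves", "upper clothes", "dress", "coat", "jumpsuit",
                   "scarf", "left arm", "right arm"},
    "lower body": {"socks", "trousers", "skirt", "left leg", "right leg"},
    "shoes": {"left shoe", "right shoe"},
    "background": {"background"},
}

def part_masks(semantic_map):
    """Turn an H x W map of class indices into five binary H x W masks."""
    masks = {}
    for part, names in PARTS.items():
        ids = {CLASSES.index(n) for n in names}
        masks[part] = [[1 if c in ids else 0 for c in row] for row in semantic_map]
    return masks
```

Multiplying an image element-wise by one of these masks isolates the pixels of that part, which is how the per-part identity features can be extracted separately.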
(2) Model construction
All models are implemented in the deep learning framework PyTorch. The overall structure of the model is shown in FIG. 1 and comprises a generator G, a discriminator D and a pedestrian re-identification network R. The generator G adopts an encoder-decoder architecture: the structure encoder $E_s$ is a shallow network of three convolutional layers and, correspondingly, the decoder $G_{dec}$ is a network of three transposed convolutional layers. The identity information extractor $E_{id}$ uses five convolutional layers, with global average pooling at the last layer to obtain the adaptive instance normalization parameters $I$; the network parameters of all $E_{id}$ are shared. The generator G uses five residual blocks to inject the identity information of the different semantic regions into the structural feature $F$; following MUNIT, each residual block contains two adaptive instance normalization layers. The discriminator D follows the widely used PatchGAN structure. The pedestrian re-identification network R is based on ResNet50 and initialized with parameters pre-trained on ImageNet, and the dimension of its fully connected layer is changed to $K$, the number of identities in the training data set.
(3) Joint training of the generative adversarial network and the pedestrian re-identification network
During training, both the generative adversarial network and the pedestrian re-identification network are trained with the Adam optimizer, with parameters $\beta_1 = 0.5$ and $\beta_2 = 0.999$. The weights in the total loss are set to $\lambda_{id} = 1$, $\lambda_{rec} = 10$ and $\lambda_{tri} = 1$. In the first stage, the generative adversarial network and the re-identification network are trained jointly; the learning rate of the generator and the discriminator is 0.0001 and that of the re-identification network is 0.00035. The batch size is 32, with the number of identities $B = 8$ and the number of instances $E = 4$ per batch. In the second stage, training of the generative adversarial network stops and the re-identification network is fine-tuned with the LSRO loss. Throughout the experiments, all input images are resized to 256 × 128, and the input image of the structure encoder $E_s$ is converted to grayscale to remove the influence of the original identity information.
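A batch of 32 images built from B = 8 identities with E = 4 instances each can be drawn with an identity-balanced sampler. The sketch below is one possible way to do this; the function name, the dictionary layout and the fallback of sampling with replacement when an identity has fewer than four images are assumptions, not details given in the text.

```python
import random

def sample_pk_batch(ids_to_images, num_ids=8, num_instances=4, rng=random):
    """Draw num_ids identities and num_instances images for each of them."""
    eligible = [pid for pid, imgs in ids_to_images.items() if imgs]
    chosen = rng.sample(eligible, num_ids)
    batch = []
    for pid in chosen:
        imgs = ids_to_images[pid]
        if len(imgs) >= num_instances:
            picks = rng.sample(imgs, num_instances)
        else:
            picks = rng.choices(imgs, k=num_instances)  # sample with replacement
        batch.extend((img, pid) for img in picks)
    return batch
```

Because every identity in the batch contributes several images, each anchor is guaranteed to have positives and negatives available for the triplet loss.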
(4) Analysis of experiments
Evaluation of the model is divided into image generation evaluation and pedestrian re-identification evaluation. Image generation is evaluated by using the generative adversarial network to migrate the identity of a pedestrian image onto different images; the results are shown in fig. 5. In fig. 5, the first column contains the identity source images and the first row contains the target images of identity migration, which provide the structural information. The remaining images in fig. 5 are the images after identity migration. They retain the structural information of the target image well and complete the migration of identity information accurately, which shows that the identity-migration generative adversarial network of the present invention has good image generation and identity migration capability. The pedestrian re-identification evaluation criteria are: (1) Rank-n, the probability that at least one of the first n images in the query result matches the query identity; (2) mAP (mean average precision), which reflects how far toward the front of the query result all correct images of the queried person are ranked. On the Market-1501 test set, the pedestrian re-identification network reaches a Rank-1 accuracy of 93.9% and an mAP of 83.5%. Migrating the identities of pedestrian images onto different images with the generative adversarial network effectively expands the diversity of the training data set and improves the robustness of the re-identification network to differences in background, viewing angle, pose and the like.
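The two re-identification metrics can be made concrete with a small sketch. It assumes each query comes with a gallery list already sorted by descending similarity; the function names are hypothetical, and the camera-based filtering of gallery images used in the standard Market-1501 protocol is omitted for brevity.

```python
def rank_n_hit(ranked_ids, query_id, n):
    """True if any of the first n gallery results has the query identity."""
    return query_id in ranked_ids[:n]

def average_precision(ranked_ids, query_id):
    """Average of the precision values taken at each correct gallery result."""
    hits, precisions = 0, []
    for rank, gid in enumerate(ranked_ids, start=1):
        if gid == query_id:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def evaluate(results, n=1):
    """results: list of (query_id, ranked gallery ids). Returns (Rank-n, mAP)."""
    rank = sum(rank_n_hit(r, q, n) for q, r in results) / len(results)
    m_ap = sum(average_precision(r, q) for q, r in results) / len(results)
    return rank, m_ap
```

Rank-n only checks whether a correct match appears near the top, while mAP also rewards placing all correct matches early, so the two numbers can differ noticeably for the same ranking.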
It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims (8)

1. A pedestrian re-identification method based on an identity-migration generative adversarial network, characterized by comprising the following steps:
step 1, acquiring a pedestrian image data set, generating the semantic map corresponding to each pedestrian image through a human parsing model, which assigns a semantic category to each pixel in the pedestrian image, and dividing the pedestrian images with their semantic maps into a training set and a test set;
step 2, constructing an overall pedestrian re-identification model comprising a generator G, a discriminator D and a pedestrian re-identification network R; the generator G comprises a structure encoder $E_s$, an identity information extractor $E_{id}$ and a decoder $G_{dec}$; the generator G and the discriminator D form a generative adversarial network based on semantic-map identity migration and are trained by adversarial learning;
step 3, constructing a gradient enhancement method based on a local quality attention mechanism to improve the generative adversarial network;
step 4, establishing a joint training scheme for the generative adversarial network and the pedestrian re-identification network: inputting the training set, outputting new generated images through the generative adversarial network, training the pedestrian re-identification network with the generated images together with the pedestrian images in the training set to obtain a trained overall model, and testing with the test set;
step 5, inputting a pedestrian image to be recognized, and outputting the pedestrian re-identification result through the trained pedestrian re-identification network.
2. The pedestrian re-identification method based on an identity-migration generative adversarial network according to claim 1, wherein step 1 comprises:
acquiring a pedestrian image data set in which every pedestrian in the pedestrian images has a pedestrian label, and dividing the pedestrian labels into a training set and a test set that share no pedestrian label; generating the semantic map corresponding to each pedestrian image through a human parsing model, which assigns a semantic category to each pixel in the image, the generated semantic map containing 20 semantic categories: background, hat, hair, gloves, sunglasses, upper clothes, dress, coat, socks, trousers, jumpsuit, scarf, skirt, face, left arm, right arm, left leg, right leg, left shoe and right shoe; dividing all semantic categories into 5 parts (head, upper body, lower body, shoes and background) according to their spatial positions; using the semantic map to extract the features of each part separately, which gives fine-grained feature extraction; and uniformly scaling all images to a fixed pixel size before training.
3. The pedestrian re-identification method based on an identity-migration generative adversarial network according to claim 1, wherein step 2 comprises:
the generative adversarial network based on semantic-map identity migration consists of a structure encoder $E_s$, an identity information extractor $E_{id}$, a decoder $G_{dec}$ and a discriminator D, where $E_s$, $E_{id}$ and $G_{dec}$ together form the generator G, which forms a generative adversarial network with the discriminator D and is trained with an adversarial loss;
the training set is defined as $\{(x_n, y_n, s_n)\}_{n=1}^{N}$; each training sample consists of a pedestrian image $x_n$, the identity label $y_n \in [1, K]$ of the image, and the semantic map $s_n$ of the pedestrian, where $N$ is the number of images in the data set, $K$ is the number of identities in the data set, $C$ is the number of semantic label categories, and $H$ and $W$ are the height and width of the images, respectively;
during training of the generative adversarial network, two real samples $(x_a, y_a, s_a)$ and $(x_b, y_b, s_b)$ are drawn at random from the training set, where $a \in [1, N]$ and $b \in [1, N]$; to migrate the identity features of image $x_a$ to image $x_b$, the generator G first uses the identity information extractor $E_{id}$ to extract the identity information $I_a$ of image $x_a$, then uses the structure encoder $E_s$ to encode image $x_b$ and its corresponding semantic map $s_b$ into the structural feature $F_b$, and finally uses the decoder $G_{dec}$ to decode $I_a$ and $F_b$ into a new pedestrian image $\hat{x}_b^a$; that is, the generated image $\hat{x}_b^a$ has the structural features of pedestrian $y_b$ and the identity features of pedestrian $y_a$.
4. The pedestrian re-identification method based on an identity-migration generative adversarial network according to claim 3, wherein the identity feature migration in step 2 specifically comprises:
in migrating the identity features of image $x_a$ to image $x_b$, the semantic map $s_a$ corresponding to image $x_a$ is first pre-processed; the semantic map $s_a$ contains the semantic information of pedestrian $y_a$, and all of this semantic information is divided, according to spatial position, into 5 parts (head, upper body, lower body, shoes and background), each represented by its own part-level semantic map; the identity feature extraction network $E_{id}$ then extracts the identity features of each part of the pedestrian, calculated as follows:
[Equation images FDA0003767448000000032 and FDA0003767448000000033 in the original publication]
in the calculation, the part-level semantic map is automatically expanded to 3 dimensions, and $\odot$ denotes element-wise multiplication; the two outputs are the affine parameters that carry the identity information of each semantic part; the identity information of the pedestrian image is injected through the adaptive instance normalization operation, defined as follows:
[Equation image FDA0003767448000000037 in the original publication]
where $\mu(\cdot)$ takes the mean and $\sigma(\cdot)$ takes the standard deviation; the adaptive instance normalization operation builds on instance normalization and replaces the affine parameters with conditional style information to achieve style transfer;
there are two cases of identity migration:
when the identity labels satisfy $y_a \neq y_b$, the generation is cross-identity; otherwise it is same-identity; in same-identity generation, the generated image has a corresponding real image in the training set; so that the generated image $\hat{x}_b^a$ not only acquires the identity features of pedestrian $y_a$ but also keeps clear structural features, an $\ell_1$ loss supervises training on the generated image:
$$\mathcal{L}_{rec} = \mathbb{E}\left[\left\lVert x_b - \hat{x}_b^a \right\rVert_1\right]$$
when the identity labels satisfy $y_a = y_b$, images $x_a$ and $x_b$ show the same pedestrian, so the generated image can be trained by supervised reconstruction of $x_b$, which lets the generator learn complete structural information.
5. The pedestrian re-identification method based on an identity-migration generative adversarial network according to claim 4, wherein the training by adversarial learning in step 2 specifically comprises:
the generator G and the discriminator D are trained by adversarial learning so that the generated image is more visually realistic, and the adversarial losses of the generator G and the discriminator D are defined as follows:
[Equation images FDA0003767448000000041 and FDA0003767448000000042 in the original publication]
WGAN-GP is used to optimize the adversarial loss during training, which makes the training process more stable.
6. The pedestrian re-identification method based on an identity-migration generative adversarial network according to claim 1, wherein constructing the gradient enhancement method based on the local quality attention mechanism in step 3 specifically comprises:
in the local quality attention mechanism, the no-reference image quality assessment model BIECON scores the non-overlapping patches of the generated image; after evaluation, each non-overlapping patch of the generated image has a score in [0, 1], where a score closer to 0 means worse quality and a score closer to 1 means better quality; the quality score of each patch is assigned to every pixel in that patch, which yields a quality score matrix Q of the same size as the input; finally, the local quality attention mechanism is realized by:
$$M = 1 - Q$$
the larger a value in the attention matrix $M$, the worse the quality of the corresponding pixels, and the more the generator focuses on that region;
in the back-propagation stage, the gradient $\Delta_D$ of the discriminator is computed from the adversarial loss with respect to the discriminator's parameters, and the gradient $\nabla \hat{x}$ of the generated sample $\hat{x}$ is then computed from $\Delta_D$; in a standard generative adversarial network, the gradient of the generated sample is used directly to update the generator's parameters, whereas the gradient enhancement method based on local quality attention modifies it with the attention matrix $M$ through an element-wise product:
$$\nabla \hat{x}' = \nabla \hat{x} + \alpha \, M \odot \nabla \hat{x}$$
where $\alpha$ is a hyperparameter that adjusts the weight, and the generator updates the parameters of the model with the modified gradient.
7. The pedestrian re-identification method based on an identity-migration generative adversarial network according to claim 1, wherein the joint training in step 4 comprises:
different loss functions are used for generated images and real images; the triplet loss function is applied to training on generated images and is defined as follows:
$$\mathcal{L}_{tri} = \sum_{i=1}^{B} \sum_{a=1}^{E} \left[\gamma + \max_{p} \left\lVert f_a^{i} - f_p^{i} \right\rVert_2 - \min_{n,\, j \neq i} \left\lVert f_a^{i} - f_n^{j} \right\rVert_2\right]_+$$
where $B$ and $E$ are the number of identities and the number of instances per identity in the mini-batch, respectively; $f_a$, $f_p$ and $f_n$ are the feature vectors of the anchor, positive and negative samples extracted by the pedestrian re-identification network, and $\gamma$ is the margin hyperparameter between the intra-class and inter-class distances; the triplet loss learns a discriminative feature representation by shrinking the distance between the anchor and the positive sample and enlarging the distance between the anchor and the negative sample; for real images, learning uses the ID loss:
$$\mathcal{L}_{id} = \mathbb{E}\left[-\log p(y \mid x)\right]$$
where $x$ is a real image in the training data set and $p(y \mid x)$ is the probability that $x$ is predicted as its true identity label $y$;
the generative adversarial network and the pedestrian re-identification network are trained jointly by optimizing an overall objective that is the weighted sum of the losses:
[Overall objective: equation image FDA0003767448000000054 in the original publication]
where $\mathcal{L}_{adv}$ is the adversarial loss, which ensures that the generator produces visually realistic images, and $\lambda_{id}$, $\lambda_{rec}$ and $\lambda_{tri}$ are hyperparameters that balance the corresponding loss terms.
8. The pedestrian re-identification method based on an identity-migration generative adversarial network according to claim 7, wherein step 4 further comprises:
because the generative adversarial network does not create new identities when it generates images, a two-stage training scheme is adopted for the pedestrian re-identification model to prevent over-fitting; the first stage performs joint training with the overall objective, and the second stage introduces the LSRO method to fine-tune the model further; LSRO reduces the likelihood of over-fitting by assigning a uniformly distributed label to each generated image, defined as follows:
$$q_{LSRO}(k \mid \hat{x}) = \frac{1}{K}$$
where $\hat{x}$ denotes a generated image and $k \in [1, K]$, so $q_{LSRO}(k \mid \hat{x})$ states that the probability of the generated image $\hat{x}$ belonging to each identity class is $1/K$; both real and generated images are trained with the ID loss, and the losses of real and generated images are unified as:
$$\mathcal{L}_{LSRO} = -(1 - Z) \log p(y) - \frac{Z}{K} \sum_{k=1}^{K} \log p(k)$$
for real images, $Z = 0$; for generated images, $Z = 1$.
CN202210890765.1A 2022-07-27 2022-07-27 Pedestrian re-recognition method based on identity migration generation countermeasure network Active CN115205903B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210890765.1A CN115205903B (en) 2022-07-27 2022-07-27 Pedestrian re-recognition method based on identity migration generation countermeasure network


Publications (2)

Publication Number Publication Date
CN115205903A true CN115205903A (en) 2022-10-18
CN115205903B CN115205903B (en) 2023-05-23

Family

ID=83583415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210890765.1A Active CN115205903B (en) 2022-07-27 2022-07-27 Pedestrian re-recognition method based on identity migration generation countermeasure network

Country Status (1)

Country Link
CN (1) CN115205903B (en)


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110659586A (en) * 2019-08-31 2020-01-07 电子科技大学 Cross-view gait recognition method based on identity maintenance cyclic generation type countermeasure network
CN110688966A (en) * 2019-09-30 2020-01-14 华东师范大学 Semantic-guided pedestrian re-identification method
CN110688897A (en) * 2019-08-23 2020-01-14 深圳久凌软件技术有限公司 Pedestrian re-identification method and device based on joint judgment and generation learning
CN111126155A (en) * 2019-11-25 2020-05-08 天津师范大学 Pedestrian re-identification method for generating confrontation network based on semantic constraint
CN111666851A (en) * 2020-05-28 2020-09-15 大连理工大学 Cross domain self-adaptive pedestrian re-identification method based on multi-granularity label
WO2020186914A1 (en) * 2019-03-20 2020-09-24 北京沃东天骏信息技术有限公司 Person re-identification method and apparatus, and storage medium
CN112949608A (en) * 2021-04-15 2021-06-11 南京邮电大学 Pedestrian re-identification method based on twin semantic self-encoder and branch fusion
CN113592982A (en) * 2021-09-29 2021-11-02 北京奇艺世纪科技有限公司 Identity migration model construction method and device, electronic equipment and readable storage medium
WO2021258920A1 (en) * 2020-06-24 2021-12-30 百果园技术(新加坡)有限公司 Generative adversarial network training method, image face swapping method and apparatus, and video face swapping method and apparatus


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HUA GAO ET AL.: "Part Semantic Segmentation Aware Representation Learning for Person Re-Identification" *
游文婧: "基于局部特征和对抗生成的行人重识别算法研究及应用" *
高靖宇: "基于深度网络的行人识别方法研究" *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116276956A (en) * 2022-12-01 2023-06-23 北京科技大学 Method and device for simulating and learning operation skills of customized medicine preparation robot
CN116276956B (en) * 2022-12-01 2023-12-08 北京科技大学 Method and device for simulating and learning operation skills of customized medicine preparation robot
CN117351522A (en) * 2023-12-06 2024-01-05 云南联合视觉科技有限公司 Pedestrian re-recognition method based on style injection and cross-view difficult sample mining

Also Published As

Publication number Publication date
CN115205903B (en) 2023-05-23

Similar Documents

Publication Publication Date Title
CN108846358B (en) Target tracking method for feature fusion based on twin network
Masoud et al. A method for human action recognition
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
CN108537743A (en) Face image enhancement method based on generative adversarial network
CN108960059A (en) Video action recognition method and device
CN108198207A (en) Multiple mobile object tracking based on improved Vibe models and BP neural network
CN112418095A (en) Facial expression recognition method and system combined with attention mechanism
CN112784736B (en) Character interaction behavior recognition method based on multi-modal feature fusion
CN108363973B (en) Unconstrained 3D expression migration method
CN115205903B (en) Pedestrian re-identification method based on identity-migration generative adversarial network
CN107016689A (en) Scale-adaptive correlation filtering hedge target tracking method
CN113963032A (en) Twin network structure target tracking method fusing target re-identification
CN112364791B (en) Pedestrian re-identification method and system based on generative adversarial network
Olague et al. Evolving head tracking routines with brain programming
CN114782977B (en) Pedestrian re-recognition guiding method based on topology information and affinity information
CN113870157A (en) SAR image synthesis method based on cycleGAN
CN112801019B (en) Method and system for eliminating re-identification deviation of unsupervised vehicle based on synthetic data
Li et al. Foldover features for dynamic object behaviour description in microscopic videos
AU2020102476A4 (en) A method of Clothing Attribute Prediction with Auto-Encoding Transformations
Tian et al. End-to-end thorough body perception for person search
Duan et al. An approach to dynamic hand gesture modeling and real-time extraction
Liu et al. Fast tracking via spatio-temporal context learning based on multi-color attributes and pca
Gong et al. Person re-identification based on two-stream network with attention and pose features
Karmakar et al. Pose invariant person re-identification using robust pose-transformation gan
CN112183215B (en) Human eye positioning method and system combining multi-feature cascading SVM and human eye template

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant