CN111428650A - Pedestrian re-identification method based on SP-PGGAN style migration - Google Patents

Pedestrian re-identification method based on SP-PGGAN style migration

Info

Publication number
CN111428650A
CN111428650A (application CN202010226128.5A)
Authority
CN
China
Prior art keywords
pedestrian
discriminator
pggan
picture
generator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010226128.5A
Other languages
Chinese (zh)
Other versions
CN111428650B (en)
Inventor
孙艳丰 (Sun Yanfeng)
胡芸萍 (Hu Yunping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202010226128.5A priority Critical patent/CN111428650B/en
Publication of CN111428650A publication Critical patent/CN111428650A/en
Application granted granted Critical
Publication of CN111428650B publication Critical patent/CN111428650B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The invention provides a pedestrian re-identification method based on SP-PGGAN style migration. The method comprises the following steps: constructing an SP-PGGAN model based on CycleGAN; inputting the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the SP-PGGAN model simultaneously for training, so that the generator G produces the migrated version of the labeled training set; training a classification network on the migrated labeled training set with the pedestrian re-identification model IDE to obtain a trained IDE model; and inputting the test set of the unlabeled pedestrian re-identification data set into the trained IDE model to perform pedestrian re-identification on the unlabeled data set. Because the SP-PGGAN migration model designed by the invention is more accurate in the style-migration process, it can substantially improve pedestrian re-identification on unlabeled data sets.

Description

Pedestrian re-identification method based on SP-PGGAN style migration
Technical Field
The invention belongs to the field of computer vision, and particularly relates to deep learning, adversarial networks, image processing, feature extraction, and related technologies. The pedestrian re-identification method based on SP-PGGAN style migration realizes style migration from a labeled data set to an unlabeled data set, and a pedestrian re-identification network trained on the migrated labeled data improves re-identification on the unlabeled data set.
Background
Pedestrian re-identification (Person Re-identification) is a technique that uses computer vision to determine whether a particular pedestrian is present in an image or video sequence. It is widely considered a sub-problem of image retrieval: given a monitored pedestrian image, the task is to retrieve images of that pedestrian across devices. It aims to compensate for the visual limitations of fixed cameras, can be combined with pedestrian detection and tracking techniques, and is widely applicable to intelligent video surveillance, intelligent security, and related fields. Pedestrian re-identification has gained increasing attention in computer vision in recent years, with broad application prospects including pedestrian retrieval, pedestrian tracking, street event detection, and pedestrian action and behavior analysis. For the challenge of difficult data annotation, the invention provides an effective migration model so that a model trained on labeled data can be better applied to an unlabeled data set, improving the accuracy of pedestrian re-identification on unlabeled data.
Most existing approaches to unlabeled pedestrian re-identification data train on a labeled data set and then use the trained model directly to test on the unlabeled data set. Because the underlying distributions of the two data sets differ, the re-identification accuracy of such direct testing is low. In recent years, with the development of adversarial networks and the progress of transfer techniques, style migration with unpaired data sets has become possible, and even data sets with different underlying distributions can be converted in style well. Against this background, how to perform more accurate migration on the basis of CycleGAN is one of the hot spots of image recognition research and has broad application prospects.
Disclosure of Invention
Aiming at the challenge of difficult data annotation in pedestrian re-identification, the invention uses deep learning techniques to propose a new migration model, SP-PGGAN, and performs pedestrian re-identification on the basis of this migration model. Most existing approaches to unlabeled pedestrian re-identification train on a labeled data set and then use the trained model directly to test on the unlabeled data set; because the underlying distributions of the two data sets differ, the accuracy of such direct testing is low. The method first constructs the SP-PGGAN model, then inputs the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the constructed SP-PGGAN model simultaneously for style migration, trains IDE on the resulting migrated labeled training set, and then performs pedestrian re-identification on the test set of the unlabeled pedestrian re-identification data set. The main flow of the invention is shown in FIG. 1 and can be divided into three steps: constructing the SP-PGGAN model, style migration with the SP-PGGAN model, and pedestrian re-identification.
(1) Construction of SP-PGGAN model
The invention first constructs an SP-PGGAN model, whose structure is an improvement on CycleGAN. CycleGAN is essentially two mirror-symmetric GANs forming a ring network: the two GANs share two generators, and each has its own local discriminator, i.e., there are two local discriminators and two generators in total. SP-PGGAN retains the generators of CycleGAN and adds a Siamese network (twin network) after the generators to guide the generation process; at the same time, each of the two discriminators is replaced by a local discriminator and a global discriminator in parallel. FIG. 2 is a schematic comparison of the proposed SP-PGGAN model with CycleGAN.
(2) Style migration for SP-PGGAN model
In order to test better on the unlabeled test set, the invention inputs the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the constructed SP-PGGAN model simultaneously for style migration. Two properties must hold during migration: first, if an image from the labeled data set is migrated toward the unlabeled data set, the style of the migrated image should be consistent with the style of the unlabeled data set; second, the pedestrian ID information of images from the labeled data set must remain unchanged before and after migration. This ID information is not the background or the style of the image, but the pedestrian region of the image, which has a potential relationship with the ID, i.e., the pedestrian's annotation information.
(3) Implementation of pedestrian re-identification
After the style migration, the invention carries out further pedestrian re-identification experiments. The invention uses the pedestrian re-identification IDE model to train a classification network on the migrated labeled training set obtained from the SP-PGGAN style migration, yielding a trained IDE model, and then performs pedestrian re-identification on the test set of the unlabeled data set.
Drawings
In order to more clearly illustrate the technical solution of the embodiments of the present invention, the drawings required in the description of the embodiments will be briefly introduced as follows:
FIG. 1 is a flow chart of the pedestrian re-identification method based on SP-PGGAN style migration;
FIG. 2 is a graph comparing the SP-PGGAN model with the CycleGAN model;
FIG. 3 is a schematic diagram of a CycleGAN model network;
FIG. 4 is a network architecture diagram based on SP-PGGAN style migration;
FIG. 5 is a network architecture diagram of the pedestrian re-identification method (IDE).
Advantageous Effects
By adding a Siamese network on top of CycleGAN, the invention makes pairs of pictures with similar pedestrian information more similar and pairs with dissimilar pedestrian information more dissimilar. Meanwhile, to take both local and global information into account during pedestrian migration and improve its effect, the invention uses a global discriminator and a local discriminator together as the discriminators of SP-PGGAN. The improved migration network enables pictures generated during migration to be better used for training the classification network; experiments show that the method achieves an mAP (mean average precision) improvement of 12% over direct transfer without migration.
Detailed Description
In light of the above description, a specific implementation flow is as follows, but the scope of protection of this patent is not limited to this implementation flow.
Step 1: construction of SP-PGGAN model
The invention first constructs the SP-PGGAN migration model. The model can be used for the migration process of pedestrian re-identification but is not limited to it; pedestrian re-identification is one application scenario of the invention, and the model generalizes to other scenarios involving style migration.
The SP-PGGAN model is an improvement on CycleGAN. CycleGAN is bidirectional: two mirror-symmetric GANs form a ring network, the two GANs share two generators, and each has a local discriminator, i.e., two local discriminators and two generators in total. As shown in FIG. 3, G and F are the two generators of CycleGAN, and D_x and D_y are its two local discriminators. In this model, the purpose of the generator is to make the pictures generated during training more and more realistic in an attempt to fool the discriminator, while the discriminator becomes better and better at distinguishing real pictures from fake ones.
As shown in FIG. 4, the SP-PGGAN model proposed by the invention is composed of two parts, a generator and a discriminator. The generator of SP-PGGAN consists of two parts. One part is the generator G and generator F of CycleGAN; from shallow to deep, each generator contains two convolution layers with stride 2, six residual blocks, and two deconvolution layers with stride 1/2. The other part is a Siamese network, which from shallow to deep contains four convolution layers with stride 2, a max-pooling layer with stride 2, and a fully connected layer. The SP-PGGAN discriminator includes a discriminator D_T and a discriminator D_S, where D_T comprises a local discriminator and a global discriminator, and D_S likewise comprises a local discriminator and a global discriminator. D_T and D_S have the same structure but do not share parameters. Within D_T, the global discriminator and the local discriminator share four convolution layers with stride 2 and split into two paths after the fourth layer: one path is the global discriminator of D_T, which finally outputs a binary value judging whether the whole picture is real or fake; the other path is the local discriminator of D_T, which outputs a 256-dimensional vector through a fully connected layer, each element judging whether the image patch at the corresponding position is real or fake. The specific network structure of the SP-PGGAN model is shown in Table 1.
TABLE 1 (the layer-by-layer network structure of the SP-PGGAN model; presented as an image in the original publication and not reproduced here)
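As an illustration of this architecture, the following sketch builds the three components with tf.keras. It is a minimal sketch under stated assumptions: filter counts, kernel sizes, and the input resolution are illustrative choices, and the stride-1/2 deconvolutions are realized as stride-2 transposed convolutions; the authoritative layer specification is Table 1, which is an image in the original publication.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=256):
    """One of the six residual blocks inside each generator."""
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.add([x, y])

def build_generator(img_shape=(256, 128, 3)):
    """Generator G (or F): two stride-2 convs, six residual blocks,
    two stride-1/2 deconvs (stride-2 transposed convolutions)."""
    inp = layers.Input(img_shape)
    x = layers.Conv2D(128, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(256, 3, strides=2, padding="same", activation="relu")(x)
    for _ in range(6):
        x = residual_block(x, 256)
    x = layers.Conv2DTranspose(128, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(3, 3, strides=2, padding="same", activation="tanh")(x)
    return tf.keras.Model(inp, x)

def build_siamese_branch(img_shape=(256, 128, 3), embed_dim=128):
    """Siamese network: four stride-2 convs, one stride-2 max-pool, one FC layer."""
    inp = layers.Input(img_shape)
    x = inp
    for filters in (64, 128, 256, 512):
        x = layers.Conv2D(filters, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(pool_size=2, strides=2)(x)
    x = layers.Dense(embed_dim)(layers.Flatten()(x))
    return tf.keras.Model(inp, x)

def build_discriminator(img_shape=(256, 128, 3)):
    """D_T (or D_S): four shared stride-2 convs, then a global head (one value
    for the whole picture) and a local head (256 values, one per image patch)."""
    inp = layers.Input(img_shape)
    x = inp
    for filters in (64, 128, 256, 512):
        x = layers.Conv2D(filters, 4, strides=2, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
    flat = layers.Flatten()(x)
    global_out = layers.Dense(1, name="global")(flat)    # whole-image real/fake
    local_out = layers.Dense(256, name="local")(flat)    # per-patch real/fake
    return tf.keras.Model(inp, [global_out, local_out])
```

The two heads sharing one convolutional trunk mirrors the split-after-the-fourth-layer design described above; D_T and D_S would be two separate instances of `build_discriminator`, so their parameters are not shared.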
Step 2: style migration for SP-PGGAN model
After the model is built, in order to test better on the unlabeled test set, the method inputs the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the built SP-PGGAN model simultaneously for style migration. As shown in FIG. 4, during migration in the forward direction, a picture x from the labeled data set is passed through the generator G to produce a picture G(x), and G(x) is passed through the other generator F to produce a picture F(G(x)). One part of the generator loss is the difference loss between x and F(G(x)); another part comes from the Siamese network, which pulls the embeddings of x and G(x) closer together and pushes the embeddings of G(x) and a picture y from the unlabeled data set further apart; this part of the loss is the distance loss of the two sample pairs. The discriminator loss is the cross-entropy loss of the generated picture G(x) and y under the global discriminator and the local discriminator of D_T. Likewise, in the reverse direction, a picture y from the unlabeled data set is passed through the generator F to produce a picture F(y), and F(y) is passed through the other generator G to produce a picture G(F(y)). One part of the generator loss is the difference loss between y and G(F(y)); the other part comes from the Siamese network, which pulls the embeddings of y and F(y) closer together and pushes the embeddings of F(y) and the labeled picture x further apart; this part of the loss is the distance loss of the two sample pairs. The discriminator loss is the cross-entropy loss of the generated picture F(y) and x under the global discriminator and the local discriminator of D_S. During training, the discriminators are first held fixed while the generator parameters are trained; then the generators are fixed and the discriminator parameters are trained. This process is repeated so that the generators and discriminators evolve step by step. The overall loss of this process is shown in equation (1):
L = L_Tadv + L_Sadv + L_PTadv + L_PSadv + γ1·L_cyc + γ2·L_ide + γ3·L_con    (1)
L_Tadv and L_Sadv are the losses of the two mirror-symmetric local discriminators D_T and D_S in the SP-PGGAN model; both are cross-entropy losses, as shown in equations (2) and (3). Here x and y denote pictures from the labeled and unlabeled training sets respectively, P_x and P_y denote the distributions that x and y obey, G and D_T denote the generator and local discriminator of the forward direction, and F and D_S denote the generator and local discriminator of the reverse direction. G(x) denotes the image generated from the labeled training image x by the generator G; D_T(G(x)) and D_T(y) denote the judgment results of G(x) and y by the local discriminator D_T. F(y) denotes the picture generated from the unlabeled training image y by the generator F; D_S(x) and D_S(F(y)) denote the judgment results of x and F(y) by the local discriminator D_S.
L_Tadv = E_{y~P_y}[log D_T(y)] + E_{x~P_x}[log(1 - D_T(G(x)))]    (2)
L_Sadv = E_{x~P_x}[log D_S(x)] + E_{y~P_y}[log(1 - D_S(F(y)))]    (3)
L_PTadv and L_PSadv are the losses of the two mirror-symmetric global discriminators in the SP-PGGAN model; their form is the same as that of the local discriminator losses, i.e., L_PTadv has the same form as L_Tadv and L_PSadv the same form as L_Sadv, computed with the global discriminator outputs.
L_cyc is the sum of the losses of the two mirror-symmetric generators in the SP-PGGAN model, i.e., the sum of the Euclidean distances between x and F(G(x)) and between y and G(F(y)), as shown in equation (4). For the forward direction, x denotes a picture in the labeled training set, G(x) the picture obtained after the generator G, and F(G(x)) the picture obtained after the generators G and F in sequence; the Euclidean distance between x and the generated picture F(G(x)) is the generation loss. In the reverse direction, y denotes a picture in the unlabeled training set, F(y) the picture obtained after the generator F, and G(F(y)) the picture obtained after the generators F and G in sequence; the Euclidean distance between y and the generated picture G(F(y)) is the generation loss.
L_cyc = E_{x~P_x}[||F(G(x)) - x||_2] + E_{y~P_y}[||G(F(y)) - y||_2]    (4)
L_ide is the color-consistency loss in the forward and reverse directions, as shown in equation (5). For the forward direction, x denotes a picture in the labeled training set and F(x) the picture generated by the generator F; the Euclidean distance between x and F(x) is the color-consistency loss of the generation process. For the reverse direction, y denotes a picture in the unlabeled training set and G(y) the picture generated by the generator G; the Euclidean distance between y and G(y) is the color-consistency loss of the generation process.
L_ide = E_{x~P_x}[||F(x) - x||_2] + E_{y~P_y}[||G(y) - y||_2]    (5)
L_con is the generation loss of the Siamese network, composed of the Euclidean distance of the positive sample pair and the Euclidean distance of the negative sample pair, as shown in equation (6). x1 and x2 denote the two input pictures of the Siamese network, and i denotes the label of the input pair. When i = 1, (x1, x2) is a positive pair, i.e., two pictures of the same pedestrian, namely x and G(x) in the forward direction and y and F(y) in the reverse direction. When i = 0, (x1, x2) is a negative pair, i.e., two pictures of different pedestrians, namely y and G(x) in the forward direction and x and F(y) in the reverse direction. d denotes the Euclidean distance between the two input pictures, and m ∈ [0, 2] defines the range of the margin m.
L_con(i, x1, x2) = (1 - i)·{max(0, m - d)}^2 + i·d^2    (6)
In equation (1), γ1, γ2 and γ3 are parameters that control the relative importance of L_cyc, L_ide and L_con respectively; their values lie in the range [1, 10].
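The following is a hedged sketch of equations (1) through (6) as TensorFlow code. The helper names, tensor shapes, and the per-sample flattening used for the Euclidean distances are assumptions made for illustration; the patent specifies only the mathematical form.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def adversarial_loss(d_real_logits, d_fake_logits):
    """Cross-entropy GAN loss of eqs. (2)-(3); L_PTadv / L_PSadv have the
    same form, applied to the global head instead of the local head."""
    return (bce(tf.ones_like(d_real_logits), d_real_logits)
            + bce(tf.zeros_like(d_fake_logits), d_fake_logits))

def euclidean(a, b):
    """Mean Euclidean distance between two image batches, flattened per sample."""
    diff = tf.reshape(a - b, [tf.shape(a)[0], -1])
    return tf.reduce_mean(tf.norm(diff, axis=1))

def cycle_loss(x, f_g_x, y, g_f_y):
    """Eq. (4): distances between x and F(G(x)), and between y and G(F(y))."""
    return euclidean(x, f_g_x) + euclidean(y, g_f_y)

def identity_loss(x, f_x, y, g_y):
    """Eq. (5): color consistency between x and F(x), and between y and G(y)."""
    return euclidean(x, f_x) + euclidean(y, g_y)

def contrastive_loss(i, e1, e2, m=2.0):
    """Eq. (6): e1, e2 are Siamese embeddings; i = 1 for a positive pair,
    i = 0 for a negative pair; m is the margin, m in [0, 2]."""
    d = tf.norm(e1 - e2, axis=1)
    return tf.reduce_mean((1.0 - i) * tf.square(tf.maximum(0.0, m - d))
                          + i * tf.square(d))

def total_loss(l_tadv, l_sadv, l_ptadv, l_psadv, l_cyc, l_ide, l_con,
               gamma1=10.0, gamma2=5.0, gamma3=3.0):
    """Eq. (1); the default gamma values are those the text reports using."""
    return (l_tadv + l_sadv + l_ptadv + l_psadv
            + gamma1 * l_cyc + gamma2 * l_ide + gamma3 * l_con)
```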
The invention trains SP-PGGAN using Market-1501 and DukeMTMC-reID as training sets. Market-1501 contains 1,501 pedestrians, with 12,936 training images and 19,732 gallery images; 751 pedestrians are used for training and 750 for testing, and each pedestrian is captured by at most six cameras. DukeMTMC-reID contains 34,183 images of 1,404 pedestrians: 702 pedestrians are used for training and the remaining 702 for testing, with 2,228 query images and 17,661 gallery images.
The invention uses TensorFlow as the framework and Market-1501 and DukeMTMC-reID as training sets to train SP-PGGAN; no ID information is used during training. In all migration experiments, we set γ1 = 10, γ2 = 5, γ3 = 3 in equation (1) and m = 2 in equation (6). The initial learning rate of SP-PGGAN is set to 0.0002, and the model stops training after 5 epochs. In the testing phase, the generator G is used for Market-1501 → DukeMTMC-reID migration and the generator F for DukeMTMC-reID → Market-1501 migration.
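As a hedged illustration of the alternating scheme described above, the sketch below combines the components from the two previous sketches into one training step. The choice of Adam and the exact pairing of Siamese inputs follow the description and common practice, not a verbatim specification from the patent.

```python
import tensorflow as tf

# Components from the earlier sketches, assumed in scope.
G, F = build_generator(), build_generator()        # labeled->unlabeled, unlabeled->labeled
S = build_siamese_branch()                         # shared Siamese branch
D_T, D_S = build_discriminator(), build_discriminator()
gen_opt = tf.keras.optimizers.Adam(2e-4)           # lr 0.0002 per the text; Adam is an assumption
disc_opt = tf.keras.optimizers.Adam(2e-4)

@tf.function
def train_step(x, y):
    # Phase 1: discriminators held fixed, generators + Siamese branch updated.
    with tf.GradientTape() as tape:
        g_x, f_y = G(x, training=True), F(y, training=True)
        f_g_x, g_f_y = F(g_x, training=True), G(f_y, training=True)
        # Generators try to get both heads of D_T / D_S to output "real".
        adv = tf.add_n([bce(tf.ones_like(l), l) for l in D_T(g_x) + D_S(f_y)])
        l_cyc = cycle_loss(x, f_g_x, y, g_f_y)
        l_ide = identity_loss(x, F(x, training=True), y, G(y, training=True))
        # Positive pairs: (x, G(x)), (y, F(y)); negative pairs: (y, G(x)), (x, F(y)).
        l_con = (contrastive_loss(1.0, S(x), S(g_x)) + contrastive_loss(1.0, S(y), S(f_y))
                 + contrastive_loss(0.0, S(y), S(g_x)) + contrastive_loss(0.0, S(x), S(f_y)))
        gen_loss = adv + 10.0 * l_cyc + 5.0 * l_ide + 3.0 * l_con   # gammas from the text
    g_vars = G.trainable_variables + F.trainable_variables + S.trainable_variables
    gen_opt.apply_gradients(zip(tape.gradient(gen_loss, g_vars), g_vars))
    # Phase 2: generators held fixed, discriminators updated (eqs. (2)-(3), both heads).
    with tf.GradientTape() as tape:
        g_x, f_y = G(x, training=False), F(y, training=False)
        disc_loss = (tf.add_n([adversarial_loss(r, f) for r, f in zip(D_T(y), D_T(g_x))])
                     + tf.add_n([adversarial_loss(r, f) for r, f in zip(D_S(x), D_S(f_y))]))
    d_vars = D_T.trainable_variables + D_S.trainable_variables
    disc_opt.apply_gradients(zip(tape.gradient(disc_loss, d_vars), d_vars))
```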
Step 3: Implementation of pedestrian re-identification
After the style migration, the invention carries out further pedestrian re-identification experiments. Since the invention's main contribution lies in the style-migration process, this second stage adopts the basic IDE (ID-discriminative Embedding) method, which can be replaced by any pedestrian re-identification method.
In the pedestrian re-identification experiments, the invention uses the IDE model to train a classification network on the labeled training set obtained after the SP-PGGAN style migration, yielding a trained classification network. The test set of the unlabeled data set is then fed into the trained classification network to perform pedestrian re-identification. As shown in FIG. 5, the style-migrated pictures G(x) are used to train the IDE model, yielding a trained IDE network. During testing, the query pedestrian pictures of the unlabeled data set and the gallery set are fed into the trained IDE network together, and the pictures showing the same pedestrian as the query are retrieved from the gallery, completing the pedestrian re-identification process.
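As an illustration of this retrieval step, a minimal sketch follows. It assumes feature vectors have already been extracted from the trained IDE network (e.g., from its penultimate layer) for the query and gallery images; the function name is hypothetical.

```python
import numpy as np

def rank_gallery(query_feat: np.ndarray, gallery_feats: np.ndarray) -> np.ndarray:
    """Return gallery indices sorted from most to least similar to the query."""
    dists = np.linalg.norm(gallery_feats - query_feat, axis=1)  # Euclidean distance
    return np.argsort(dists)                                    # rank-1 match comes first

# Usage: ranking = rank_gallery(q, g); rank-1 and mAP are computed
# from such rankings over all query images.
```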
As shown in FIG. 5, the IDE in the invention uses ResNet-50 as the baseline model and adjusts the output dimension of the last fully connected layer to the number of pedestrians in the training data set. IDE trains the network as a classification task: the first five stages are convolutional, layers 6 and 7 are fully connected layers of 1,024 neurons each, and layer 8 is the classification layer whose dimension equals the number of pedestrian IDs.
The present invention fine-tunes a ResNet-50 pre-trained on ImageNet on the migrated training set. The output dimension of the final fully connected layer is adjusted to 751 and 702 to apply to Market-1501 and DukeMTMC-reID respectively. In this step of the experiment, the invention trains the CNN model with mini-batch SGD on a 1080 Ti GPU. The batch size, maximum number of epochs, momentum and gamma are set to 16, 50, 0.9 and 0.1 respectively. The initial learning rate is 0.001, decaying to 0.0001 after 40 epochs.
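A hedged sketch of this fine-tuning setup follows. The two 1,024-neuron layers mirror the IDE description above; the input resolution, the data pipeline, and the learning-rate decay callback are assumptions or omissions for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_ide(num_ids=751, img_shape=(256, 128, 3)):
    """IDE classifier: ImageNet-pretrained ResNet-50 backbone, two 1,024-neuron
    FC layers, and a final classification layer sized to the number of IDs
    (751 for Market-1501, 702 for DukeMTMC-reID)."""
    backbone = tf.keras.applications.ResNet50(include_top=False, weights="imagenet",
                                              input_shape=img_shape, pooling="avg")
    x = layers.Dense(1024, activation="relu")(backbone.output)
    x = layers.Dense(1024, activation="relu")(x)
    out = layers.Dense(num_ids, activation="softmax")(x)
    return tf.keras.Model(backbone.input, out)

ide = build_ide(num_ids=751)
sgd = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
ide.compile(optimizer=sgd, loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# Train as a classification task on the migrated labeled training set;
# decay the learning rate to 0.0001 after 40 epochs (e.g., via a scheduler callback).
```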
In the invention, to further improve re-identification on the target data set, a feature-pooling method called local max pooling (LMP) is introduced. It can be applied directly to the trained IDE model and reduces the influence of erroneous pictures produced during migration on the re-identification process. The original ResNet-50 applies global average pooling after the fifth convolutional stage. With LMP, the feature map from the fifth convolutional stage is first divided into P parts along the horizontal direction, then global max or average pooling is applied to each part separately, and finally the pooled results are concatenated. This procedure can be used directly at test time. In the experiments, the invention compares global max pooling with global average pooling and selects the better-performing one for the LMP concatenation.
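A minimal sketch of LMP at test time, assuming NHWC feature maps whose height is divisible by P; the function name and the choice P = 2 are illustrative assumptions.

```python
import tensorflow as tf

def lmp_descriptor(feature_map, parts=2, mode="max"):
    """feature_map: [N, H, W, C] output of ResNet-50's fifth conv stage.
    Split into `parts` horizontal stripes, pool each globally, concatenate."""
    chunks = tf.split(feature_map, num_or_size_splits=parts, axis=1)  # split along height
    pool = tf.reduce_max if mode == "max" else tf.reduce_mean
    pooled = [pool(c, axis=[1, 2]) for c in chunks]   # global pooling per part -> [N, C]
    return tf.concat(pooled, axis=1)                  # [N, parts * C] descriptor
```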
The results of the embodiments of the invention are shown in Table 2.
TABLE 2 (rank-1 and mAP of CycleGAN, SP-PGGAN (m = 2), and SP-PGGAN with LMP on Market-1501 and DukeMTMC-reID; presented as an image in the original publication and not reproduced here)
As can be seen from Table 2, migration with SP-PGGAN (m = 2) is more effective for pedestrian re-identification than migration with CycleGAN: rank-1 and mAP tested on Market-1501 with SP-PGGAN (m = 2) are 11.2% and 6.2% higher than with CycleGAN respectively, and on DukeMTMC-reID they are 6.7% and 3.5% higher respectively. This demonstrates the effectiveness of the proposed SP-PGGAN model, and it also shows that applying LMP on top of SP-PGGAN brings a further improvement in pedestrian re-identification.

Claims (6)

1. A style migration method based on SP-PGGAN, characterized by comprising an SP-PGGAN model used for style migration between a labeled pedestrian re-identification data set and an unlabeled pedestrian re-identification data set, the model structure being an improvement on CycleGAN: CycleGAN consists of two mirror-symmetric GANs forming a ring network, the two GANs sharing a generator G and a generator F and each having a local discriminator, D_X and D_Y respectively; SP-PGGAN retains the generators of CycleGAN and adds a Siamese network after the generators to guide the generation process, and at the same time replaces the local discriminators D_X and D_Y of CycleGAN with parallel local and global discriminators.
2. The SP-PGGAN-based style migration method according to claim 1, wherein: the generator of SP-PGGAN consists of two parts: one part is the generator G and generator F of CycleGAN, each of which, from shallow to deep, contains two convolution layers with stride 2, six residual blocks and two deconvolution layers with stride 1/2; the other part is a Siamese network, which from shallow to deep contains four convolution layers with stride 2, a max-pooling layer with stride 2 and a fully connected layer;
the SP-PGGAN discriminator includes a discriminator D_T and a discriminator D_S, where D_T comprises a local discriminator and a global discriminator, and D_S likewise comprises a local discriminator and a global discriminator; D_T and D_S have the same structure but do not share parameters; within D_T, the global discriminator and the local discriminator share four convolution layers with stride 2 and split into two paths after the fourth layer: one path is the global discriminator of D_T; the other path is the local discriminator of D_T, which outputs a 256-dimensional vector through a fully connected layer.
3. The SP-PGGAN-based style migration method according to claim 1, wherein the training process of the SP-PGGAN comprises the following steps:
for the forward direction, the generation process is that a picture x from the labeled training set generates a picture G(x) through the generator G, and G(x) generates a picture F(G(x)) through the generator F; then the pictures x and G(x), and the picture G(x) and a picture y from the unlabeled training set, are respectively input into the Siamese network, which is used to improve the preservation of the pedestrian ID information of the labeled data during style migration; the discrimination process is that the picture G(x) and the picture y from the unlabeled training set are simultaneously input into the global discriminator and the local discriminator of D_T, where the global discriminator judges whether the whole picture is real or fake and the local discriminator judges whether local patches of the picture are real or fake;
in the reverse direction, the generation process is that the picture y from the unlabeled training set generates a picture F(y) through the generator F, and F(y) generates a picture G(F(y)) through the generator G; then the pictures y and F(y), and the picture F(y) and the picture x from the labeled training set, are respectively input into the Siamese network; the discrimination process is that the picture F(y) and the picture x from the labeled training set are simultaneously input into the global discriminator and the local discriminator of D_S to train the discriminators;
during training, the discriminators are first held fixed while the parameters of the generation process are trained; then the generators are fixed and the parameters of the discriminators are trained; this process is repeated so that the generators and discriminators evolve step by step.
4. The SP-PGGAN-based style migration method according to claim 1, wherein the training process of the SP-PGGAN model is to input the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the SP-PGGAN model simultaneously for style migration, obtaining the migrated training set of the labeled pedestrian re-identification data set; wherein the loss function is:
L = L_Tadv + L_Sadv + L_PTadv + L_PSadv + γ1·L_cyc + γ2·L_ide + γ3·L_con
L_Tadv and L_Sadv are the losses of the two mirror-symmetric local discriminators D_T and D_S in the SP-PGGAN model, both cross-entropy losses; L_PTadv and L_PSadv are the losses of the two mirror-symmetric global discriminators in the SP-PGGAN model, of the same form as the local discriminator losses and likewise cross-entropy losses; L_cyc is the sum of the losses of the two mirror-symmetric generators in the SP-PGGAN model; L_ide denotes the color-consistency loss in the forward and reverse directions; L_con is the generation loss of the Siamese network; γ1, γ2 and γ3 are parameters controlling the relative importance of L_cyc, L_ide and L_con respectively, with values in the range [1, 10].
5. The SP-PGGAN-based style migration method according to claim 4, wherein:
the L_Tadv is calculated as follows:
L_Tadv = E_{y~P_y}[log D_T(y)] + E_{x~P_x}[log(1 - D_T(G(x)))]
wherein G(x) denotes the image generated from the labeled training image x by the generator G, and D_T(G(x)) and D_T(y) denote the judgment results of G(x) and y by the local discriminator D_T;
the L_Sadv is calculated as follows:
L_Sadv = E_{x~P_x}[log D_S(x)] + E_{y~P_y}[log(1 - D_S(F(y)))]
wherein F(y) denotes the picture generated from the unlabeled training image y by the generator F, and D_S(x) and D_S(F(y)) denote the judgment results of x and F(y) by the local discriminator D_S;
the L_cyc is the sum of the Euclidean distances between x and F(G(x)) and between y and G(F(y));
the L_ide is calculated as follows:
L_ide = E_{x~P_x}[||F(x) - x||_2] + E_{y~P_y}[||G(y) - y||_2]
wherein F(x) denotes the picture generated from x by the generator F, and G(y) denotes the picture generated from y by the generator G;
the L_con is calculated as follows:
L_con(i, x1, x2) = (1 - i)·{max(0, m - d)}^2 + i·d^2
wherein x1 and x2 denote the two input pictures of the Siamese network, and i denotes the label of the input pair: when i = 1, (x1, x2) is a positive pair, i.e., two pictures of the same pedestrian, namely x and G(x) in the forward direction and y and F(y) in the reverse direction; when i = 0, (x1, x2) is a negative pair, i.e., two pictures of different pedestrians, namely y and G(x) in the forward direction and x and F(y) in the reverse direction; d denotes the Euclidean distance between the two input pictures, and m ∈ [0, 2] defines the range of the margin.
6. A pedestrian re-identification method based on SP-PGGAN style migration, characterized by:
A. inputting the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the SP-PGGAN model simultaneously for training, obtaining through the generator G the migrated training set of the labeled pedestrian re-identification data set;
B. training a classification network on the migrated labeled training set with the pedestrian re-identification model IDE to obtain a trained IDE model, and then inputting the test set of the unlabeled pedestrian re-identification data set into the trained IDE model to perform pedestrian re-identification on the unlabeled data set.
CN202010226128.5A 2020-03-26 2020-03-26 Pedestrian re-recognition method based on SP-PGGAN style migration Active CN111428650B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010226128.5A CN111428650B (en) 2020-03-26 2020-03-26 Pedestrian re-recognition method based on SP-PGGAN style migration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010226128.5A CN111428650B (en) 2020-03-26 2020-03-26 Pedestrian re-recognition method based on SP-PGGAN style migration

Publications (2)

Publication Number Publication Date
CN111428650A true CN111428650A (en) 2020-07-17
CN111428650B CN111428650B (en) 2024-04-02

Family

ID=71548862

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010226128.5A Active CN111428650B (en) 2020-03-26 2020-03-26 Pedestrian re-recognition method based on SP-PGGAN style migration

Country Status (1)

Country Link
CN (1) CN111428650B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112232422A (en) * 2020-10-20 2021-01-15 北京大学 Target pedestrian re-identification method and device, electronic equipment and storage medium
CN113569627A (en) * 2021-06-11 2021-10-29 北京旷视科技有限公司 Human body posture prediction model training method, human body posture prediction method and device
CN113658178A (en) * 2021-10-14 2021-11-16 北京字节跳动网络技术有限公司 Tissue image identification method and device, readable medium and electronic equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109670528A (en) * 2018-11-14 2019-04-23 中国矿业大学 The data extending method for blocking strategy at random based on paired samples towards pedestrian's weight identification mission
CN110163110A (en) * 2019-04-23 2019-08-23 中电科大数据研究院有限公司 A kind of pedestrian's recognition methods again merged based on transfer learning and depth characteristic

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SATOSHI IIZUKA 等: "Globally and Locally Consistent Image Completion" *
WEIJIAN DENG 等: "Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification" *


Also Published As

Publication number Publication date
CN111428650B (en) 2024-04-02

Similar Documents

Publication Publication Date Title
CN109948425B (en) Pedestrian searching method and device for structure-aware self-attention and online instance aggregation matching
Liu et al. Leveraging unlabeled data for crowd counting by learning to rank
CN108537136A (en) The pedestrian's recognition methods again generated based on posture normalized image
CN112396027A (en) Vehicle weight recognition method based on graph convolution neural network
CN111428650A (en) Pedestrian re-identification method based on SP-PGGAN style migration
Zhang et al. Dual mutual learning for cross-modality person re-identification
CN111625667A (en) Three-dimensional model cross-domain retrieval method and system based on complex background image
Wang et al. Face anti-spoofing using transformers with relation-aware mechanism
CN110390294A (en) Target tracking method based on bidirectional long-short term memory neural network
CN108154133A (en) Human face portrait based on asymmetric combination learning-photo array method
Jiang et al. Application of a fast RCNN based on upper and lower layers in face recognition
Zhu et al. Expression recognition method combining convolutional features and Transformer.
Wu et al. Transformer fusion and pixel-level contrastive learning for RGB-D salient object detection
Song et al. Dense face network: A dense face detector based on global context and visual attention mechanism
Qian et al. URRNet: A Unified Relational Reasoning Network for Vehicle Re-Identification
Tian et al. Domain adaptive object detection with model-agnostic knowledge transferring
CN112446305A (en) Pedestrian re-identification method based on classification weight equidistant distribution loss model
CN117011883A (en) Pedestrian re-recognition method based on pyramid convolution and transducer double branches
Gong et al. Person re-identification based on two-stream network with attention and pose features
Zhu et al. Road scene layout reconstruction based on CNN and its application in traffic simulation
Ling et al. Magnetic tile surface defect detection methodology based on self-attention and self-supervised learning
CN114140524A (en) Closed loop detection system and method for multi-scale feature fusion
Wang et al. Learning a layout transfer network for context aware object detection
Radulescu et al. Modeling 3D convolution architecture for actions recognition
Wu et al. Spatial-Temporal Hypergraph Based on Dual-Stage Attention Network for Multi-View Data Lightweight Action Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant