CN111428650A - Pedestrian re-identification method based on SP-PGGAN style migration - Google Patents

Pedestrian re-identification method based on SP-PGGAN style migration

Info

Publication number
CN111428650A
CN111428650A (application CN202010226128.5A)
Authority
CN
China
Prior art keywords
pedestrian
discriminator
pggan
picture
generator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010226128.5A
Other languages
Chinese (zh)
Other versions
CN111428650B (en)
Inventor
孙艳丰 (Sun Yanfeng)
胡芸萍 (Hu Yunping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202010226128.5A priority Critical patent/CN111428650B/en
Publication of CN111428650A publication Critical patent/CN111428650A/en
Application granted granted Critical
Publication of CN111428650B publication Critical patent/CN111428650B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The invention provides a pedestrian re-identification method based on SP-PGGAN style migration. The method comprises the following steps: constructing an SP-PGGAN model based on CycleGAN; inputting the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the SP-PGGAN model simultaneously for training, so that the generator G produces the migrated version of the labeled training set; training a classification network on the migrated labeled training set with the pedestrian re-identification model IDE to obtain a trained IDE model; and inputting the test set of the unlabeled pedestrian re-identification data set into the trained IDE model to perform pedestrian re-identification on the unlabeled data set. Because the SP-PGGAN migration model designed by the invention is more accurate in the style-migration process, it can substantially improve pedestrian re-identification on unlabeled data sets.

Description

Pedestrian re-identification method based on SP-PGGAN style migration
Technical Field
The invention belongs to the field of computer vision, and particularly relates to deep learning, adversarial networks, image processing, feature extraction, and related technologies. The pedestrian re-identification method based on SP-PGGAN style migration realizes style migration from a labeled data set to an unlabeled data set, and a pedestrian re-identification network trained on the migrated labeled data improves re-identification on the unlabeled data set.
Background
Pedestrian re-identification (Person Re-identification) is a technique that uses computer vision to determine whether a particular pedestrian is present in an image or video sequence. It is widely considered a sub-problem of image retrieval: given a monitored pedestrian image, the task is to retrieve images of that pedestrian across devices. It aims to compensate for the visual limitations of fixed cameras, can be combined with pedestrian detection and tracking techniques, and is widely applicable to intelligent video surveillance, intelligent security, and related fields. Pedestrian re-identification has gained increasing attention in computer vision in recent years, with broad application prospects including pedestrian retrieval, pedestrian tracking, street event detection, and pedestrian action and behavior analysis. For the challenge of difficult data annotation, the invention provides an effective migration model so that a model trained on labeled data can be better applied to an unlabeled data set, improving the accuracy of pedestrian re-identification on unlabeled data.
Most existing approaches to unlabeled pedestrian re-identification data train on a labeled data set and then use the trained model directly to test on the unlabeled data set. Because the underlying distributions of the two data sets differ, the re-identification accuracy of such direct testing is low. In recent years, with the development of adversarial networks and the progress of transfer techniques, style migration with unpaired data sets has become possible, and even data sets with different underlying distributions can be converted in style well. Against this background, how to perform more accurate migration on the basis of CycleGAN is one of the hot spots of image recognition research and has broad application prospects.
Disclosure of Invention
Aiming at the challenge of difficult data annotation in pedestrian re-identification, the invention uses deep learning techniques to propose a new migration model, SP-PGGAN, and performs pedestrian re-identification on the basis of this migration model. Most existing approaches to unlabeled pedestrian re-identification train on a labeled data set and then use the trained model directly to test on the unlabeled data set; because the underlying distributions of the two data sets differ, the accuracy of such direct testing is low. The method first constructs the SP-PGGAN model, then inputs the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the constructed SP-PGGAN model simultaneously for style migration, trains IDE on the resulting migrated labeled training set, and then performs pedestrian re-identification on the test set of the unlabeled pedestrian re-identification data set. The main flow of the invention is shown in FIG. 1 and can be divided into three steps: constructing the SP-PGGAN model, style migration with the SP-PGGAN model, and pedestrian re-identification.
(1) Construction of SP-PGGAN model
The invention first constructs an SP-PGGAN model, whose structure is an improvement on CycleGAN. CycleGAN is essentially two mirror-symmetric GANs forming a ring network: the two GANs share two generators, and each has its own local discriminator, i.e., there are two local discriminators and two generators in total. SP-PGGAN retains the generators of CycleGAN and adds a Siamese network (twin network) after the generators to guide the generation process; at the same time, each of the two discriminators is replaced by a local discriminator and a global discriminator in parallel. FIG. 2 is a schematic comparison of the proposed SP-PGGAN model with CycleGAN.
(2) Style migration for SP-PGGAN model
In order to test better on the unlabeled test set, the invention inputs the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the constructed SP-PGGAN model simultaneously for style migration. Two properties must hold during migration: first, if an image from the labeled data set is migrated toward the unlabeled data set, the style of the migrated image should be consistent with the style of the unlabeled data set; second, the pedestrian ID information of images from the labeled data set must remain unchanged before and after migration. This ID information is not the background or the style of the image, but the pedestrian region of the image, which has a potential relationship with the ID, i.e., the pedestrian's annotation information.
(3) Implementation of pedestrian re-identification
After the style migration, the invention carries out further pedestrian re-identification experiments. The invention uses the pedestrian re-identification IDE model to train a classification network on the migrated labeled training set obtained from the SP-PGGAN style migration, yielding a trained IDE model, and then performs pedestrian re-identification on the test set of the unlabeled data set.
Drawings
In order to more clearly illustrate the technical solution of the embodiments of the present invention, the drawings required in the description of the embodiments will be briefly introduced as follows:
FIG. 1 is a flow chart of the pedestrian re-identification method based on SP-PGGAN style migration;
FIG. 2 is a graph comparing the SP-PGGAN model with the CycleGAN model;
FIG. 3 is a schematic diagram of a CycleGAN model network;
FIG. 4 is a network architecture diagram based on SP-PGGAN style migration;
FIG. 5 is a network architecture diagram of the pedestrian re-identification method (IDE).
Advantageous Effects
By adding a Siamese network on top of CycleGAN, the invention makes pairs of pictures with similar pedestrian information more similar and pairs with dissimilar pedestrian information more dissimilar. Meanwhile, to take both local and global information into account during pedestrian migration and improve its effect, the invention uses a global discriminator and a local discriminator together as the discriminators of SP-PGGAN. The improved migration network enables pictures generated during migration to be better used for training the classification network; experiments show that the method achieves an mAP (mean average precision) improvement of 12% over direct transfer without migration.
Detailed Description
In light of the above description, a specific implementation flow is as follows, but the scope of protection of this patent is not limited to this implementation flow.
Step 1: construction of SP-PGGAN model
The invention first constructs the SP-PGGAN migration model. The model can be used for the migration process of pedestrian re-identification but is not limited to it; pedestrian re-identification is one application scenario of the invention, and the model generalizes to other scenarios involving style migration.
The SP-PGGAN model is an improvement on CycleGAN. CycleGAN is bidirectional: two mirror-symmetric GANs form a ring network, the two GANs share two generators, and each has a local discriminator, i.e., two local discriminators and two generators in total. As shown in FIG. 3, G and F are the two generators of CycleGAN, and D_x and D_y are its two local discriminators. In this model, the purpose of the generator is to make the pictures generated during training more and more realistic in an attempt to fool the discriminator, while the discriminator becomes better and better at distinguishing real pictures from fake ones.
As shown in FIG. 4, the SP-PGGAN model proposed by the invention is composed of two parts, a generator and a discriminator. The generator of SP-PGGAN consists of two parts. One part is the generator G and generator F of CycleGAN; from shallow to deep, each generator contains two convolution layers with stride 2, six residual blocks, and two deconvolution layers with stride 1/2. The other part is a Siamese network, which from shallow to deep contains four convolution layers with stride 2, a max-pooling layer with stride 2, and a fully connected layer. The SP-PGGAN discriminator includes a discriminator D_T and a discriminator D_S, where D_T comprises a local discriminator and a global discriminator, and D_S likewise comprises a local discriminator and a global discriminator. D_T and D_S have the same structure but do not share parameters. Within D_T, the global discriminator and the local discriminator share four convolution layers with stride 2 and split into two paths after the fourth layer: one path is the global discriminator of D_T, which finally outputs a binary value judging whether the whole picture is real or fake; the other path is the local discriminator of D_T, which outputs a 256-dimensional vector through a fully connected layer, each element judging whether the image patch at the corresponding position is real or fake. The specific network structure of the SP-PGGAN model is shown in Table 1.
TABLE 1 (the layer-by-layer network structure of the SP-PGGAN model; presented as an image in the original publication and not reproduced here)
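As an illustration of this architecture, the following sketch builds the three components with tf.keras. It is a minimal sketch under stated assumptions: filter counts, kernel sizes, and the input resolution are illustrative choices, and the stride-1/2 deconvolutions are realized as stride-2 transposed convolutions; the authoritative layer specification is Table 1, which is an image in the original publication.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=256):
    """One of the six residual blocks inside each generator."""
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.add([x, y])

def build_generator(img_shape=(256, 128, 3)):
    """Generator G (or F): two stride-2 convs, six residual blocks,
    two stride-1/2 deconvs (stride-2 transposed convolutions)."""
    inp = layers.Input(img_shape)
    x = layers.Conv2D(128, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(256, 3, strides=2, padding="same", activation="relu")(x)
    for _ in range(6):
        x = residual_block(x, 256)
    x = layers.Conv2DTranspose(128, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(3, 3, strides=2, padding="same", activation="tanh")(x)
    return tf.keras.Model(inp, x)

def build_siamese_branch(img_shape=(256, 128, 3), embed_dim=128):
    """Siamese network: four stride-2 convs, one stride-2 max-pool, one FC layer."""
    inp = layers.Input(img_shape)
    x = inp
    for filters in (64, 128, 256, 512):
        x = layers.Conv2D(filters, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(pool_size=2, strides=2)(x)
    x = layers.Dense(embed_dim)(layers.Flatten()(x))
    return tf.keras.Model(inp, x)

def build_discriminator(img_shape=(256, 128, 3)):
    """D_T (or D_S): four shared stride-2 convs, then a global head (one value
    for the whole picture) and a local head (256 values, one per image patch)."""
    inp = layers.Input(img_shape)
    x = inp
    for filters in (64, 128, 256, 512):
        x = layers.Conv2D(filters, 4, strides=2, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
    flat = layers.Flatten()(x)
    global_out = layers.Dense(1, name="global")(flat)    # whole-image real/fake
    local_out = layers.Dense(256, name="local")(flat)    # per-patch real/fake
    return tf.keras.Model(inp, [global_out, local_out])
```

The two heads sharing one convolutional trunk mirrors the split-after-the-fourth-layer design described above; D_T and D_S would be two separate instances of `build_discriminator`, so their parameters are not shared.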
Step 2: style migration for SP-PGGAN model
After the model is built, in order to test better on the unlabeled test set, the method inputs the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the built SP-PGGAN model simultaneously for style migration. As shown in FIG. 4, during migration in the forward direction, a picture x from the labeled data set is passed through the generator G to produce a picture G(x), and G(x) is passed through the other generator F to produce a picture F(G(x)). One part of the generator loss is the difference loss between x and F(G(x)); another part comes from the Siamese network, which pulls the embeddings of x and G(x) closer together and pushes the embeddings of G(x) and a picture y from the unlabeled data set further apart; this part of the loss is the distance loss of the two sample pairs. The discriminator loss is the cross-entropy loss of the generated picture G(x) and y under the global discriminator and the local discriminator of D_T. Likewise, in the reverse direction, a picture y from the unlabeled data set is passed through the generator F to produce a picture F(y), and F(y) is passed through the other generator G to produce a picture G(F(y)). One part of the generator loss is the difference loss between y and G(F(y)); the other part comes from the Siamese network, which pulls the embeddings of y and F(y) closer together and pushes the embeddings of F(y) and the labeled picture x further apart; this part of the loss is the distance loss of the two sample pairs. The discriminator loss is the cross-entropy loss of the generated picture F(y) and x under the global discriminator and the local discriminator of D_S. During training, the discriminators are first held fixed while the generator parameters are trained; then the generators are fixed and the discriminator parameters are trained. This process is repeated so that the generators and discriminators evolve step by step. The overall loss of this process is shown in equation (1):
L = L_Tadv + L_Sadv + L_PTadv + L_PSadv + γ1·L_cyc + γ2·L_ide + γ3·L_con    (1)
L_Tadv and L_Sadv are the losses of the two mirror-symmetric local discriminators D_T and D_S in the SP-PGGAN model; both are cross-entropy losses, as shown in equations (2) and (3). Here x and y denote pictures from the labeled and unlabeled training sets respectively, P_x and P_y denote the distributions that x and y obey, G and D_T denote the generator and local discriminator of the forward direction, and F and D_S denote the generator and local discriminator of the reverse direction. G(x) denotes the image generated from the labeled training image x by the generator G; D_T(G(x)) and D_T(y) denote the judgment results of G(x) and y by the local discriminator D_T. F(y) denotes the picture generated from the unlabeled training image y by the generator F; D_S(x) and D_S(F(y)) denote the judgment results of x and F(y) by the local discriminator D_S.
L_Tadv = E_{y~P_y}[log D_T(y)] + E_{x~P_x}[log(1 - D_T(G(x)))]    (2)
L_Sadv = E_{x~P_x}[log D_S(x)] + E_{y~P_y}[log(1 - D_S(F(y)))]    (3)
L_PTadv and L_PSadv are the losses of the two mirror-symmetric global discriminators in the SP-PGGAN model; their form is the same as that of the local discriminator losses, i.e., L_PTadv has the same form as L_Tadv and L_PSadv the same form as L_Sadv, computed with the global discriminator outputs.
L_cyc is the sum of the losses of the two mirror-symmetric generators in the SP-PGGAN model, i.e., the sum of the Euclidean distances between x and F(G(x)) and between y and G(F(y)), as shown in equation (4). For the forward direction, x denotes a picture in the labeled training set, G(x) the picture obtained after the generator G, and F(G(x)) the picture obtained after the generators G and F in sequence; the Euclidean distance between x and the generated picture F(G(x)) is the generation loss. In the reverse direction, y denotes a picture in the unlabeled training set, F(y) the picture obtained after the generator F, and G(F(y)) the picture obtained after the generators F and G in sequence; the Euclidean distance between y and the generated picture G(F(y)) is the generation loss.
L_cyc = E_{x~P_x}[||F(G(x)) - x||_2] + E_{y~P_y}[||G(F(y)) - y||_2]    (4)
L_ide is the color-consistency loss in the forward and reverse directions, as shown in equation (5). For the forward direction, x denotes a picture in the labeled training set and F(x) the picture generated by the generator F; the Euclidean distance between x and F(x) is the color-consistency loss of the generation process. For the reverse direction, y denotes a picture in the unlabeled training set and G(y) the picture generated by the generator G; the Euclidean distance between y and G(y) is the color-consistency loss of the generation process.
L_ide = E_{x~P_x}[||F(x) - x||_2] + E_{y~P_y}[||G(y) - y||_2]    (5)
L_con is the generation loss of the Siamese network, composed of the Euclidean distance of the positive sample pair and the Euclidean distance of the negative sample pair, as shown in equation (6). x1 and x2 denote the two input pictures of the Siamese network, and i denotes the label of the input pair. When i = 1, (x1, x2) is a positive pair, i.e., two pictures of the same pedestrian, namely x and G(x) in the forward direction and y and F(y) in the reverse direction. When i = 0, (x1, x2) is a negative pair, i.e., two pictures of different pedestrians, namely y and G(x) in the forward direction and x and F(y) in the reverse direction. d denotes the Euclidean distance between the two input pictures, and m ∈ [0, 2] defines the range of the margin m.
L_con(i, x1, x2) = (1 - i)·{max(0, m - d)}^2 + i·d^2    (6)
In equation (1), γ1, γ2 and γ3 are parameters that control the relative importance of L_cyc, L_ide and L_con respectively; their values lie in the range [1, 10].
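The following is a hedged sketch of equations (1) through (6) as TensorFlow code. The helper names, tensor shapes, and the per-sample flattening used for the Euclidean distances are assumptions made for illustration; the patent specifies only the mathematical form.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def adversarial_loss(d_real_logits, d_fake_logits):
    """Cross-entropy GAN loss of eqs. (2)-(3); L_PTadv / L_PSadv have the
    same form, applied to the global head instead of the local head."""
    return (bce(tf.ones_like(d_real_logits), d_real_logits)
            + bce(tf.zeros_like(d_fake_logits), d_fake_logits))

def euclidean(a, b):
    """Mean Euclidean distance between two image batches, flattened per sample."""
    diff = tf.reshape(a - b, [tf.shape(a)[0], -1])
    return tf.reduce_mean(tf.norm(diff, axis=1))

def cycle_loss(x, f_g_x, y, g_f_y):
    """Eq. (4): distances between x and F(G(x)), and between y and G(F(y))."""
    return euclidean(x, f_g_x) + euclidean(y, g_f_y)

def identity_loss(x, f_x, y, g_y):
    """Eq. (5): color consistency between x and F(x), and between y and G(y)."""
    return euclidean(x, f_x) + euclidean(y, g_y)

def contrastive_loss(i, e1, e2, m=2.0):
    """Eq. (6): e1, e2 are Siamese embeddings; i = 1 for a positive pair,
    i = 0 for a negative pair; m is the margin, m in [0, 2]."""
    d = tf.norm(e1 - e2, axis=1)
    return tf.reduce_mean((1.0 - i) * tf.square(tf.maximum(0.0, m - d))
                          + i * tf.square(d))

def total_loss(l_tadv, l_sadv, l_ptadv, l_psadv, l_cyc, l_ide, l_con,
               gamma1=10.0, gamma2=5.0, gamma3=3.0):
    """Eq. (1); the default gamma values are those the text reports using."""
    return (l_tadv + l_sadv + l_ptadv + l_psadv
            + gamma1 * l_cyc + gamma2 * l_ide + gamma3 * l_con)
```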
The invention trains SP-PGGAN using Market-1501 and DukeMTMC-reID as training sets. Market-1501 contains 1,501 pedestrians, with 12,936 training images and 19,732 gallery images; 751 pedestrians are used for training and 750 for testing, and each pedestrian is captured by at most six cameras. DukeMTMC-reID contains 34,183 images of 1,404 pedestrians: 702 pedestrians are used for training and the remaining 702 for testing, with 2,228 query images and 17,661 gallery images.
The invention uses TensorFlow as the framework and Market-1501 and DukeMTMC-reID as training sets to train SP-PGGAN; no ID information is used during training. In all migration experiments, we set γ1 = 10, γ2 = 5, γ3 = 3 in equation (1) and m = 2 in equation (6). The initial learning rate of SP-PGGAN is set to 0.0002, and the model stops training after 5 epochs. In the testing phase, the generator G is used for Market-1501 → DukeMTMC-reID migration and the generator F for DukeMTMC-reID → Market-1501 migration.
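As a hedged illustration of the alternating scheme described above, the sketch below combines the components from the two previous sketches into one training step. The choice of Adam and the exact pairing of Siamese inputs follow the description and common practice, not a verbatim specification from the patent.

```python
import tensorflow as tf

# Components from the earlier sketches, assumed in scope.
G, F = build_generator(), build_generator()        # labeled->unlabeled, unlabeled->labeled
S = build_siamese_branch()                         # shared Siamese branch
D_T, D_S = build_discriminator(), build_discriminator()
gen_opt = tf.keras.optimizers.Adam(2e-4)           # lr 0.0002 per the text; Adam is an assumption
disc_opt = tf.keras.optimizers.Adam(2e-4)

@tf.function
def train_step(x, y):
    # Phase 1: discriminators held fixed, generators + Siamese branch updated.
    with tf.GradientTape() as tape:
        g_x, f_y = G(x, training=True), F(y, training=True)
        f_g_x, g_f_y = F(g_x, training=True), G(f_y, training=True)
        # Generators try to get both heads of D_T / D_S to output "real".
        adv = tf.add_n([bce(tf.ones_like(l), l) for l in D_T(g_x) + D_S(f_y)])
        l_cyc = cycle_loss(x, f_g_x, y, g_f_y)
        l_ide = identity_loss(x, F(x, training=True), y, G(y, training=True))
        # Positive pairs: (x, G(x)), (y, F(y)); negative pairs: (y, G(x)), (x, F(y)).
        l_con = (contrastive_loss(1.0, S(x), S(g_x)) + contrastive_loss(1.0, S(y), S(f_y))
                 + contrastive_loss(0.0, S(y), S(g_x)) + contrastive_loss(0.0, S(x), S(f_y)))
        gen_loss = adv + 10.0 * l_cyc + 5.0 * l_ide + 3.0 * l_con   # gammas from the text
    g_vars = G.trainable_variables + F.trainable_variables + S.trainable_variables
    gen_opt.apply_gradients(zip(tape.gradient(gen_loss, g_vars), g_vars))
    # Phase 2: generators held fixed, discriminators updated (eqs. (2)-(3), both heads).
    with tf.GradientTape() as tape:
        g_x, f_y = G(x, training=False), F(y, training=False)
        disc_loss = (tf.add_n([adversarial_loss(r, f) for r, f in zip(D_T(y), D_T(g_x))])
                     + tf.add_n([adversarial_loss(r, f) for r, f in zip(D_S(x), D_S(f_y))]))
    d_vars = D_T.trainable_variables + D_S.trainable_variables
    disc_opt.apply_gradients(zip(tape.gradient(disc_loss, d_vars), d_vars))
```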
Step 3: Implementation of pedestrian re-identification
After the style migration, the invention carries out further pedestrian re-identification experiments. Since the invention's main contribution lies in the style-migration process, this second stage adopts the basic IDE (ID-discriminative Embedding) method, which can be replaced by any pedestrian re-identification method.
In the pedestrian re-identification experiments, the invention uses the IDE model to train a classification network on the labeled training set obtained after the SP-PGGAN style migration, yielding a trained classification network. The test set of the unlabeled data set is then fed into the trained classification network to perform pedestrian re-identification. As shown in FIG. 5, the style-migrated pictures G(x) are used to train the IDE model, yielding a trained IDE network. During testing, the query pedestrian pictures of the unlabeled data set and the gallery set are fed into the trained IDE network together, and the pictures showing the same pedestrian as the query are retrieved from the gallery, completing the pedestrian re-identification process.
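As an illustration of this retrieval step, a minimal sketch follows. It assumes feature vectors have already been extracted from the trained IDE network (e.g., from its penultimate layer) for the query and gallery images; the function name is hypothetical.

```python
import numpy as np

def rank_gallery(query_feat: np.ndarray, gallery_feats: np.ndarray) -> np.ndarray:
    """Return gallery indices sorted from most to least similar to the query."""
    dists = np.linalg.norm(gallery_feats - query_feat, axis=1)  # Euclidean distance
    return np.argsort(dists)                                    # rank-1 match comes first

# Usage: ranking = rank_gallery(q, g); rank-1 and mAP are computed
# from such rankings over all query images.
```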
As shown in FIG. 5, the IDE in the invention uses ResNet-50 as the baseline model and adjusts the output dimension of the last fully connected layer to the number of pedestrians in the training data set. IDE trains the network as a classification task: the first five stages are convolutional, layers 6 and 7 are fully connected layers of 1,024 neurons each, and layer 8 is the classification layer whose dimension equals the number of pedestrian IDs.
The present invention fine-tunes a ResNet-50 pre-trained on ImageNet on the migrated training set. The output dimension of the final fully connected layer is adjusted to 751 and 702 to apply to Market-1501 and DukeMTMC-reID respectively. In this step of the experiment, the invention trains the CNN model with mini-batch SGD on a 1080 Ti GPU. The batch size, maximum number of epochs, momentum and gamma are set to 16, 50, 0.9 and 0.1 respectively. The initial learning rate is 0.001, decaying to 0.0001 after 40 epochs.
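A hedged sketch of this fine-tuning setup follows. The two 1,024-neuron layers mirror the IDE description above; the input resolution, the data pipeline, and the learning-rate decay callback are assumptions or omissions for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_ide(num_ids=751, img_shape=(256, 128, 3)):
    """IDE classifier: ImageNet-pretrained ResNet-50 backbone, two 1,024-neuron
    FC layers, and a final classification layer sized to the number of IDs
    (751 for Market-1501, 702 for DukeMTMC-reID)."""
    backbone = tf.keras.applications.ResNet50(include_top=False, weights="imagenet",
                                              input_shape=img_shape, pooling="avg")
    x = layers.Dense(1024, activation="relu")(backbone.output)
    x = layers.Dense(1024, activation="relu")(x)
    out = layers.Dense(num_ids, activation="softmax")(x)
    return tf.keras.Model(backbone.input, out)

ide = build_ide(num_ids=751)
sgd = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
ide.compile(optimizer=sgd, loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# Train as a classification task on the migrated labeled training set;
# decay the learning rate to 0.0001 after 40 epochs (e.g., via a scheduler callback).
```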
In the invention, to further improve re-identification on the target data set, a feature-pooling method called local max pooling (LMP) is introduced. It can be applied directly to the trained IDE model and reduces the influence of erroneous pictures produced during migration on the re-identification process. The original ResNet-50 applies global average pooling after the fifth convolutional stage. With LMP, the feature map from the fifth convolutional stage is first divided into P parts along the horizontal direction, then global max or average pooling is applied to each part separately, and finally the pooled results are concatenated. This procedure can be used directly at test time. In the experiments, the invention compares global max pooling with global average pooling and selects the better-performing one for the LMP concatenation.
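A minimal sketch of LMP at test time, assuming NHWC feature maps whose height is divisible by P; the function name and the choice P = 2 are illustrative assumptions.

```python
import tensorflow as tf

def lmp_descriptor(feature_map, parts=2, mode="max"):
    """feature_map: [N, H, W, C] output of ResNet-50's fifth conv stage.
    Split into `parts` horizontal stripes, pool each globally, concatenate."""
    chunks = tf.split(feature_map, num_or_size_splits=parts, axis=1)  # split along height
    pool = tf.reduce_max if mode == "max" else tf.reduce_mean
    pooled = [pool(c, axis=[1, 2]) for c in chunks]   # global pooling per part -> [N, C]
    return tf.concat(pooled, axis=1)                  # [N, parts * C] descriptor
```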
The results of the embodiments of the invention are shown in Table 2.
TABLE 2 (rank-1 and mAP of CycleGAN, SP-PGGAN (m = 2), and SP-PGGAN with LMP on Market-1501 and DukeMTMC-reID; presented as an image in the original publication and not reproduced here)
As can be seen from Table 2, migration with SP-PGGAN (m = 2) is more effective for pedestrian re-identification than migration with CycleGAN: rank-1 and mAP tested on Market-1501 with SP-PGGAN (m = 2) are 11.2% and 6.2% higher than with CycleGAN respectively, and on DukeMTMC-reID they are 6.7% and 3.5% higher respectively. This demonstrates the effectiveness of the proposed SP-PGGAN model, and it also shows that applying LMP on top of SP-PGGAN brings a further improvement in pedestrian re-identification.

Claims (6)

1. A style migration method based on SP-PGGAN, characterized by comprising an SP-PGGAN model used for style migration between a labeled pedestrian re-identification data set and an unlabeled pedestrian re-identification data set, the model structure being an improvement on CycleGAN: CycleGAN consists of two mirror-symmetric GANs forming a ring network, the two GANs sharing a generator G and a generator F and each having a local discriminator, D_X and D_Y respectively; SP-PGGAN retains the generators of CycleGAN and adds a Siamese network after the generators to guide the generation process, and at the same time replaces the local discriminators D_X and D_Y of CycleGAN with parallel local and global discriminators.
2. The SP-PGGAN-based style migration method according to claim 1, wherein: the generator of SP-PGGAN consists of two parts: one part is the generator G and generator F of CycleGAN, each of which, from shallow to deep, contains two convolution layers with stride 2, six residual blocks and two deconvolution layers with stride 1/2; the other part is a Siamese network, which from shallow to deep contains four convolution layers with stride 2, a max-pooling layer with stride 2 and a fully connected layer;
the SP-PGGAN discriminator includes a discriminator D_T and a discriminator D_S, where D_T comprises a local discriminator and a global discriminator, and D_S likewise comprises a local discriminator and a global discriminator; D_T and D_S have the same structure but do not share parameters; within D_T, the global discriminator and the local discriminator share four convolution layers with stride 2 and split into two paths after the fourth layer: one path is the global discriminator of D_T; the other path is the local discriminator of D_T, which outputs a 256-dimensional vector through a fully connected layer.
3. The SP-PGGAN-based style migration method according to claim 1, wherein the training process of the SP-PGGAN comprises the following steps:
for the forward direction, the generation process is that a picture x from the labeled training set generates a picture G(x) through the generator G, and G(x) generates a picture F(G(x)) through the generator F; then the pictures x and G(x), and the picture G(x) and a picture y from the unlabeled training set, are respectively input into the Siamese network, which is used to improve the preservation of the pedestrian ID information of the labeled data during style migration; the discrimination process is that the picture G(x) and the picture y from the unlabeled training set are simultaneously input into the global discriminator and the local discriminator of D_T, where the global discriminator judges whether the whole picture is real or fake and the local discriminator judges whether local patches of the picture are real or fake;
in the reverse direction, the generation process is that the picture y from the unlabeled training set generates a picture F(y) through the generator F, and F(y) generates a picture G(F(y)) through the generator G; then the pictures y and F(y), and the picture F(y) and the picture x from the labeled training set, are respectively input into the Siamese network; the discrimination process is that the picture F(y) and the picture x from the labeled training set are simultaneously input into the global discriminator and the local discriminator of D_S to train the discriminators;
during training, the discriminators are first held fixed while the parameters of the generation process are trained; then the generators are fixed and the parameters of the discriminators are trained; this process is repeated so that the generators and discriminators evolve step by step.
4. The SP-PGGAN-based style migration method according to claim 1, wherein the training process of the SP-PGGAN model is to input the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the SP-PGGAN model simultaneously for style migration, obtaining the migrated training set of the labeled pedestrian re-identification data set; wherein the loss function is:
L = L_Tadv + L_Sadv + L_PTadv + L_PSadv + γ1·L_cyc + γ2·L_ide + γ3·L_con
L_Tadv and L_Sadv are the losses of the two mirror-symmetric local discriminators D_T and D_S in the SP-PGGAN model, both cross-entropy losses; L_PTadv and L_PSadv are the losses of the two mirror-symmetric global discriminators in the SP-PGGAN model, of the same form as the local discriminator losses and likewise cross-entropy losses; L_cyc is the sum of the losses of the two mirror-symmetric generators in the SP-PGGAN model; L_ide denotes the color-consistency loss in the forward and reverse directions; L_con is the generation loss of the Siamese network; γ1, γ2 and γ3 are parameters controlling the relative importance of L_cyc, L_ide and L_con respectively, with values in the range [1, 10].
5. The SP-PGGAN-based style migration method according to claim 4, wherein:
the L_Tadv is calculated as follows:
L_Tadv = E_{y~P_y}[log D_T(y)] + E_{x~P_x}[log(1 - D_T(G(x)))]
wherein G(x) denotes the image generated from the labeled training image x by the generator G, and D_T(G(x)) and D_T(y) denote the judgment results of G(x) and y by the local discriminator D_T;
the L_Sadv is calculated as follows:
L_Sadv = E_{x~P_x}[log D_S(x)] + E_{y~P_y}[log(1 - D_S(F(y)))]
wherein F(y) denotes the picture generated from the unlabeled training image y by the generator F, and D_S(x) and D_S(F(y)) denote the judgment results of x and F(y) by the local discriminator D_S;
the L_cyc is the sum of the Euclidean distances between x and F(G(x)) and between y and G(F(y));
the L_ide is calculated as follows:
L_ide = E_{x~P_x}[||F(x) - x||_2] + E_{y~P_y}[||G(y) - y||_2]
wherein F(x) denotes the picture generated from x by the generator F, and G(y) denotes the picture generated from y by the generator G;
the L_con is calculated as follows:
L_con(i, x1, x2) = (1 - i)·{max(0, m - d)}^2 + i·d^2
wherein x1 and x2 denote the two input pictures of the Siamese network, and i denotes the label of the input pair: when i = 1, (x1, x2) is a positive pair, i.e., two pictures of the same pedestrian, namely x and G(x) in the forward direction and y and F(y) in the reverse direction; when i = 0, (x1, x2) is a negative pair, i.e., two pictures of different pedestrians, namely y and G(x) in the forward direction and x and F(y) in the reverse direction; d denotes the Euclidean distance between the two input pictures, and m ∈ [0, 2] defines the range of the margin.
6. A pedestrian re-identification method based on SP-PGGAN style migration, characterized by:
A. inputting the training set of the labeled pedestrian re-identification data set and the training set of the unlabeled pedestrian re-identification data set into the SP-PGGAN model simultaneously for training, obtaining through the generator G the migrated training set of the labeled pedestrian re-identification data set;
B. training a classification network on the migrated labeled training set with the pedestrian re-identification model IDE to obtain a trained IDE model, and then inputting the test set of the unlabeled pedestrian re-identification data set into the trained IDE model to perform pedestrian re-identification on the unlabeled data set.
CN202010226128.5A 2020-03-26 2020-03-26 Pedestrian re-recognition method based on SP-PGGAN style migration Active CN111428650B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010226128.5A CN111428650B (en) 2020-03-26 2020-03-26 Pedestrian re-recognition method based on SP-PGGAN style migration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010226128.5A CN111428650B (en) 2020-03-26 2020-03-26 Pedestrian re-recognition method based on SP-PGGAN style migration

Publications (2)

Publication Number Publication Date
CN111428650A true CN111428650A (en) 2020-07-17
CN111428650B CN111428650B (en) 2024-04-02

Family

ID=71548862

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010226128.5A Active CN111428650B (en) 2020-03-26 2020-03-26 Pedestrian re-recognition method based on SP-PGGAN style migration

Country Status (1)

Country Link
CN (1) CN111428650B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112232422A (en) * 2020-10-20 2021-01-15 北京大学 Target pedestrian re-identification method and device, electronic equipment and storage medium
CN113569627A (en) * 2021-06-11 2021-10-29 北京旷视科技有限公司 Human body posture prediction model training method, human body posture prediction method and device
CN113658178A (en) * 2021-10-14 2021-11-16 北京字节跳动网络技术有限公司 Tissue image identification method and device, readable medium and electronic equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109670528A (en) * 2018-11-14 2019-04-23 中国矿业大学 The data extending method for blocking strategy at random based on paired samples towards pedestrian's weight identification mission
CN110163110A (en) * 2019-04-23 2019-08-23 中电科大数据研究院有限公司 A kind of pedestrian's recognition methods again merged based on transfer learning and depth characteristic

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SATOSHI IIZUKA 等: "Globally and Locally Consistent Image Completion" *
WEIJIAN DENG 等: "Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification" *


Also Published As

Publication number Publication date
CN111428650B (en) 2024-04-02

Similar Documents

Publication Publication Date Title
CN109948425B (en) Pedestrian searching method and device for structure-aware self-attention and online instance aggregation matching
Liu et al. Leveraging unlabeled data for crowd counting by learning to rank
CN108537136A (en) The pedestrian's recognition methods again generated based on posture normalized image
CN112396027A (en) Vehicle weight recognition method based on graph convolution neural network
CN111428650A (en) Pedestrian re-identification method based on SP-PGGAN style migration
Zhang et al. Dual mutual learning for cross-modality person re-identification
CN111625667A (en) Three-dimensional model cross-domain retrieval method and system based on complex background image
Wang et al. Face anti-spoofing using transformers with relation-aware mechanism
CN110390294A (en) Target tracking method based on bidirectional long-short term memory neural network
CN108154133A (en) Human face portrait based on asymmetric combination learning-photo array method
Jiang et al. Application of a fast RCNN based on upper and lower layers in face recognition
Zhu et al. Expression recognition method combining convolutional features and Transformer.
Wu et al. Transformer fusion and pixel-level contrastive learning for RGB-D salient object detection
Song et al. Dense face network: A dense face detector based on global context and visual attention mechanism
Qian et al. URRNet: A Unified Relational Reasoning Network for Vehicle Re-Identification
Tian et al. Domain adaptive object detection with model-agnostic knowledge transferring
CN112446305A (en) Pedestrian re-identification method based on classification weight equidistant distribution loss model
CN117011883A (en) Pedestrian re-recognition method based on pyramid convolution and transducer double branches
Gong et al. Person re-identification based on two-stream network with attention and pose features
Zhu et al. Road scene layout reconstruction based on CNN and its application in traffic simulation
Ling et al. Magnetic tile surface defect detection methodology based on self-attention and self-supervised learning
CN114140524A (en) Closed loop detection system and method for multi-scale feature fusion
Wang et al. Learning a layout transfer network for context aware object detection
Radulescu et al. Modeling 3D convolution architecture for actions recognition
Wu et al. Spatial-Temporal Hypergraph Based on Dual-Stage Attention Network for Multi-View Data Lightweight Action Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant