CN112270228A - Pedestrian re-identification method based on DCCA fusion characteristics - Google Patents
Pedestrian re-identification method based on DCCA fusion characteristics
- Publication number: CN112270228A (application CN202011109621.5A)
- Authority: CN (China)
- Prior art keywords: neural network, convolutional neural, pedestrian, deep convolutional, network model
- Prior art date: 2020-10-16
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/32—Normalisation of the pattern dimensions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
Abstract
The invention discloses a pedestrian re-identification method based on DCCA fusion features, implemented according to the following steps: preprocess the pedestrian re-identification data set and resize the images to suitable dimensions; extract depth features from the processed pedestrian data set with a vgg16 deep convolutional neural network and an omni-scale deep convolutional neural network, respectively; perform canonical correlation analysis on the extracted depth features, solve the respective projection matrices, and fuse the projected features according to a feature fusion strategy; and complete the whole pedestrian re-identification process with the fused features. The method combines the advantages of the vgg16 and omni-scale deep networks, improves feature robustness, effectively eliminates redundant information while fusing the features, strengthens feature discrimination, and raises the accuracy of pedestrian re-identification.
Description
Technical Field
The invention belongs to the technical field of computer vision and relates to a pedestrian re-identification method based on DCCA fusion features.
Background
Pedestrian re-identification has become a very popular research topic in computer vision in recent years. It can be regarded as a sub-problem of image retrieval: the task of retrieving a specific pedestrian across images or videos using computer vision techniques. That is, given an image of a pedestrian of interest, the goal is to find that pedestrian again across non-overlapping surveillance cameras. The task is applied in fields such as intelligent surveillance and criminal investigation.
Traditional methods mainly approach pedestrian re-identification from two directions: 1. designing hand-crafted features, extracting robust features to characterize pedestrians; 2. learning a better distance metric, whose purpose is to measure the similarity between two images. In application, on top of the feature representation, a discriminative distance metric is learned so that the similarity between pedestrian images can be judged from the similarity between their features, making the distance between images of the same pedestrian as small as possible and the distance between images of different pedestrians as large as possible. With the advent of deep learning, algorithms represented by the convolutional neural network (CNN) have excelled in computer vision, notably in the well-known ImageNet image classification challenge, demonstrating that deep neural networks achieve strong recognition performance. A convolutional neural network can automatically attend to important regions of the input image and extract features at different network layers, which are more expressive than traditional hand-crafted features. Existing deep-learning-based methods generally complete the pedestrian re-identification process end to end, performing automatic feature extraction and feature similarity matching in one pass; however, because of this end-to-end design, feature redundancy and feature dimensionality may be high.
Disclosure of Invention
The invention aims to provide a pedestrian re-identification method based on DCCA fusion features, which combines the advantages of the vgg16 and omni-scale networks, improves feature robustness, and eliminates redundant information to a certain extent while fusing the features, thereby improving feature discrimination and the accuracy of pedestrian re-identification.
The technical scheme adopted by the invention is a pedestrian re-identification method based on DCCA fusion features, implemented according to the following steps:
step 1, preprocessing the pedestrian re-identification data set and resizing the images to suitable dimensions;
step 2, extracting depth features from the processed pedestrian data set with a vgg16 deep convolutional neural network and an omni-scale deep convolutional neural network, respectively;
step 3, performing canonical correlation analysis on the extracted depth features, solving the respective projection matrices, and fusing the projected features according to a feature fusion strategy;
step 4, completing the whole pedestrian re-identification process with the fused features.
The present invention is also characterized in that,
the pedestrian re-identification data set is a marker 1501 data set, the data set is divided into a training set train and a test set, and the test set comprises a query set probe and a candidate set galery.
The pedestrian re-recognition images in the training set train and the test set are each adjusted to a size of 224 × 224 pixels and a size of 256 × 128 pixels, respectively.
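For illustration, the two input pipelines can be sketched as follows with torchvision (the library choice and the example file name are assumptions; the patent does not prescribe an implementation):

```python
from PIL import Image
from torchvision import transforms

# Two input pipelines, one per backbone: 224 x 224 for vgg16 and
# 256 x 128 (height x width) for the omni-scale network.
resize_vgg = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
resize_osnet = transforms.Compose([
    transforms.Resize((256, 128)),
    transforms.ToTensor(),
])

img = Image.open("0001_c1s1_000151_00.jpg")  # example Market-1501 file name
x_vgg = resize_vgg(img)    # tensor of shape (3, 224, 224)
x_os = resize_osnet(img)   # tensor of shape (3, 256, 128)
```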
The step 2 specifically comprises the following steps:
step 2.1, constructing the vgg16 deep convolutional neural network model, which comprises thirteen sequentially connected convolutional layers, the output of the last convolutional layer being followed by three fully connected layers;
step 2.2, constructing the omni-scale deep convolutional neural network model, which comprises five sequentially connected convolutional layers, the output of the last convolutional layer being followed by two transition layers and the output of the last transition layer by a fully connected layer;
step 2.3, transferring pre-trained weight parameters, obtained by training on the ImageNet data set, to the vgg16 deep convolutional neural network model constructed in step 2.1 and the omni-scale deep convolutional neural network model constructed in step 2.2, respectively;
step 2.4, inputting the training set with image size 224 × 224 into the vgg16 deep convolutional neural network model and the training set with image size 256 × 128 into the omni-scale deep convolutional neural network model processed in step 2.3 and training the two models, initializing part of the weight parameters of the vgg16 model constructed in step 2.1 and the omni-scale model constructed in step 2.2 with the pre-trained weight parameters during training, and then extracting the finally output depth features of the vgg16 and omni-scale deep convolutional neural network models, denoted H1 and H2 respectively, where H1 ∈ R^(o×m), H2 ∈ R^(o×m), o is the dimension of the two features, m is the number of samples in the data set, and R is the set of real numbers.
In step 2.4, initializing part of the weight parameters of the vgg16 deep convolutional neural network model constructed in step 2.1 and the omni-scale deep convolutional neural network model constructed in step 2.2 with the pre-trained weight parameters specifically means: applying the pre-trained weight parameters layer by layer to initialize the first thirteen layers of the vgg16 deep convolutional neural network model and then assigning random values to the weight parameters of the last three fully connected layers; and applying the pre-trained weight parameters layer by layer to initialize the first seven layers of the omni-scale deep convolutional neural network model and then assigning random values to the weight parameters of the last fully connected layer.
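A minimal sketch of this partial initialization, assuming the torchvision vgg16 with ImageNet weights stands in for the pre-trained model (the standard deviation of the random initialization is an assumption):

```python
import torch.nn as nn
from torchvision.models import vgg16

# ImageNet weights initialize the thirteen convolutional layers; the three
# fully connected layers are given fresh random values, and the output
# layer is resized to the 751 training identities of Market-1501.
model = vgg16(pretrained=True)            # conv layers keep pre-trained weights
for layer in model.classifier:            # the three fully connected layers
    if isinstance(layer, nn.Linear):
        nn.init.normal_(layer.weight, std=0.01)   # random re-initialization
        nn.init.zeros_(layer.bias)
model.classifier[6] = nn.Linear(4096, 751)        # classification head: 751 classes
```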
The step 3 specifically comprises the following steps:
step 3.1, standardizing H1 and H2 to obtain standard data with a mean of 0 and a variance of 1;
step 3.2, computing the covariance matrices S11 and S22 of the standardized H1 and H2 and their cross-covariance matrix S12;
step 3.3, constructing the matrix T = S11^(-1/2) S12 S22^(-1/2);
step 3.4, performing singular value decomposition on the matrix T to obtain the maximum singular value ρ and the left and right singular vectors u, v corresponding to the maximum singular value;
step 3.5, calculating the projection matrices A1 = u^T S11^(-1/2) and A2 = v^T S22^(-1/2) of H1 and H2, the representations of the two depth features in the correlated subspace then being H'1 = A1H1 and H'2 = A2H2 (retaining the top r singular-vector pairs of T yields r-dimensional projected features).
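A NumPy sketch of steps 3.1 to 3.5 under the stated convention H1, H2 ∈ R^(o×m) (the ridge term eps and the retained dimension r are assumptions, added for numerical stability and illustration):

```python
import numpy as np

def cca_projections(H1, H2, r=64, eps=1e-8):
    """Steps 3.1-3.5: standardize, build T, SVD, project.
    H1, H2: (o, m) depth-feature matrices, samples in columns."""
    m = H1.shape[1]
    # 3.1: standardize each feature dimension to mean 0, variance 1
    H1 = (H1 - H1.mean(1, keepdims=True)) / (H1.std(1, keepdims=True) + eps)
    H2 = (H2 - H2.mean(1, keepdims=True)) / (H2.std(1, keepdims=True) + eps)
    # 3.2: covariance and cross-covariance matrices (eps ridge for stability)
    S11 = H1 @ H1.T / m + eps * np.eye(H1.shape[0])
    S22 = H2 @ H2.T / m + eps * np.eye(H2.shape[0])
    S12 = H1 @ H2.T / m
    # 3.3: T = S11^(-1/2) S12 S22^(-1/2), via symmetric eigendecomposition
    def inv_sqrt(S):
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    S11_is, S22_is = inv_sqrt(S11), inv_sqrt(S22)
    T = S11_is @ S12 @ S22_is
    # 3.4: SVD; the largest singular value is the maximum correlation rho
    U, s, Vt = np.linalg.svd(T)
    # 3.5: projection matrices from the top-r singular-vector pairs
    A1 = U[:, :r].T @ S11_is            # (r, o)
    A2 = Vt[:r] @ S22_is                # (r, o)
    return A1 @ H1, A2 @ H2, s[0]       # projected features and rho
```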
The step 4 specifically comprises the following steps:
step 4.1, inputting the images of the query set (probe) and the candidate set (gallery) into the vgg16 deep convolutional neural network model and the omni-scale deep convolutional neural network model trained as in step 2.4, extracting their depth features, and then calculating the fusion features corresponding to the probe and gallery sets according to step 3, their dimension being r;
step 4.2, taking the fusion features obtained from the training set (train) and the corresponding training sample labels as input to the XQDA algorithm, whose output is a subspace mapping matrix W and a measurement matrix M = Σ'I^(-1) - Σ'E^(-1), where Σ'I is the intra-class covariance matrix and Σ'E is the inter-class covariance matrix;
step 4.3, for the pedestrian image represented by each fusion feature of the query set (probe), performing feature similarity measurement against all fusion features of the candidate set (gallery) to obtain a similarity ranking result, the ranking being determined by similarity: the higher the similarity, the earlier the rank, and the recognition is completed.
In step 4.3, the similarity measurement adopts the Mahalanobis distance: the fusion features corresponding to the query set (probe) and the candidate set (gallery) are respectively mapped with the subspace mapping matrix W and scored with the measurement matrix M, yielding the Mahalanobis distance between the corresponding probe and gallery features in the subspace; the smaller the distance, the higher the similarity.
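Assuming W and M have already been learned by XQDA, the ranking of step 4.3 reduces to a bilinear Mahalanobis-style score in the subspace; a sketch:

```python
import numpy as np

def rank_gallery(F_probe, F_gallery, W, M):
    """F_probe: (r, p) and F_gallery: (r, g) fusion features;
    W: (r, d) subspace mapping; M: (d, d) metric kernel from XQDA.
    Returns gallery indices sorted by ascending distance, plus distances."""
    P = W.T @ F_probe       # (d, p) probe features in the learned subspace
    G = W.T @ F_gallery     # (d, g) gallery features in the learned subspace
    dists = np.empty((P.shape[1], G.shape[1]))
    for i in range(P.shape[1]):
        diff = G - P[:, [i]]                                 # (d, g)
        dists[i] = np.einsum('dg,de,eg->g', diff, M, diff)   # (x-y)^T M (x-y)
    return np.argsort(dists, axis=1), dists   # smaller distance = more similar
```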
The invention has the beneficial effects that:
the method is based on a fusion characteristic pedestrian re-identification method and CCA (typical correlation analysis), combines the advantages of vgg16 and the omni-scale deep network, and improves the characteristic robustness. Meanwhile, maximum correlation analysis is carried out on the two depths by using a DCCA (depth canonical correlation analysis) algorithm, and finally a feature fusion strategy is selected to fuse the two features. The method analyzes the maximum correlation of the features in different spaces in the public subspace, takes the maximum correlation feature between the two features as the discrimination information, effectively eliminates redundant information while fusing the features, improves the feature discrimination capability, and can improve the accuracy of pedestrian re-identification to a certain extent.
Drawings
FIG. 1 is a flowchart of the pedestrian re-identification method based on DCCA fusion features of the present invention;
FIG. 2 is a process diagram of the vgg16 network extracting pedestrian features in the pedestrian re-identification method based on DCCA fusion features of the present invention;
FIG. 3 is a process diagram of the omni-scale network extracting pedestrian features in the pedestrian re-identification method based on DCCA fusion features of the present invention;
FIG. 4 is a schematic structural diagram of a bottleneck block in the embodiment of the present invention;
FIG. 5 is a schematic structural diagram of the Lite 3 × 3 unit in the omni-scale deep convolutional neural network model in the embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention relates to a pedestrian re-identification method based on DCCA fusion features; its flow is shown in FIG. 1, and it is implemented according to the following steps:
step 1, preprocessing the pedestrian re-identification data set and resizing the images to suitable dimensions; the Market-1501 data set is selected as the pedestrian re-identification data set and divided into a training set (train) and a test set, the test set comprising a query set (probe) and a candidate set (gallery); the pedestrian images in the training and test sets are resized to 224 × 224 pixels for the vgg16 network and 256 × 128 pixels for the omni-scale network;
step 2, extracting depth features from the processed pedestrian data set with the vgg16 deep convolutional neural network and the omni-scale deep convolutional neural network, respectively; specifically:
as shown in FIG. 2, step 2.1, constructing the vgg16 deep convolutional neural network model, which comprises thirteen sequentially connected convolutional layers, the output of the last convolutional layer being followed by three fully connected layers;
as shown in FIG. 3, step 2.2, constructing the omni-scale deep convolutional neural network model, which comprises five sequentially connected convolutional layers, the output of the last convolutional layer being followed by two transition layers and the output of the last transition layer by a fully connected layer;
step 2.3, transferring pre-trained weight parameters, obtained by training on the ImageNet data set, to the vgg16 deep convolutional neural network model constructed in step 2.1 and the omni-scale deep convolutional neural network model constructed in step 2.2, respectively;
step 2.4, inputting the training set with image size 224 × 224 into the vgg16 deep convolutional neural network model and the training set with image size 256 × 128 into the omni-scale deep convolutional neural network model processed in step 2.3 and training the two models, initializing part of the weight parameters of the vgg16 model constructed in step 2.1 and the omni-scale model constructed in step 2.2 with the pre-trained weight parameters during training, and then extracting the finally output depth features of the two models, denoted H1 and H2 respectively, where H1 ∈ R^(o×m), H2 ∈ R^(o×m), o is the dimension of the two features, m is the number of samples in the data set, and R is the set of real numbers.
In step 2.4, initializing part of the weight parameters with the pre-trained weight parameters specifically means: applying the pre-trained weight parameters layer by layer to initialize the first thirteen layers of the vgg16 deep convolutional neural network model and then assigning random values to the weight parameters of the last three fully connected layers; and applying the pre-trained weight parameters layer by layer to initialize the first seven layers of the omni-scale deep convolutional neural network model and then assigning random values to the weight parameters of the last fully connected layer;
step 3, performing canonical correlation analysis on the extracted depth features, solving the respective projection matrices, and fusing the projected features according to a feature fusion strategy; specifically:
step 3.1, standardizing H1 and H2 to obtain standard data with a mean of 0 and a variance of 1;
step 3.2, computing the covariance matrices S11 and S22 of the standardized H1 and H2 and their cross-covariance matrix S12;
step 3.3, constructing the matrix T = S11^(-1/2) S12 S22^(-1/2);
step 3.4, performing singular value decomposition on the matrix T to obtain the maximum singular value ρ and the left and right singular vectors u, v corresponding to the maximum singular value;
step 3.5, calculating the projection matrices A1 = u^T S11^(-1/2) and A2 = v^T S22^(-1/2) of H1 and H2, the representations of the two depth features in the correlated subspace then being H'1 = A1H1 and H'2 = A2H2 (retaining the top r singular-vector pairs of T yields r-dimensional projected features);
step 4, completing the whole pedestrian re-identification process with the fused features; specifically:
step 4.1, inputting the images of the query set (probe) and the candidate set (gallery) into the vgg16 deep convolutional neural network model and the omni-scale deep convolutional neural network model trained as in step 2.4, extracting their depth features, and then calculating the fusion features corresponding to the probe and gallery sets according to step 3, their dimension being r;
step 4.2, taking the fusion features obtained from the training set (train) and the corresponding training sample labels as input to the XQDA algorithm, whose output is a subspace mapping matrix W and a measurement matrix M = Σ'I^(-1) - Σ'E^(-1), where Σ'I is the intra-class covariance matrix and Σ'E is the inter-class covariance matrix;
step 4.3, for the pedestrian image represented by each fusion feature of the query set (probe), performing feature similarity measurement against all fusion features of the candidate set (gallery) to obtain a similarity ranking result, the ranking being determined by similarity: the higher the similarity, the earlier the rank, and the recognition is completed; the similarity measurement adopts the Mahalanobis distance: the fusion features corresponding to the probe and gallery sets are respectively mapped with the subspace mapping matrix W and scored with the measurement matrix M, yielding the Mahalanobis distance between the corresponding probe and gallery features in the subspace; the smaller the distance, the higher the similarity.
Examples
The invention, a pedestrian re-identification method based on DCCA fusion features, is implemented according to the following steps:
step 1, preprocessing the pedestrian re-identification data set and resizing the images to suitable dimensions; the Market-1501 data set is selected and divided into a training set (train) and a test set, the test set comprising a query set (probe) and a candidate set (gallery); the pedestrian images in the training and test sets are resized to 224 × 224 pixels and 256 × 128 pixels, respectively;
The pedestrian re-identification data set Market-1501 was captured in summer at Tsinghua University by 6 cameras (5 high-definition and 1 low-definition). It contains 1501 pedestrians and 32668 detected bounding boxes; each pedestrian is captured by at least 2 cameras and may have multiple images under a single camera. The training set contains 751 identities and the test set 750. Each image is scaled to 128 × 48 pixels, and the resolution is then adjusted to meet the input requirements of the deep networks used: for the vgg16 network the images are resized to 224 × 224, and for the omni-scale network to 256 × 128. In total the data set contains the 1501 pedestrians and 32668 images, divided into train (751 identities) and test (750 identities); train contains 12936 images, and test comprises probe and gallery, where probe contains 3368 images and gallery 19732 images;
step 2, extracting depth features from the processed pedestrian data set with the vgg16 deep convolutional neural network and the omni-scale deep convolutional neural network, respectively; specifically:
step 2.1, constructing the vgg16 deep convolutional neural network model, which comprises thirteen sequentially connected convolutional layers, the output of the last convolutional layer being followed by three fully connected layers; as shown in FIG. 2, the feature map sizes after the thirteen convolutional layers and three fully connected layers are: 224 × 224 × 64, 112 × 112 × 128, 56 × 56 × 256, 28 × 28 × 512, 14 × 14 × 512, 7 × 7 × 512, 1 × 1 × 4096 and 1 × 1 × 751. The specific network structure parameters are set as follows: the first and second convolutional layers have 3 × 3 × 64 kernels, stride 1 × 1 and 'same' padding, the second being followed by max pooling with a 2 × 2 window; the third and fourth convolutional layers have 3 × 3 × 128 kernels, stride 1 × 1 and 'same' padding, the fourth being followed by max pooling with a 2 × 2 window; the fifth, sixth and seventh convolutional layers have 3 × 3 × 256 kernels, stride 1 × 1 and 'same' padding, the seventh being followed by max pooling with a 2 × 2 window; the eighth, ninth and tenth convolutional layers have 3 × 3 × 512 kernels, stride 1 × 1 and 'same' padding, the tenth being followed by max pooling with a 2 × 2 window; the eleventh, twelfth and thirteenth convolutional layers have 3 × 3 × 512 kernels, stride 1 × 1 and 'same' padding, the thirteenth being followed by max pooling with a 2 × 2 window. The feature map of the thirteenth layer is then flattened and fed into the fully connected layers: the fourteenth (fully connected) layer passes through a Dropout layer that randomly disconnects input neurons with probability 0.5 during parameter updates to prevent overfitting and outputs 4096 neurons; the fifteenth (fully connected) layer likewise passes through a Dropout layer that randomly disconnects input neurons with probability 0.25 and outputs 4096 neurons; and the sixteenth fully connected layer outputs 751 neurons according to the number of classes in the data set;
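The layer settings above follow the standard vgg16 configuration; a compact sketch that builds the thirteen convolutional layers from that channel plan (batch normalization is omitted, as in the original vgg16):

```python
import torch.nn as nn

# Channel plan of the thirteen 3 x 3 convolutions; 'M' marks a 2 x 2 max
# pool, matching the per-layer settings listed above.
CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
       512, 512, 512, 'M', 512, 512, 512, 'M']

def make_vgg16_features(in_ch=3):
    layers = []
    for v in CFG:
        if v == 'M':
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        else:
            layers += [nn.Conv2d(in_ch, v, kernel_size=3, padding=1),  # 'same' padding
                       nn.ReLU(inplace=True)]
            in_ch = v
    return nn.Sequential(*layers)

features = make_vgg16_features()   # maps 3 x 224 x 224 to 512 x 7 x 7
```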
step 2.2, constructing the omni-scale deep convolutional neural network model, which comprises five sequentially connected convolutional layers, the output of the last convolutional layer being followed by two transition layers and the output of the last transition layer by a fully connected layer; as shown in FIG. 3, the feature map sizes after the convolutional, transition and fully connected layers are: 128 × 64 × 64, 64 × 32 × 256, 32 × 16 × 384, 16 × 8 × 512 and 1 × 1 × 512. The specific network structure parameters are set as follows: the first convolutional layer has 7 × 7 × 64 kernels with stride 2 × 2, followed by max pooling with a 2 × 2 window; the second convolutional layer comprises two bottleneck block structures, where the bottleneck structure, shown in FIG. 4, adopts an improved residual block with 4 convolutional stream branches, and its Lite 3 × 3 unit, shown in FIG. 5, is a depthwise separable convolution that improves on the standard convolution by factorizing the 3 × 3 convolution into a 1 × 1 pointwise convolution and a 3 × 3 depthwise convolution, reducing the parameters the network must update; the third layer, a transition layer, comprises a convolutional layer with 1 × 1 × 256 kernels and stride 1 × 1 followed by average pooling with a 2 × 2 window and stride 2 × 2; the fourth convolutional layer comprises two bottleneck structures; the fifth layer, a transition layer, comprises a convolutional layer with 1 × 1 × 256 kernels and stride 1 × 1 followed by average pooling with a 2 × 2 window and stride 2 × 2; the sixth convolutional layer comprises two bottleneck structures; the seventh convolutional layer has 1 × 1 × 512 kernels with stride 1 × 1; and the final fully connected layer outputs 751 neurons according to the number of classes in the data set;
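A sketch of the Lite 3 × 3 unit as described, factorizing a standard 3 × 3 convolution into a 1 × 1 pointwise and a 3 × 3 depthwise convolution (the pointwise-then-depthwise order and the batch-normalization placement are assumptions following the omni-scale network literature):

```python
import torch.nn as nn

class Lite3x3(nn.Module):
    """1 x 1 pointwise convolution followed by a 3 x 3 depthwise
    convolution, replacing a standard 3 x 3 convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.depthwise = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1,
                                   groups=out_ch, bias=False)  # one filter per channel
        self.bn = nn.BatchNorm2d(out_ch)    # placement assumed
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.depthwise(self.pointwise(x))))
```

For in_ch input and out_ch output channels this unit needs in_ch·out_ch + 9·out_ch weights instead of the 9·in_ch·out_ch of a standard 3 × 3 convolution, which is the parameter saving the paragraph above refers to.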
step 2.3, transferring pre-trained weight parameters, obtained by training on the ImageNet data set, to the vgg16 deep convolutional neural network model constructed in step 2.1 and the omni-scale deep convolutional neural network model constructed in step 2.2, respectively; transfer learning gives the network models good initial parameters, accelerating network convergence and improving generalization;
step 2.4, inputting the training set with image size 224 × 224 into the vgg16 deep convolutional neural network model and the training set with image size 256 × 128 into the omni-scale deep convolutional neural network model processed in step 2.3 and training the two models; during training, part of the weight parameters of the vgg16 model constructed in step 2.1 and the omni-scale model constructed in step 2.2 are initialized with the pre-trained weight parameters, namely: the pre-trained weight parameters are applied layer by layer to initialize the first thirteen layers of the vgg16 deep convolutional neural network model, and the weight parameters of the last three fully connected layers are assigned random values; the pre-trained weight parameters are applied layer by layer to initialize the first seven layers of the omni-scale deep convolutional neural network model, and the weight parameters of the last fully connected layer are assigned random values; finally, the output depth features of the two models are extracted and denoted H1 and H2, where H1 ∈ R^(o×m), H2 ∈ R^(o×m), o is the dimension of the two features, m is the number of samples in the data set, and R is the set of real numbers; both outputs are 751-dimensional;
step 3, performing canonical correlation analysis on the extracted depth features, solving the respective projection matrices, and fusing the projected features according to a feature fusion strategy; specifically:
step 3.1, standardizing H1 and H2 to obtain standard data with a mean of 0 and a variance of 1;
step 3.2, computing the covariance matrices S11 and S22 of the standardized H1 and H2 and their cross-covariance matrix S12;
step 3.3, constructing the matrix T = S11^(-1/2) S12 S22^(-1/2);
step 3.4, performing singular value decomposition on the matrix T to obtain the maximum singular value ρ and the left and right singular vectors u, v corresponding to the maximum singular value;
step 3.5, calculating the projection matrices A1 = u^T S11^(-1/2) and A2 = v^T S22^(-1/2) of H1 and H2, the representations of the two depth features in the correlated subspace then being H'1 = A1H1 and H'2 = A2H2 (retaining the top r singular-vector pairs of T yields r-dimensional projected features);
step 4, completing the whole pedestrian re-identification process with the fused features; specifically:
step 4.1, inputting the images of the query set (probe) and the candidate set (gallery) into the vgg16 deep convolutional neural network model and the omni-scale deep convolutional neural network model trained as in step 2.4, extracting their depth features, and then calculating the fusion features corresponding to the probe and gallery sets according to step 3, their dimension being r;
step 4.2, taking the fusion features obtained from the training set (train) and the corresponding training sample labels as input to the XQDA algorithm, whose output is a subspace mapping matrix W and a measurement matrix M = Σ'I^(-1) - Σ'E^(-1), where Σ'I is the intra-class covariance matrix and Σ'E is the inter-class covariance matrix;
step 4.3, for the pedestrian image represented by each fusion feature of the query set (probe), performing feature similarity measurement against all fusion features of the candidate set (gallery) to obtain a similarity ranking result, the ranking being determined by similarity: the higher the similarity, the earlier the rank, and the recognition is completed; the similarity measurement adopts the Mahalanobis distance: the fusion features corresponding to the probe and gallery sets are respectively mapped with the subspace mapping matrix W and scored with the measurement matrix M, yielding the Mahalanobis distance between the corresponding probe and gallery features in the subspace; the smaller the distance, the higher the similarity.
The evaluation uses CMC curves, with rank-1, rank-5, rank-10 and rank-20 as evaluation indices; the rank-1 value is particularly important when evaluating pedestrian re-identification performance.
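A sketch of how the rank-k indices can be read off a distance ranking (assuming every probe identity appears at least once in the gallery; the cross-camera filtering of the standard Market-1501 protocol is omitted for brevity):

```python
import numpy as np

def cmc_scores(sorted_idx, probe_ids, gallery_ids, ks=(1, 5, 10, 20)):
    """sorted_idx: (p, g) gallery indices per probe, best match first
    (e.g. from rank_gallery above). Assumes each probe identity
    appears somewhere in the gallery."""
    matches = gallery_ids[sorted_idx] == probe_ids[:, None]  # (p, g) booleans
    first_hit = matches.argmax(axis=1)   # rank position of first correct match
    return {f"rank{k}": float((first_hit < k).mean()) for k in ks}
```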
The derivation process of step 3 is as follows:
CCA (canonical correlation analysis) is used to solve for the projection matrices of H1 and H2. Let the projection matrices of H1 and H2 be A1 and A2 respectively; their representations in the subspace are H'1 = A1H1 and H'2 = A2H2, and their correlation coefficient can be expressed as:
ρ = Cov(A1H1, A2H2) / sqrt(Var(A1H1) · Var(A2H2))
The objective function is:
(A1, A2) = argmax ρ
i.e., solving for the mapping matrices A1 and A2 corresponding to the maximum correlation coefficient.
Before projection, the raw data is first standardized to obtain data with a mean of 0 and a variance of 1, such that:
Cov(H1, H2) = E[H1 H2^T], Var(H1) = E[H1 H1^T], and likewise Var(H2) = E[H2 H2^T], where H1 and H2 represent the two network depth features, Cov represents the covariance matrix, E represents the expectation, and Var represents the variance matrix.
Since the means of H1 and H2 are both 0, writing S12 = E[H1 H2^T], S11 = E[H1 H1^T] and S22 = E[H2 H2^T]:
ρ = (A1 S12 A2^T) / sqrt((A1 S11 A1^T)(A2 S22 A2^T))
Since multiplying the numerator and denominator by the same factor leaves the optimization result unchanged, an optimization method similar to that of the SVM is adopted: fix the denominator and optimize the numerator, namely:
max A1 S12 A2^T  s.t.  A1 S11 A1^T = 1, A2 S22 A2^T = 1    (6)
To solve the objective function in (6), SVD (singular value decomposition) may be employed. Substituting u = S11^(1/2) A1^T and v = S22^(1/2) A2^T, the objective function becomes:
max u^T T v
s.t. u^T u = 1, v^T v = 1
For this objective function, let the matrix T = S11^(-1/2) S12 S22^(-1/2). In this case, u and v may be regarded as the left and right singular vectors corresponding to one singular value of the matrix T, and singular value decomposition gives T = U Σ V^T, where U and V are the matrices formed by the left and right singular vectors of T respectively, and Σ is the diagonal matrix formed by the singular values of T. Since all columns of U and V are orthonormal bases, U^T u and V^T v each yield a vector with a single entry equal to 1 and the remaining entries 0. Maximizing u^T T v therefore selects a single singular value, and the maximum of the optimization target is the maximum singular value obtained from the decomposition, i.e., the maximum correlation coefficient between H1 and H2. Using the corresponding left and right singular vectors, the projection matrices A1 and A2 of H1 and H2 are respectively:
A1 = u^T S11^(-1/2), A2 = v^T S22^(-1/2)
With the projection matrices A1 and A2, the features of DCCA (deep canonical correlation analysis) are fused according to a feature fusion strategy; the specific fusion modes include the following two: concatenation of the projected features,
F1 = [A1H1; A2H2]
and their sum,
F2 = H'1 + H'2 = A1H1 + A2H2.
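A sketch of the two fusion strategies applied to the projected features H'1 = A1H1 and H'2 = A2H2 (treating F1 as concatenation follows common CCA-fusion practice and is an assumption):

```python
import numpy as np

def fuse(H1p, H2p, mode="sum"):
    """H1p = A1 @ H1 and H2p = A2 @ H2: projected (r, m) features.
    'concat' stacks the two views (assumed form of F1); 'sum' adds
    them, giving F2 = A1 H1 + A2 H2."""
    if mode == "concat":
        return np.vstack([H1p, H2p])   # F1: (2r, m)
    return H1p + H2p                   # F2: (r, m)
```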
The method, a fusion-feature pedestrian re-identification method based on CCA (canonical correlation analysis), improves feature robustness by combining the advantages of the vgg16 and omni-scale deep networks, uses the DCCA (deep canonical correlation analysis) algorithm to perform a maximum-correlation analysis on the two depth features, and finally selects a feature fusion strategy to fuse the two features.
Claims (8)
1. A pedestrian re-identification method based on DCCA fusion features is characterized by comprising the following steps:
step 1, preprocessing a pedestrian re-identification data set and resizing the images to suitable dimensions;
step 2, extracting depth features from the processed pedestrian data set with a vgg16 deep convolutional neural network and an omni-scale deep convolutional neural network, respectively;
step 3, performing canonical correlation analysis on the extracted depth features, solving the respective projection matrices, and fusing the projected features according to a feature fusion strategy;
step 4, completing the whole pedestrian re-identification process with the fused features.
2. The method as claimed in claim 1, characterized in that the pedestrian re-identification data set is the Market-1501 data set, divided into a training set (train) and a test set, wherein the test set comprises a query set (probe) and a candidate set (gallery).
3. The pedestrian re-identification method based on DCCA fusion features according to claim 2, characterized in that the pedestrian images in the training set and the test set are resized to 224 × 224 pixels and 256 × 128 pixels, respectively.
4. The pedestrian re-identification method based on DCCA fusion features according to claim 3, characterized in that step 2 specifically comprises:
step 2.1, constructing the vgg16 deep convolutional neural network model, which comprises thirteen sequentially connected convolutional layers, the output of the last convolutional layer being followed by three fully connected layers;
step 2.2, constructing the omni-scale deep convolutional neural network model, which comprises five sequentially connected convolutional layers, the output of the last convolutional layer being followed by two transition layers and the output of the last transition layer by a fully connected layer;
step 2.3, transferring pre-trained weight parameters, obtained by training on the ImageNet data set, to the vgg16 deep convolutional neural network model constructed in step 2.1 and the omni-scale deep convolutional neural network model constructed in step 2.2, respectively;
step 2.4, inputting the training set with image size 224 × 224 into the vgg16 deep convolutional neural network model and the training set with image size 256 × 128 into the omni-scale deep convolutional neural network model processed in step 2.3 and training the two models, initializing part of the weight parameters of the vgg16 model constructed in step 2.1 and the omni-scale model constructed in step 2.2 with the pre-trained weight parameters during training, and then extracting the finally output depth features of the two models, denoted H1 and H2 respectively, where H1 ∈ R^(o×m), H2 ∈ R^(o×m), o is the dimension of the two features, m is the number of samples in the data set, and R is the set of real numbers.
5. The pedestrian re-identification method based on DCCA fusion features according to claim 4, characterized in that in step 2.4, initializing part of the weight parameters of the vgg16 deep convolutional neural network model constructed in step 2.1 and the omni-scale deep convolutional neural network model constructed in step 2.2 with the pre-trained weight parameters specifically means: applying the pre-trained weight parameters layer by layer to initialize the first thirteen layers of the vgg16 deep convolutional neural network model and then assigning random values to the weight parameters of the last three fully connected layers; and applying the pre-trained weight parameters layer by layer to initialize the first seven layers of the omni-scale deep convolutional neural network model and then assigning random values to the weight parameters of the last fully connected layer.
6. The pedestrian re-identification method based on DCCA fusion features according to claim 5, characterized in that step 3 specifically comprises:
step 3.1, standardizing H1 and H2 to obtain standard data with a mean of 0 and a variance of 1;
step 3.2, computing the covariance matrices S11 and S22 of the standardized H1 and H2 and their cross-covariance matrix S12;
step 3.3, constructing the matrix T = S11^(-1/2) S12 S22^(-1/2);
step 3.4, performing singular value decomposition on the matrix T to obtain the maximum singular value ρ and the left and right singular vectors u, v corresponding to the maximum singular value;
step 3.5, calculating the projection matrices A1 = u^T S11^(-1/2) and A2 = v^T S22^(-1/2) of H1 and H2, the representations of the two depth features in the correlated subspace then being H'1 = A1H1 and H'2 = A2H2.
7. The pedestrian re-identification method based on DCCA fusion features according to claim 6, characterized in that step 4 specifically comprises:
step 4.1, inputting the images of the query set (probe) and the candidate set (gallery) into the vgg16 deep convolutional neural network model and the omni-scale deep convolutional neural network model trained as in step 2.4, extracting their depth features, and then calculating the fusion features corresponding to the probe and gallery sets according to step 3, their dimension being r;
step 4.2, taking the fusion features obtained from the training set (train) and the corresponding training sample labels as input to the XQDA algorithm, whose output is a subspace mapping matrix W and a measurement matrix M = Σ'I^(-1) - Σ'E^(-1), where Σ'I is the intra-class covariance matrix and Σ'E is the inter-class covariance matrix;
step 4.3, for the pedestrian image represented by each fusion feature of the query set (probe), performing feature similarity measurement against all fusion features of the candidate set (gallery) to obtain a similarity ranking result, the ranking being determined by similarity: the higher the similarity, the earlier the rank, and the recognition is completed.
8. The pedestrian re-identification method based on DCCA fusion features according to claim 7, characterized in that the similarity measurement in step 4.3 adopts the Mahalanobis distance: the fusion features corresponding to the query set (probe) and the candidate set (gallery) are respectively mapped with the subspace mapping matrix W and scored with the measurement matrix M, yielding the Mahalanobis distance between the corresponding probe and gallery features in the subspace; the smaller the distance, the higher the similarity.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011109621.5A CN112270228A (en) | 2020-10-16 | 2020-10-16 | Pedestrian re-identification method based on DCCA fusion characteristics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011109621.5A CN112270228A (en) | 2020-10-16 | 2020-10-16 | Pedestrian re-identification method based on DCCA fusion characteristics |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112270228A true CN112270228A (en) | 2021-01-26 |
Family
ID=74337570
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011109621.5A Pending CN112270228A (en) | 2020-10-16 | 2020-10-16 | Pedestrian re-identification method based on DCCA fusion characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112270228A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113111974A (en) * | 2021-05-10 | 2021-07-13 | 清华大学 | Vision-laser radar fusion method and system based on depth canonical correlation analysis |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109508731A (en) * | 2018-10-09 | 2019-03-22 | 中山大学 | A kind of vehicle based on fusion feature recognition methods, system and device again |
CN110874576A (en) * | 2019-11-14 | 2020-03-10 | 西安工程大学 | Pedestrian re-identification method based on canonical correlation analysis fusion features |
CN111401178A (en) * | 2020-03-09 | 2020-07-10 | 蔡晓刚 | Video target real-time tracking method and system based on depth feature fusion and adaptive correlation filtering |
WO2020164270A1 (en) * | 2019-02-15 | 2020-08-20 | 平安科技(深圳)有限公司 | Deep-learning-based pedestrian detection method, system and apparatus, and storage medium |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109508731A (en) * | 2018-10-09 | 2019-03-22 | 中山大学 | A kind of vehicle based on fusion feature recognition methods, system and device again |
WO2020164270A1 (en) * | 2019-02-15 | 2020-08-20 | 平安科技(深圳)有限公司 | Deep-learning-based pedestrian detection method, system and apparatus, and storage medium |
CN110874576A (en) * | 2019-11-14 | 2020-03-10 | 西安工程大学 | Pedestrian re-identification method based on canonical correlation analysis fusion features |
CN111401178A (en) * | 2020-03-09 | 2020-07-10 | 蔡晓刚 | Video target real-time tracking method and system based on depth feature fusion and adaptive correlation filtering |
Non-Patent Citations (2)
Title |
---|
KAIYANG ZHOU et al.: "Omni-Scale Feature Learning for Person Re-Identification", arXiv, pages 1-14 *
ZENG CHAO: "Vehicle Re-identification Based on Deep Learning", China Masters' Theses Full-text Database, Engineering Science and Technology II, no. 3, pages 034-863 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113111974A (en) * | 2021-05-10 | 2021-07-13 | 清华大学 | Vision-laser radar fusion method and system based on depth canonical correlation analysis |
CN113111974B (en) * | 2021-05-10 | 2021-12-14 | 清华大学 | Vision-laser radar fusion method and system based on depth canonical correlation analysis |
US11532151B2 (en) | 2021-05-10 | 2022-12-20 | Tsinghua University | Vision-LiDAR fusion method and system based on deep canonical correlation analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |