Pedestrian re-identification method based on canonical correlation analysis fusion features
Technical Field
The invention belongs to the technical field of computer vision and relates to a pedestrian re-identification method based on canonical correlation analysis fused features.
Background
Pedestrian re-identification is an active research topic in computer vision. Its goal is to use computer vision techniques to find a pedestrian of interest across non-overlapping surveillance cameras. Most existing methods approach the problem from two directions: (1) developing a distinctive feature representation; (2) seeking a discriminative distance metric. Feature-representation methods aim to extract robust features to describe pedestrians; the features used for pedestrian re-identification fall into three categories: visual features, filter features, and attribute features. Metric-learning methods learn the similarity between two images. In practice, pedestrian re-identification builds on the feature representation and uses the similarity between features to judge the similarity between pedestrian images: a strongly discriminative distance function is learned so that the distance between images of the same pedestrian is as small as possible and the distance between images of different pedestrians is as large as possible.
Features are the basis of pedestrian re-identification, and their discriminative power directly affects the final result. Color features are the most widely used and describe the color distribution of a pedestrian image. They are robust to changes in pose and viewpoint but are easily affected by illumination and occlusion, and they struggle to separate pedestrians with similar appearance. Texture features are robust to illumination, so combining color and texture features can effectively improve recognition accuracy. Hand-crafted features are generally built by combining several simple features; this exploits the complementary strengths of the different features and gives good recognition results. However, as more features are combined, the dimension of the combined feature grows rapidly. Most existing fusion methods combine different features through a serial or parallel strategy. This is simple and effective, but it has drawbacks: the intrinsic relationship among the features is ignored and the features are merely stacked; all feature information, including a large amount of redundant information, is retained; and the resulting high dimension increases computational complexity, which affects both recognition accuracy and real-time performance.
Disclosure of Invention
The invention aims to provide a pedestrian re-identification method based on canonical correlation analysis (CCA) fused features, which solves the problems of high dimensionality, a large amount of redundant information, and complex calculation in the fused features of the prior art.
The technical scheme of the invention is a pedestrian re-identification method based on canonical correlation analysis fused features, comprising three stages: a feature extraction stage, a mapping matrix solving stage, and a pedestrian re-identification stage using the fused features. In the feature extraction stage, two different features X and Y are extracted from the pedestrian images. In the mapping matrix solving stage, canonical correlation analysis is performed on the two features X and Y to obtain a pair of mapping matrices α and β, and the new features are expressed as $X' = \alpha^T X$, $Y' = \beta^T Y$, where $\alpha^T$ is the transpose of the mapping matrix α and $\beta^T$ is the transpose of the mapping matrix β. In the pedestrian re-identification stage, the fused feature is expressed as $Z_1 = \begin{pmatrix} X' \\ Y' \end{pmatrix}$ or $Z_2 = X' + Y'$. The fused feature $Z_1$ or $Z_2$ is divided into a training set and a test set; the training set is used to train the pedestrian re-identification model, and the test set is used to test the trained model.
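The two fusion strategies can be illustrated with a minimal NumPy sketch. It takes $Z_1$ to be the serial stack of X' and Y', which is an assumption of this sketch, and it uses random matrices as stand-ins for the features and for the mapping matrices α and β. The function name `fuse_features` and the `strategy` argument are hypothetical.

```python
import numpy as np

def fuse_features(X, Y, alpha, beta, strategy="concat"):
    """Project X and Y with alpha and beta, then fuse them.

    X: (p, N), Y: (q, N), alpha: (p, d), beta: (q, d); columns are samples.
    """
    X_proj = alpha.T @ X  # X' = alpha^T X, shape (d, N)
    Y_proj = beta.T @ Y   # Y' = beta^T Y, shape (d, N)
    if strategy == "concat":
        # Z1 (assumed serial fusion): stack the projections, shape (2d, N)
        return np.vstack([X_proj, Y_proj])
    if strategy == "sum":
        # Z2 = X' + Y', shape (d, N)
        return X_proj + Y_proj
    raise ValueError(f"unknown strategy: {strategy}")

# Random stand-ins; real alpha and beta would come from the CCA step.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 10))
Y = rng.standard_normal((4, 10))
alpha = rng.standard_normal((6, 3))
beta = rng.standard_normal((4, 3))

Z1 = fuse_features(X, Y, alpha, beta, "concat")
Z2 = fuse_features(X, Y, alpha, beta, "sum")
print(Z1.shape, Z2.shape)
```

Either way the fused dimension depends on the subspace dimension d rather than on p + q, and the summation strategy additionally requires X' and Y' to have the same number of rows.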
The invention is also characterized in that:
the method comprises the following specific steps:
step 1, extracting two features of a pedestrian re-identification data set:
extracting features of the data set of the pedestrian image by using different feature extraction algorithms, and respectively recording the features as follows:
$X \in \mathbb{R}^{p \times N}$, $Y \in \mathbb{R}^{q \times N}$
p and q respectively represent the dimensionality of the two features, and N represents the number of pictures contained in the data set;
step 2, performing canonical correlation analysis on the two features X and Y extracted in step 1, solving by singular value decomposition to obtain a pair of mapping matrices α and β, and expressing the new features as $X' = \alpha^T X$, $Y' = \beta^T Y$, where $\alpha^T$ is the transpose of the mapping matrix α and $\beta^T$ is the transpose of the mapping matrix β;
step 3, carrying out pedestrian re-identification by using the fused features:
step 3.1, using the mapping matrices α and β obtained in step 2, the fused representation of the canonically correlated features is obtained by one of the following fusion strategies: $Z_1 = \begin{pmatrix} X' \\ Y' \end{pmatrix} = \begin{pmatrix} \alpha^T X \\ \beta^T Y \end{pmatrix}$ or $Z_2 = X' + Y' = \alpha^T X + \beta^T Y$; following the partition rule of the respective pedestrian re-identification data set, the fused feature $Z_1$ or $Z_2$ is divided into training set I and test set I for the first view and training set II and test set II for the second view; the pedestrian re-identification model is trained with training sets I and II, and the trained model is tested with test sets I and II;
step 3.2, evaluating the results tested in step 3.1 using the cumulative matching characteristic (CMC) curve, with the rank-1 recognition rate as the most important evaluation index: the larger the rank-1 value, the better the recognition performance.
In step 2, the projection matrices are solved by singular value decomposition as follows:
1) standardizing the two features to obtain data with mean 0 and variance 1;
2) calculating the variance $S_{XX}$ of X, the variance $S_{YY}$ of Y, and the covariance $S_{XY}$ of X and Y;
3) calculating the matrix $M = S_{XX}^{-1/2} S_{XY} S_{YY}^{-1/2}$;
4) performing singular value decomposition on the matrix M to obtain the maximum singular value σ and the corresponding left and right singular vectors u and v;
5) calculating the mapping matrices α and β of X and Y: $\alpha = S_{XX}^{-1/2} u$, $\beta = S_{YY}^{-1/2} v$;
6) the representations of the two features in the correlated subspace are $X' = \alpha^T X$, $Y' = \beta^T Y$.
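Steps 1) to 6) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the helper names (`standardize`, `inv_sqrt`, `cca_svd`), the small ridge term `reg` added to the variance matrices for numerical stability, the parameter `d` for keeping more than one singular pair, and the synthetic data are all assumptions introduced here.

```python
import numpy as np

def standardize(F):
    """Scale each row (feature dimension) to mean 0 and variance 1."""
    mu = F.mean(axis=1, keepdims=True)
    sd = F.std(axis=1, keepdims=True)
    return (F - mu) / np.where(sd > 0, sd, 1.0)

def inv_sqrt(S, eps=1e-10):
    """Inverse square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(S)
    w = np.maximum(w, eps)  # guard against zero or slightly negative eigenvalues
    return (V / np.sqrt(w)) @ V.T

def cca_svd(X, Y, d=1, reg=1e-6):
    """CCA by SVD. X is (p, N), Y is (q, N); columns are samples.

    Returns alpha (p, d), beta (q, d) and the d largest singular values.
    """
    Xs, Ys = standardize(X), standardize(Y)            # step 1
    N = Xs.shape[1]
    Sxx = Xs @ Xs.T / N + reg * np.eye(Xs.shape[0])    # step 2 (reg is an assumption)
    Syy = Ys @ Ys.T / N + reg * np.eye(Ys.shape[0])
    Sxy = Xs @ Ys.T / N
    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    M = Sxx_is @ Sxy @ Syy_is                          # step 3
    U, s, Vt = np.linalg.svd(M, full_matrices=False)   # step 4
    alpha = Sxx_is @ U[:, :d]                          # step 5
    beta = Syy_is @ Vt[:d].T
    return alpha, beta, s[:d]

# Synthetic data: two feature sets driven by a shared latent signal.
rng = np.random.default_rng(0)
latent = rng.standard_normal((2, 500))
X = rng.standard_normal((5, 2)) @ latent + 0.5 * rng.standard_normal((5, 500))
Y = rng.standard_normal((4, 2)) @ latent + 0.5 * rng.standard_normal((4, 500))

alpha, beta, sigma = cca_svd(X, Y, d=1)
X_new = alpha.T @ standardize(X)   # step 6: X' = alpha^T X
Y_new = beta.T @ standardize(Y)    # step 6: Y' = beta^T Y
corr = np.corrcoef(X_new[0], Y_new[0])[0, 1]
print(sigma[0], corr)
```

Up to the small effect of `reg`, the printed singular value and the empirical correlation of the projected features should agree, which is the property the derivation relies on.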
The specific derivation of the projection matrices by singular value decomposition in step 2 is as follows:
(1) let the mapping matrices of X and Y be α and β respectively, so that their representations in the subspace are $X' = \alpha^T X$ and $Y' = \beta^T Y$; their correlation coefficient can be expressed as:
$\rho(X', Y') = \dfrac{\mathrm{Cov}(X', Y')}{\sqrt{\mathrm{Var}(X')}\,\sqrt{\mathrm{Var}(Y')}}$
the objective function is:
$\arg\max_{\alpha,\beta} \dfrac{\mathrm{Cov}(\alpha^T X, \beta^T Y)}{\sqrt{\mathrm{Var}(\alpha^T X)}\,\sqrt{\mathrm{Var}(\beta^T Y)}}$
that is, the mapping matrices α and β that maximize the correlation coefficient are sought;
(2) before projection, the raw data are standardized to have mean 0 and variance 1, so that
$\mathrm{Var}(\alpha^T X) = \alpha^T E[(X - \mu_x)(X - \mu_x)^T]\,\alpha = \alpha^T E(XX^T)\,\alpha$
$\mathrm{Cov}(\alpha^T X, \beta^T Y) = E(\langle \alpha^T X, \beta^T Y \rangle) = E((\alpha^T X)(\beta^T Y)^T) = \alpha^T E(XY^T)\,\beta$
similarly, $\mathrm{Var}(\beta^T Y) = \beta^T E(YY^T)\,\beta$, where $\mu_x$ is the mean of X;
(3) since the means of X and Y are both 0,
$\mathrm{Var}(X) = \mathrm{Cov}(X, X) = E(XX^T)$
$\mathrm{Var}(Y) = \mathrm{Cov}(Y, Y) = E(YY^T)$
$\mathrm{Cov}(X, Y) = E(XY^T)$
$\mathrm{Cov}(Y, X) = E(YX^T)$;
(4) let $S_{XX} = \mathrm{Var}(X)$, $S_{YY} = \mathrm{Var}(Y)$, $S_{XY} = \mathrm{Cov}(X, Y)$; the objective function becomes
$\arg\max_{\alpha,\beta} \dfrac{\alpha^T S_{XY}\,\beta}{\sqrt{\alpha^T S_{XX}\,\alpha}\,\sqrt{\beta^T S_{YY}\,\beta}}$
(5) because scaling the numerator and the denominator by the same factor does not change the optimum, the denominator is fixed and the numerator is optimized, namely:
$\arg\max_{\alpha,\beta} \alpha^T S_{XY}\,\beta$
s.t. $\alpha^T S_{XX}\,\alpha = 1$, $\beta^T S_{YY}\,\beta = 1$;
(6) the objective function in (5) is solved by singular value decomposition; let u and v be two unit vectors with
$\alpha = S_{XX}^{-1/2} u$, $\beta = S_{YY}^{-1/2} v$
then from $\alpha^T S_{XX}\,\alpha = 1$ it follows that:
$u^T S_{XX}^{-1/2} S_{XX} S_{XX}^{-1/2} u = 1$, i.e. $u^T u = 1$
and from $\beta^T S_{YY}\,\beta = 1$ it follows that:
$v^T S_{YY}^{-1/2} S_{YY} S_{YY}^{-1/2} v = 1$, i.e. $v^T v = 1$
the objective function is now:
$\arg\max_{u,v} u^T S_{XX}^{-1/2} S_{XY} S_{YY}^{-1/2} v$
s.t. $u^T u = 1$, $v^T v = 1$;
(7) for the objective function in (6), let the matrix $M = S_{XX}^{-1/2} S_{XY} S_{YY}^{-1/2}$; u and v then represent the left and right singular vectors corresponding to one singular value of M. Singular value decomposition gives $M = U \Sigma V^T$, where U and V are the matrices formed by the left and right singular vectors of M respectively, and Σ is the diagonal matrix formed by the singular values of M. Since the columns of U and of V each form an orthonormal basis, $u^T U$ and $V^T v$ are each a vector with a single entry equal to 1 and all other entries equal to 0, so that
$u^T M v = u^T U \Sigma V^T v = \sigma_{uv}$
Maximizing $u^T M v$ therefore yields the largest of the singular values associated with a pair of left and right singular vectors; that is, after singular value decomposition of M, the maximum singular value is the maximum of the optimization target, namely the maximum correlation coefficient between X and Y;
(8) using the corresponding left and right singular vectors u and v, the original mapping matrices of X and Y are obtained as $\alpha = S_{XX}^{-1/2} u$, $\beta = S_{YY}^{-1/2} v$.
In step 3.1, the XQDA algorithm is used to train the pedestrian re-identification model; the training set and the training sample labels are the input, and the output is the subspace mapping matrix W and $M = (\Sigma'_I)^{-1} - (\Sigma'_E)^{-1}$, where $\Sigma'_I$ is the intra-class covariance matrix and $\Sigma'_E$ is the inter-class covariance matrix;
during testing, the Mahalanobis distance is used to measure the similarity between two pedestrian images: given M and the projections of the features onto the subspace W, the Mahalanobis distance between the original features in the subspace is obtained.
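The test-time distance can be sketched as below, assuming the usual form $d(x, z) = (x - z)^T W M W^T (x - z)$ for features x and z from the two views. XQDA training itself is not implemented here: W and M are random stand-ins (M is only made symmetric), and the function name `subspace_mahalanobis` is hypothetical.

```python
import numpy as np

def subspace_mahalanobis(probe, gallery, W, M):
    """Pairwise distances (x - z)^T W M W^T (x - z).

    probe: (d, n_probe), gallery: (d, n_gallery); columns are samples.
    W: (d, r) subspace mapping, M: (r, r) symmetric matrix.
    Returns an (n_probe, n_gallery) distance matrix.
    """
    P = W.T @ probe      # projections of probe features, (r, n_probe)
    G = W.T @ gallery    # projections of gallery features, (r, n_gallery)
    # Expand (p - g)^T M (p - g) = p^T M p + g^T M g - 2 p^T M g (M symmetric)
    pp = np.sum(P * (M @ P), axis=0)
    gg = np.sum(G * (M @ G), axis=0)
    return pp[:, None] + gg[None, :] - 2.0 * (P.T @ M @ G)

# Random stand-ins for the quantities XQDA training would produce.
rng = np.random.default_rng(0)
d, r = 6, 3
W = rng.standard_normal((d, r))
A = rng.standard_normal((r, r))
B = rng.standard_normal((r, r))
M = A @ A.T - B @ B.T            # symmetric, not necessarily positive definite
probe = rng.standard_normal((d, 4))
gallery = rng.standard_normal((d, 5))

D = subspace_mahalanobis(probe, gallery, W, M)
print(D.shape)
```

Expanding the quadratic form avoids an explicit double loop over probe and gallery images; smaller entries of `D` indicate more similar image pairs.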
The beneficial effects of the invention are as follows. The invention studies a pedestrian re-identification method based on feature fusion by canonical correlation analysis. To address the high dimensionality, large amount of redundant information, and complex calculation of current feature fusion methods, the canonical correlation analysis algorithm is used to analyze the intrinsic relationship between different features of the same target and to find a linear combination of each feature, such that each new feature retains most of the information of its original feature and is maximally correlated with the other new feature. The two new features are then fused according to a chosen strategy, which achieves feature fusion while eliminating redundant information between the features.
Drawings
FIG. 1 is a diagram of the feature fusion process of the pedestrian re-identification method based on canonical correlation analysis fused features according to the present invention;
FIG. 2 is a graph of the results on the VIPeR data set for the two individual features and the fused features of the pedestrian re-identification method based on canonical correlation analysis fused features according to the present invention;
FIG. 3 shows the specific results of FIG. 2 with rank-1, rank-5, rank-10 and rank-20 as evaluation indexes.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention is a pedestrian re-identification method based on canonical correlation analysis fused features which, as shown in FIG. 1, comprises three stages: a feature extraction stage, a mapping matrix solving stage, and a pedestrian re-identification stage using the fused features. In the feature extraction stage, two different features X and Y are extracted from the pedestrian images. In the mapping matrix solving stage, canonical correlation analysis is performed on the two features X and Y to obtain a pair of mapping matrices α and β, and the new features are expressed as $X' = \alpha^T X$, $Y' = \beta^T Y$, where $\alpha^T$ is the transpose of the mapping matrix α and $\beta^T$ is the transpose of the mapping matrix β. In the pedestrian re-identification stage, the fused feature is expressed as $Z_1 = \begin{pmatrix} X' \\ Y' \end{pmatrix}$ or $Z_2 = X' + Y'$. The fused feature $Z_1$ or $Z_2$ is divided into a training set and a test set; the training set is used to train the pedestrian re-identification model, and the test set is used to test the trained model.
The pedestrian re-identification method based on canonical correlation analysis fused features of the invention comprises the following specific steps:
step 1, extracting two features of a pedestrian re-identification data set:
extracting features of the data set of the pedestrian image by using different feature extraction algorithms, and respectively recording the features as follows:
$X \in \mathbb{R}^{p \times N}$, $Y \in \mathbb{R}^{q \times N}$
p and q respectively represent the dimensionality of the two features, and N represents the number of pictures contained in the data set;
step 2, performing canonical correlation analysis on the two features X and Y extracted in step 1, solving by singular value decomposition to obtain a pair of mapping matrices α and β, and expressing the new features as $X' = \alpha^T X$, $Y' = \beta^T Y$, where $\alpha^T$ is the transpose of the mapping matrix α and $\beta^T$ is the transpose of the mapping matrix β;
step 3, carrying out pedestrian re-identification by using the fused features:
step 3.1, using the mapping matrices α and β obtained in step 2, the fused representation of the canonically correlated features is obtained by one of the following fusion strategies: $Z_1 = \begin{pmatrix} X' \\ Y' \end{pmatrix} = \begin{pmatrix} \alpha^T X \\ \beta^T Y \end{pmatrix}$ or $Z_2 = X' + Y' = \alpha^T X + \beta^T Y$; following the partition rule of the respective pedestrian re-identification data set, the fused feature $Z_1$ or $Z_2$ is divided into training set I and test set I for the first view and training set II and test set II for the second view; the pedestrian re-identification model is trained with training sets I and II, and the trained model is tested with test sets I and II;
step 3.2, evaluating the results tested in step 3.1 using the cumulative matching characteristic (CMC) curve, with the rank-1 recognition rate as the most important evaluation index: the larger the rank-1 value, the better the recognition performance.
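The CMC evaluation of step 3.2 can be sketched as follows. The sketch assumes a single-shot setting in which every probe identity appears exactly once in the gallery; the function name `cmc_curve` and the toy distance matrix are illustrative only.

```python
import numpy as np

def cmc_curve(dist, probe_ids, gallery_ids):
    """Cumulative matching characteristic from a distance matrix.

    dist: (n_probe, n_gallery), smaller means more similar.
    Assumes each probe identity occurs exactly once in the gallery.
    Returns cmc, where cmc[k-1] is the rank-k recognition rate.
    """
    order = np.argsort(dist, axis=1)                   # gallery indices, nearest first
    matches = gallery_ids[order] == probe_ids[:, None]
    first_hit = matches.argmax(axis=1)                 # 0-based rank of the true match
    counts = np.bincount(first_hit, minlength=dist.shape[1])
    return np.cumsum(counts) / len(probe_ids)

# Toy example with three probes and three gallery images.
dist = np.array([
    [0.1, 0.5, 0.9],   # probe 0: true match ranked first
    [0.2, 0.6, 0.4],   # probe 1: true match ranked third
    [0.7, 0.8, 0.3],   # probe 2: true match ranked first
])
ids = np.array([0, 1, 2])
cmc = cmc_curve(dist, ids, ids)
print("rank-1:", cmc[0])
```

In this toy case two of the three probes are matched at rank 1, so the rank-1 rate is 2/3 and the curve reaches 1.0 at rank 3.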
In step 2, the projection matrices are solved by singular value decomposition as follows:
1) standardizing the two features to obtain data with mean 0 and variance 1;
2) calculating the variance $S_{XX}$ of X, the variance $S_{YY}$ of Y, and the covariance $S_{XY}$ of X and Y;
3) calculating the matrix $M = S_{XX}^{-1/2} S_{XY} S_{YY}^{-1/2}$;
4) performing singular value decomposition on the matrix M to obtain the maximum singular value σ and the corresponding left and right singular vectors u and v;
5) calculating the mapping matrices α and β of X and Y: $\alpha = S_{XX}^{-1/2} u$, $\beta = S_{YY}^{-1/2} v$;
6) the representations of the two features in the correlated subspace are $X' = \alpha^T X$, $Y' = \beta^T Y$.
The specific derivation of the projection matrices by singular value decomposition in step 2 is as follows:
(1) let the mapping matrices of X and Y be α and β respectively, so that their representations in the subspace are $X' = \alpha^T X$ and $Y' = \beta^T Y$; their correlation coefficient can be expressed as:
$\rho(X', Y') = \dfrac{\mathrm{Cov}(X', Y')}{\sqrt{\mathrm{Var}(X')}\,\sqrt{\mathrm{Var}(Y')}}$
the objective function is:
$\arg\max_{\alpha,\beta} \dfrac{\mathrm{Cov}(\alpha^T X, \beta^T Y)}{\sqrt{\mathrm{Var}(\alpha^T X)}\,\sqrt{\mathrm{Var}(\beta^T Y)}}$
that is, the mapping matrices α and β that maximize the correlation coefficient are sought;
(2) before projection, the raw data are standardized to have mean 0 and variance 1, so that
$\mathrm{Var}(\alpha^T X) = \alpha^T E[(X - \mu_x)(X - \mu_x)^T]\,\alpha = \alpha^T E(XX^T)\,\alpha$
$\mathrm{Cov}(\alpha^T X, \beta^T Y) = E(\langle \alpha^T X, \beta^T Y \rangle) = E((\alpha^T X)(\beta^T Y)^T) = \alpha^T E(XY^T)\,\beta$
similarly, $\mathrm{Var}(\beta^T Y) = \beta^T E(YY^T)\,\beta$, where $\mu_x$ is the mean of X;
(3) since the means of X and Y are both 0,
$\mathrm{Var}(X) = \mathrm{Cov}(X, X) = E(XX^T)$
$\mathrm{Var}(Y) = \mathrm{Cov}(Y, Y) = E(YY^T)$
$\mathrm{Cov}(X, Y) = E(XY^T)$
$\mathrm{Cov}(Y, X) = E(YX^T)$;
(4) let $S_{XX} = \mathrm{Var}(X)$, $S_{YY} = \mathrm{Var}(Y)$, $S_{XY} = \mathrm{Cov}(X, Y)$; the objective function becomes
$\arg\max_{\alpha,\beta} \dfrac{\alpha^T S_{XY}\,\beta}{\sqrt{\alpha^T S_{XX}\,\alpha}\,\sqrt{\beta^T S_{YY}\,\beta}}$
(5) because scaling the numerator and the denominator by the same factor does not change the optimum, the denominator is fixed and the numerator is optimized, namely:
$\arg\max_{\alpha,\beta} \alpha^T S_{XY}\,\beta$
s.t. $\alpha^T S_{XX}\,\alpha = 1$, $\beta^T S_{YY}\,\beta = 1$;
(6) the objective function in (5) is solved by singular value decomposition; let u and v be two unit vectors with
$\alpha = S_{XX}^{-1/2} u$, $\beta = S_{YY}^{-1/2} v$
then from $\alpha^T S_{XX}\,\alpha = 1$ it follows that:
$u^T S_{XX}^{-1/2} S_{XX} S_{XX}^{-1/2} u = 1$, i.e. $u^T u = 1$
and from $\beta^T S_{YY}\,\beta = 1$ it follows that:
$v^T S_{YY}^{-1/2} S_{YY} S_{YY}^{-1/2} v = 1$, i.e. $v^T v = 1$
the objective function is now:
$\arg\max_{u,v} u^T S_{XX}^{-1/2} S_{XY} S_{YY}^{-1/2} v$
s.t. $u^T u = 1$, $v^T v = 1$;
(7) for the objective function in (6), let the matrix $M = S_{XX}^{-1/2} S_{XY} S_{YY}^{-1/2}$; u and v then represent the left and right singular vectors corresponding to one singular value of M. Singular value decomposition gives $M = U \Sigma V^T$, where U and V are the matrices formed by the left and right singular vectors of M respectively, and Σ is the diagonal matrix formed by the singular values of M. Since the columns of U and of V each form an orthonormal basis, $u^T U$ and $V^T v$ are each a vector with a single entry equal to 1 and all other entries equal to 0, so that
$u^T M v = u^T U \Sigma V^T v = \sigma_{uv}$
Maximizing $u^T M v$ therefore yields the largest of the singular values associated with a pair of left and right singular vectors; that is, after singular value decomposition of M, the maximum singular value is the maximum of the optimization target, namely the maximum correlation coefficient between X and Y;
(8) using the corresponding left and right singular vectors u and v, the original mapping matrices of X and Y are obtained as $\alpha = S_{XX}^{-1/2} u$, $\beta = S_{YY}^{-1/2} v$.
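The claim in (7), that the bilinear form over unit vectors is maximized by the leading singular pair, can be checked numerically. In this small sketch a random matrix stands in for M, and the helper `random_unit` is hypothetical; it is a sanity check, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 4))   # random stand-in for the matrix M of step (7)
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# The leading singular pair attains the largest singular value.
u_star, v_star = U[:, 0], Vt[0]
best = u_star @ M @ v_star

def random_unit(n):
    """Draw a random unit vector of length n."""
    x = rng.standard_normal(n)
    return x / np.linalg.norm(x)

# No other pair of unit vectors should exceed the largest singular value.
trials = [random_unit(5) @ M @ random_unit(4) for _ in range(1000)]
print(best, s[0], max(trials))
```

The value attained by the leading pair equals the largest singular value, while every random pair of unit vectors stays at or below it.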
In step 3.1, the XQDA algorithm is used to train the pedestrian re-identification model; the training set and the training sample labels are the input, and the output is the subspace mapping matrix W and $M = (\Sigma'_I)^{-1} - (\Sigma'_E)^{-1}$, where $\Sigma'_I$ is the intra-class covariance matrix and $\Sigma'_E$ is the inter-class covariance matrix;
during testing, the Mahalanobis distance is used to measure the similarity between two pedestrian images: given M and the projections of the features onto the subspace W, the Mahalanobis distance between the original features in the subspace is obtained.
The advantages of the pedestrian re-identification method based on canonical correlation analysis fused features of the invention are as follows: in the feature fusion stage, the method adopts canonical correlation analysis together with a fusion strategy, analyzes the maximum correlation between features from different spaces in a common subspace, and takes the maximally correlated components of the two features as discriminative information, which effectively eliminates redundant information while fusing the features and reduces the amount and difficulty of computation.
Example one
The pedestrian re-identification method based on canonical correlation analysis fused features of the invention is implemented according to the following steps:
step 1: extraction of two features from a pedestrian re-identification dataset
The pedestrian re-identification data set VIPeR is used. It contains 632 pairs of pedestrian images, 1264 images in total; each pair consists of two pictures of one person taken from different viewpoints, and each image is scaled to 128 × 48 pixels. Using existing feature extraction methods, WHOS (Weighted Histograms of Overlapping Stripes) and LOMO (Local Maximal Occurrence) features are extracted from the data set.
Step 2: performing canonical correlation analysis on the features and solving the mapping matrices by singular value decomposition as follows:
1) standardizing the two features to obtain data with mean 0 and variance 1;
2) calculating the variance $S_{XX}$ of X, the variance $S_{YY}$ of Y, and the covariance $S_{XY}$ of X and Y;
3) calculating the matrix $M = S_{XX}^{-1/2} S_{XY} S_{YY}^{-1/2}$;
4) performing singular value decomposition on the matrix M to obtain the maximum singular value σ and the corresponding left and right singular vectors u and v;
5) calculating the mapping matrices α and β of X and Y: $\alpha = S_{XX}^{-1/2} u$, $\beta = S_{YY}^{-1/2} v$;
6) the representations of the two features in the correlated subspace are $X' = \alpha^T X$, $Y' = \beta^T Y$.
Step 3: re-identifying pedestrians by using the fused features, the specific process being as follows:
1) the fused feature is $Z \in \mathbb{R}^{d \times N}$, where d is the dimension of the fused feature and N is the number of pictures in the data set; the fused feature is expressed as $Z_1 = \begin{pmatrix} X' \\ Y' \end{pmatrix} = \begin{pmatrix} \alpha^T X \\ \beta^T Y \end{pmatrix}$ or $Z_2 = X' + Y' = \alpha^T X + \beta^T Y$; for the VIPeR data set, N = 1264, columns 1 to 632 of the features are used as the query set, and columns 633 to 1264 are used as the candidate set;
2) from the query set and the candidate set respectively, 316 feature columns are randomly selected as the two training sets, and the remaining 316 columns of each form the two test sets;
3) the identification process uses the XQDA (Cross-view Quadratic Discriminant Analysis) algorithm, which takes the training set and the training sample labels as input and outputs the subspace mapping matrix W and $M = (\Sigma'_I)^{-1} - (\Sigma'_E)^{-1}$, where $\Sigma'_I$ is the intra-class covariance matrix and $\Sigma'_E$ is the inter-class covariance matrix;
4) during testing, the Mahalanobis distance is used to measure the similarity between two pedestrian images: given M and the projections of the features onto the subspace W, the Mahalanobis distance between the original features in the subspace is obtained;
5) as shown in FIG. 2 and FIG. 3, the results are evaluated with the CMC curve, using rank-1, rank-5, rank-10 and rank-20 as evaluation indexes; the rank-1 value is particularly important in evaluating pedestrian re-identification performance.
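The query/candidate partition and the random 316/316 split described in items 1) and 2) can be sketched as follows. The sketch assumes that the same randomly chosen identities are used for training in both views, so that training pairs stay aligned; the function name `split_viper`, the dictionary keys, and the random placeholder for the fused features are hypothetical.

```python
import numpy as np

def split_viper(Z, n_pairs=632, n_train=316, seed=0):
    """Split fused features Z of shape (d, 2 * n_pairs) into train and test sets.

    Columns 0..n_pairs-1 are the query view, the remaining columns the
    candidate view. Assumption: the same identities are used for training
    in both views.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_pairs)
    train_ids, test_ids = perm[:n_train], perm[n_train:]
    query, candidate = Z[:, :n_pairs], Z[:, n_pairs:]
    return {
        "train_I": query[:, train_ids],
        "train_II": candidate[:, train_ids],
        "test_I": query[:, test_ids],
        "test_II": candidate[:, test_ids],
        "train_ids": train_ids,
        "test_ids": test_ids,
    }

# Random placeholder for the fused features of the 1264 VIPeR images.
Z = np.random.default_rng(1).standard_normal((8, 1264))
parts = split_viper(Z)
print(parts["train_I"].shape, parts["test_II"].shape)
```

Training sets I and II and the identity labels would then be passed to the metric learning step, and test sets I and II would serve as probe and gallery for the CMC evaluation.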