CN109543602B - Pedestrian re-identification method based on multi-view image feature decomposition

Pedestrian re-identification method based on multi-view image feature decomposition

Info

Publication number
CN109543602B
CN109543602B
Authority
CN
China
Prior art keywords
pedestrian
image
feature
network
layer
Prior art date
Legal status
Active
Application number
CN201811388865.4A
Other languages
Chinese (zh)
Other versions
CN109543602A (en)
Inventor
Yang Xiaofeng
Li Haifang
Deng Hongxia
Yao Rong
Guo Hao
Current Assignee
Taiyuan University of Technology
Original Assignee
Taiyuan University of Technology
Priority date
Filing date
Publication date
Application filed by Taiyuan University of Technology
Priority to CN201811388865.4A
Publication of CN109543602A
Application granted
Publication of CN109543602B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/30 Scenes; Scene-specific elements in albums, collections or shared content, e.g. social network photos or video

Abstract

The invention relates to the technical field of intelligent image retrieval, in particular to a pedestrian re-identification method based on multi-view image feature decomposition. Starting from the problem of target image classification, a multi-view image feature generation network for pedestrian images is constructed from a capsule network, and any pedestrian image is decomposed into multi-view image features together with the similarity at each view. Features at the same view are used directly for re-identification, while features at different views first undergo feature conversion. Experiments show that decomposing pedestrian images in this way substantially improves the accuracy of pedestrian re-identification.

Description

Pedestrian re-identification method based on multi-view image feature decomposition
Technical Field
The invention relates to the technical field of intelligent image retrieval, in particular to a pedestrian re-identification method based on multi-view image feature decomposition.
Background
As public security draws increasing attention, large numbers of surveillance cameras have been installed in important places, and the video they capture can provide clues for public security departments investigating major criminal cases such as terrorist attacks and brawls. Pedestrian re-identification is an automatic target recognition technology that quickly locates a human target of interest within a surveillance network; it is an important step in applications such as intelligent video surveillance and human behavior analysis, and mainly addresses the pedestrian retrieval problem in the security monitoring field.
Pedestrian re-identification is typically studied from two aspects: feature extraction and distance metric learning. For feature-extraction-based research, the core problem is image recognition: extracting appropriate, robust features greatly improves both the detection results and the execution efficiency. Commonly used features include HOG, SIFT, SURF, covariance descriptors, ELF, Haar-like features, LBP, Gabor filters, and co-occurrence matrices.
For research based on distance metric learning: because of differences in view angle, scale, illumination, clothing, posture, and resolution, and because of occlusion, continuous position and motion information may be lost between different cameras. Measuring the similarity of pedestrian appearance features with standard distances such as the Euclidean or Bhattacharyya distance therefore cannot achieve a good re-identification effect, so researchers have proposed metric learning methods to measure the similarity of pedestrians in different images. Common distance metric learning algorithms include LMNN, PRDC, LDML, KISSME, LFDA, and XQDA.
Since 2014, pedestrian re-identification research has been combined with deep learning, and image-based pedestrian re-identification now mainly uses deep convolutional neural network methods. McLaughlin et al. perform transfer learning with an AlexNet-based convolutional neural network (CNN), extract color and optical-flow features from images, obtain high-level representations through convolutional processing, capture temporal information with a recurrent neural network (RNN), and pool the outputs into sequence features. Xiao et al. train a single CNN on data from multiple domains; some neurons learn representations shared across domains while others are effective only for a specific domain, yielding a robust CNN feature representation.
In image-based pedestrian re-identification research, rank-1 accuracy on VIPeR, the most widely adopted dataset, rose from 12.0% in 2008 to 63.9% in 2015; rank-1 on CUHK01 reached 79.9% by 2017; and on Market-1501, introduced for re-identification research in 2015, the application of deep learning raised rank-1 accuracy from 44.42% to 82.21% in 2017. These results show that image-based pedestrian re-identification has made great progress, but accuracy remains the main limitation and there is room for further improvement. Note: rank-1 means that the highest-probability candidate in the decision result is the correct match.
Disclosure of Invention
The invention aims to provide a pedestrian re-identification method based on multi-view image feature decomposition. Through an improved capsule neural network, any pedestrian image is decomposed into multi-view (front, side, and back) image features; the method generates image feature descriptions and the corresponding probabilities of the pedestrian image at each view, and uses the generated feature descriptions and probabilities for pedestrian image similarity measurement.
In order to achieve the purpose, the technical scheme provided by the invention is as follows:
a pedestrian re-identification method based on multi-view image feature decomposition comprises the following steps:
s1, selecting a standard image in the pedestrian image dataset, dividing the selected standard image into a training dataset and a testing dataset, and establishing a training dataset and a testing dataset of the multi-view image feature decomposition neural network;
s2, zooming or cutting the pedestrian image in the data set into an RGB image with the resolution of 192 multiplied by 64 to be used as a training set and a testing set;
s3, designing a capsule classification network Capsule eNet into two convolution layers and two capsule layers to obtain an improved capsule network;
s4, training the improved capsule classification network in S3 on a pedestrian image training set at a standard visual angle in S2, constructing a multi-visual-angle image feature decomposition neural network, and generating the similarity and image feature vectors of any pedestrian image at three standard visual angles;
s5, generating a cross-view characteristic transformation matrix for pedestrian image characteristic transformation at different views, and calculating a cross-view characteristic transformation coefficient matrix for reducing cross-view pedestrian image characteristic measurement loss;
s6, selecting a pedestrian image similarity measurement function, realizing pedestrian re-identification, realizing same-view angle feature comparison and cross-view angle feature comparison, and using the pedestrian image similarity measurement function in pedestrian re-identification.
Further, the standard images in S1 are a front image, a side image, and a back image with a standard shooting angle.
Further, the improved capsule network in S3 is a three-classification network.
Further, the first layer of the improved capsule network CapsuleNet in S3 is a convolution layer with convolution kernel 5 × 5, 32 output channels, and input dimension B × 3 × 192 × 64, where B is the hyperparameter batch size; the second layer is a convolution layer with convolution kernel 5 × 5 and 256 output channels; the third layer is a primary capsule layer with convolution kernel 5 × 5 and 8 output channels; the fourth layer is a coding capsule layer with 3 output channels, 2 routing iterations, and output dimension B × 3 × 64.
Further, the convolution layers can be replaced by a single-layer neural network, ResNet, HourglassNet, or another convolutional network.
Further, the similarity calculation formula in S4 is:
$$S_{ij} = \frac{e^{\|V_{ij}\|}}{\sum_{k=1}^{3} e^{\|V_{ik}\|}}$$
where e is the natural constant (e ≈ 2.71828), ‖·‖ denotes the length (modulus) of a feature vector, i indexes the pedestrian image, V_{ij} is the feature vector of the i-th pedestrian image at view j, and S_{ij} is the similarity of the i-th pedestrian image at view j.
Further, generating the cross-view feature conversion matrices in S5 specifically includes the following steps:
(1) establishing a data set containing feature pairs {V_i, U_i} of the same pedestrian at two different views, where V_i and U_i have the same dimension D and each represents the pedestrian feature code at one view;
(2) establishing and training a feature conversion network, a two-layer BP neural network whose input layer takes V_i (dimension 1 × D) and whose output layer produces U_i′ (dimension 1 × D), with loss function Loss = 1 − cos(U_i′, U_i), where U_i′ denotes the converted feature output by the network and U_i the target feature;
(3) once training has reduced the loss below 0.07, extracting the conversion matrix W from the network, where U_i′ = W · V_i and W is a D × D matrix;
(4) repeating steps (2) and (3), computing the image feature conversion matrix once for each pair of different views, to obtain all cross-view feature conversion matrices.
Further, the cross-view feature conversion coefficient in S5 is calculated as
[conversion-coefficient formula, given as an image in the original]
where t is the average loss function value;
the transformation coefficient matrixes formed by the transformation coefficients of different visual angles are as follows:
Figure RE-GDA0001949063680000051
wherein A is12Coefficient of transformation, A, representing the frontal to lateral characteristics of a pedestrian13Coefficient of conversion, A, representing the front to back features of a pedestrian21Representing the conversion coefficient from the lateral feature to the frontal feature of the pedestrian, A23Coefficient of transformation, A, representing the lateral to dorsal aspect of a pedestrian31Representing the conversion coefficient of the pedestrian's back features to front features, A32Representing the conversion coefficient of the pedestrian back feature to the side feature.
Further, realizing pedestrian re-identification in S6 specifically includes the following steps:
(1) selecting the cos function to measure the pedestrian feature distance: f(X, Y) = Alpha × cos(X, Y), where Alpha is the conversion coefficient;
(2) for same-view comparison, the pedestrian feature distance is L = f(X_i, Y_j), where X_i and Y_j are the maximum-probability multi-view decomposed features of the two pedestrian images;
(3) for cross-view comparison, the pedestrian feature distance is L = f(W · X_i, Y_j), where X_i and Y_j are the maximum-probability multi-view decomposed features and W is the conversion matrix;
(4) sorting the distance measurements over the searched pedestrian images in descending order; the pedestrian images at the front of the ranking are the re-identification result.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a pedestrian re-identification method based on multi-view image feature decomposition, which is characterized in that a pedestrian image multi-view image feature generation network is constructed by combining a capsule network from the problem of target image classification, and any pedestrian image is decomposed to obtain multi-view image features and the similarity under the view angle. The characteristics of the same visual angle are directly used for re-identification, and the characteristics of different visual angles need to be subjected to characteristic conversion. Experiments show that the decomposition of the pedestrian images greatly helps to improve the accuracy of pedestrian re-identification.
Drawings
FIG. 1 is a schematic diagram of multi-view image features of a pedestrian image;
FIG. 2 is a schematic illustration of a pedestrian image dataset;
FIG. 3 is a standard capsule network;
FIG. 4 is a network structure diagram of pedestrian image multi-view image feature generation;
FIG. 5 is a method of comparing same-view features;
FIG. 6 illustrates a method for comparing features from different viewing angles;
FIG. 7 is a pedestrian feature transformation diagram;
FIG. 8 shows the results of example data tests;
fig. 9 shows the result of pedestrian re-recognition detection.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A pedestrian re-identification method based on multi-view image feature decomposition is carried out according to the following steps:
firstly, constructing a multi-view image feature decomposition neural network to realize the description of pedestrian images from different views (as shown in figure 1)
1. Training data set and testing data set for establishing multi-view image feature decomposition neural network
From the pedestrian image dataset, standard front, side, and back images are selected (as shown in fig. 2, where (a) is a front image of a pedestrian, (b) a side image, and (c) a back image), and the selected standard images are divided into a training data set and a testing data set. The training data set contains 1713 front, 1737 side, and 1722 back pedestrian images, 5172 in total; the testing data set contains 170 front, 256 side, and 124 back pedestrian images, 550 in total.
2. Data set preprocessing
The pedestrian image in the data set is scaled or cropped to an RGB image with a resolution of 192 x 64.
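As a concrete illustration, the preprocessing can be sketched as follows. PyTorch/torchvision and the file name are assumptions for illustration; the patent only fixes the 192 × 64 RGB target resolution.

```python
from PIL import Image
from torchvision import transforms

# Scale any pedestrian image to a 192 x 64 RGB tensor; cropping is an
# alternative the patent also allows (e.g. transforms.CenterCrop).
preprocess = transforms.Compose([
    transforms.Resize((192, 64)),   # (height, width)
    transforms.ToTensor(),          # -> float tensor of shape (3, 192, 64)
])

img = Image.open("pedestrian.jpg").convert("RGB")   # file name is illustrative
x = preprocess(img)
```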
3. Constructing the multi-view image feature decomposition neural network by improving the structure of the capsule classification network CapsuleNet
The first layer of the standard capsule network CapsuleNet is a convolution layer with a 7 × 7 convolution kernel; the second layer is a primary capsule layer with a 7 × 7 convolution kernel and 8 output channels; the third layer is a coding capsule layer with 10 output channels, 3 routing iterations, and output dimension B × 10 × 16, where B is the hyperparameter batch size, as shown in fig. 3.
The first layer of the improved capsule network CapsuleNet is a convolution layer with convolution kernel 5 × 5, 32 output channels, and input dimension B × 3 × 192 × 64, where B is the hyperparameter batch size; the second layer is a convolution layer with convolution kernel 5 × 5 and 256 output channels; the third layer is a primary capsule layer with convolution kernel 5 × 5 and 8 output channels; the fourth layer is a coding capsule layer with 3 output channels, 2 routing iterations, and output dimension B × 3 × 64, as shown in fig. 4.
The first two layers of the improved capsule network CapsuleNet are feature extraction layers and can be replaced by other feature extraction networks such as a single-layer neural network, ResNet, or HourglassNet. Experimental results show that a more complex feature extraction network can improve the pedestrian re-identification result by 1% to 3%.
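To make the four-layer structure concrete, the following is a minimal PyTorch sketch of the improved network under the parameters listed above (5 × 5 kernels, 32 and 256 channels, 8-dimensional primary capsules, three 64-dimensional coding capsules, 2 routing iterations). The strides, the squash non-linearity, and the routing details are assumptions, since the patent does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim=-1):
    # Capsule squashing non-linearity (assumed; the patent does not state it).
    n2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (n2 / (1.0 + n2)) * s / torch.sqrt(n2 + 1e-8)

class ImprovedCapsuleNet(nn.Module):
    """Two conv layers, a primary capsule layer, and a coding capsule layer
    with three 64-dimensional output capsules (one per standard view)."""
    def __init__(self, num_views=3, out_dim=64, prim_dim=8, routing_iters=2,
                 num_primary=3360):  # 3360 primary capsules for a 192 x 64 input
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=5, stride=2)       # layer 1 (stride assumed)
        self.conv2 = nn.Conv2d(32, 256, kernel_size=5, stride=2)     # layer 2
        self.primary = nn.Conv2d(256, 256, kernel_size=5, stride=2)  # layer 3: 8-d capsules
        # Layer 4 routing weights: one (out_dim x prim_dim) matrix per
        # (primary capsule, output capsule) pair.
        self.W = nn.Parameter(
            0.01 * torch.randn(1, num_primary, num_views, out_dim, prim_dim))
        self.prim_dim, self.routing_iters = prim_dim, routing_iters

    def forward(self, x):                       # x: (B, 3, 192, 64)
        h = F.relu(self.conv1(x))
        h = F.relu(self.conv2(h))
        u = self.primary(h)                     # (B, 256, 21, 5)
        u = squash(u.view(x.size(0), -1, self.prim_dim))             # (B, 3360, 8)
        u_hat = (self.W @ u.unsqueeze(2).unsqueeze(-1)).squeeze(-1)  # (B, 3360, 3, 64)
        b = torch.zeros(u_hat.shape[:3], device=x.device)            # routing logits
        for _ in range(self.routing_iters):     # dynamic routing, 2 iterations
            c = F.softmax(b, dim=2)             # coupling coefficients
            v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))         # (B, 3, 64)
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)
        return v                                # one 64-d feature capsule per view
```

The length of each output capsule feeds the Softmax in the next step, and the capsule vector itself serves as the 64-dimensional image feature at that view.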
4. Training multi-view image feature decomposition neural network
Training is carried out on the standard-view pedestrian image training set for 10 epochs; at the end of training the multi-view image feature decomposition neural network reaches an accuracy of 98% with an overall loss value of 0.016.
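A training step might then look as follows. The classification loss is an assumption (the standard CapsuleNet margin loss); the patent reports only the final accuracy and loss value. `train_loader` is an illustrative placeholder yielding image batches and view labels, and `ImprovedCapsuleNet` is the sketch above.

```python
import torch
import torch.nn.functional as F

def margin_loss(v, labels, m_plus=0.9, m_minus=0.1, lam=0.5):
    # Standard CapsNet margin loss on capsule lengths (an assumption; the
    # patent reports only the final loss value, not the loss form).
    lengths = v.norm(dim=-1)                               # (B, 3)
    t = F.one_hot(labels, num_classes=lengths.size(1)).float()
    loss = (t * F.relu(m_plus - lengths) ** 2
            + lam * (1.0 - t) * F.relu(lengths - m_minus) ** 2)
    return loss.sum(dim=1).mean()

model = ImprovedCapsuleNet()                               # from the sketch above
opt = torch.optim.Adam(model.parameters())                 # optimizer assumed
for epoch in range(10):                                    # 10 epochs as in this embodiment
    for x, labels in train_loader:                         # labels: 0 front, 1 side, 2 back
        loss = margin_loss(model(x), labels)
        opt.zero_grad(); loss.backward(); opt.step()
```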
5. Obtaining the pedestrian's multi-view image features and the corresponding similarities
The preprocessed image is input into the multi-view image feature decomposition neural network, which decomposes it into image feature values at the different views. Applying a Softmax over the feature values of the same pedestrian image at the different views yields the corresponding similarities P_i1, P_i2, P_i3, where i denotes the i-th pedestrian image, as shown in fig. 1.
The view similarity calculation formula is:
$$P_{ij} = \frac{e^{\|V_{ij}\|}}{\sum_{k=1}^{3} e^{\|V_{ik}\|}}$$
where ‖V_{ij}‖ is the modulus of the feature vector of the i-th pedestrian image at view j.
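In code, this Softmax over the three capsule lengths takes a few lines (a sketch; `v` denotes the (B, 3, 64) output of the decomposition network above):

```python
import torch
import torch.nn.functional as F

def view_similarities(v):
    # v: (B, 3, 64) coding-capsule outputs of the decomposition network.
    lengths = v.norm(dim=-1)           # ||V_ij|| for j = 1, 2, 3
    return F.softmax(lengths, dim=1)   # P_i1, P_i2, P_i3, summing to 1 per image
```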
secondly, determining a characteristic transformation matrix under different visual angles and calculating transformation coefficients
1. Determining a characteristic transformation matrix by using a BP neural network, and solving the characteristic transformation matrix by adopting the following steps:
step 1, establishing a data set, wherein the data set comprises feature pairs { V } of the same pedestrian in different visual anglesi,UiFeature ViAnd UiHave the same dimension D, wherein Vi,UiRepresenting the pedestrian feature coding under a certain view angle, in the embodiment, D is selected to be 64, and dimension D may also be selected from other values: 128, 256, etc.;
step 2, schematically showing according to fig. 5, establishing and training a feature transformation network, wherein the feature transformation network is a two-layer BP neural network, and the data of a network input layer is Vi Input layer dimension 1 × D, network output layer data is Ui', output layer dimension 1 × D, loss function: loss (Loss) (x) 1-cos (U)i’,Ui) Wherein U isi' denotes a network switching feature, UiRepresenting target features
Step 3, after the feature transformation network is trained, the loss function is reduced to be below 0.07, and transformation matrixes W and U are extracted from the feature transformation networki’=W*ViW is a matrix of D × D, in this embodiment W is a matrix of 64 × 64;
and 4, repeating the step 2 and the step 3, and calculating all the image characteristic conversion matrixes of the non-visual angles once to obtain all the cross-visual angle characteristic conversion matrixes.
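A minimal sketch of one such conversion network follows. Since U_i′ = W · V_i, the two-layer BP network (input and output layers of dimension D) amounts to a single bias-free linear map; the optimizer and learning rate are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D = 64                                    # feature dimension used in this embodiment
net = nn.Linear(D, D, bias=False)         # U' = W * V, so the map is a bias-free D x D matrix
opt = torch.optim.Adam(net.parameters(), lr=1e-3)   # optimizer and lr are assumptions

def train_step(V, U):
    """V, U: (batch, D) same-pedestrian feature pairs from two views."""
    U_pred = net(V)
    loss = (1.0 - F.cosine_similarity(U_pred, U, dim=1)).mean()  # Loss = 1 - cos(U', U)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Train until the loss falls below 0.07, then extract the conversion matrix:
W = net.weight.detach().clone()           # (D, D); U_i' = W @ V_i
```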
2. Computing the feature conversion-coefficient matrix
Features are converted through the cross-view feature conversion network. Because the conversion introduces a certain error (the loss function of the conversion network does not reach zero), the result of a cross-view image feature measurement must be multiplied by a conversion coefficient Alpha, which serves to reduce the effect of the conversion error.
The loss function value of a cross-view pedestrian feature conversion network represents the degree of difference after converting features between the two views. With t denoting the average loss function value, the conversion coefficient is taken as
[conversion-coefficient formula, given as an image in the original]
the transformation coefficient matrixes formed by the transformation coefficients of different visual angles are as follows:
Figure RE-GDA0001949063680000092
wherein A is12Coefficient of transformation, A, representing the frontal to lateral characteristics of a pedestrian13Coefficient of conversion, A, representing the front to back features of a pedestrian21Representing the conversion coefficient from the lateral feature to the frontal feature of the pedestrian, A23Coefficient of transformation, A, representing the lateral to dorsal aspect of a pedestrian31Representing the conversion coefficient of the pedestrian's back features to front features, A32Representing the conversion coefficient of the pedestrian back feature to the side feature.
In this embodiment, the conversion-coefficient matrix is:
[numerical 3 × 3 matrix, given as an image in the original]
thirdly, selecting a pedestrian image similarity measurement function to realize pedestrian re-identification
The cos function is selected to measure the pedestrian feature distance, the distance measure being as follows:
f(X,Y)=Alpha×cos(X,Y)
The comparison mode is chosen according to the multi-view feature similarities of the two pedestrian images (as shown in figs. 6 and 7). If the maximum-similarity features are at the same view, direct comparison is selected and the pedestrian feature distance is L = f(X_i, Y_j); if they are not at the same view, feature conversion followed by comparison is selected and the distance is L = f(W · X_i, Y_j), where W is the conversion matrix and Alpha the conversion coefficient.
During re-identification, each pair of pedestrian images (X, Y) is selected from the searched pedestrian image set and the value of the pedestrian feature distance function is computed; the distance measurements over the searched images are sorted in descending order, and the pedestrian images at the front of the ranking are the re-identification result.
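Putting the pieces together, the comparison and ranking logic can be sketched as follows; all names are illustrative, with `W` holding the six cross-view conversion matrices and `A` the conversion-coefficient matrix.

```python
import torch
import torch.nn.functional as F

def pair_score(Sx, vx, Sy, vy, W, A):
    """Similarity score between pedestrian images X and Y.
    Sx, Sy: (3,) view similarities; vx, vy: (3, 64) per-view features;
    W[(i, j)]: 64 x 64 conversion matrix from view i to view j;
    A: 3 x 3 conversion-coefficient matrix (all names are illustrative)."""
    i, j = int(Sx.argmax()), int(Sy.argmax())    # most probable view of each image
    if i == j:                                   # same view: direct comparison
        return float(F.cosine_similarity(vx[i], vy[j], dim=0))
    x_conv = W[(i, j)] @ vx[i]                   # cross view: convert X's feature to view j
    return float(A[i][j] * F.cosine_similarity(x_conv, vy[j], dim=0))

def reidentify(query, gallery, W, A):
    # Rank gallery images by descending similarity to the query image;
    # query and each gallery entry are (S, v) pairs from the decomposition network.
    Sq, vq = query
    scores = [pair_score(Sq, vq, Sy, vy, W, A) for (Sy, vy) in gallery]
    return sorted(range(len(gallery)), key=lambda k: scores[k], reverse=True)
```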
Fourth, determining the sample size of experiment
In the embodiment, 28193 pedestrian image samples are selected, wherein 27145 training samples and 1048 testing samples are selected.
Fifth, the pedestrian re-identification experiment effect
On the data set selected in this embodiment, four feature dimensions D are tested: 16, 32, 48, and 64. Rank-k (Rank1, Rank5, Rank10, Rank15, Rank20) indicates that a correct match for the query appears within the top k returned images. The test results show that with 64-dimensional pedestrian image features, the one-shot hit rate Rank1 exceeds 88%, the top-five hit rate Rank5 exceeds 95%, and the top-twenty hit rate Rank20 exceeds 98%. The pedestrian re-identification detection results are shown in fig. 9 and the data test results in fig. 8.
The method regards the generation of multi-view image features as a multi-classification problem. In a multi-classification problem, the final Softmax normalization layer of the network produces a probability for each class, and this probability expresses the similarity between the input and that class. For example, if the Softmax-normalized output assigns probability 50% to class 1, 49% to class 2, and 1% to class 3, the classification result indicates 50% similarity to class 1, 49% to class 2, and only 1% to class 3.
The invention realizes the three-view image feature decomposition neural network by means of the CapsuleNet multi-classification network. First, view classification of the pedestrian image is performed with the CapsuleNet multi-classification network; the three probability values in the classification result are the similarities to the three standard views. For example, a 60% probability for the front view indicates that the pedestrian image is 60% similar to a standard front image, i.e. 60% of the image carries frontal pedestrian features. Second, because the CapsuleNet multi-classification network produces the feature vector corresponding to each class, the method generates both the similarities to the three standard views and the image features at those views, and judges image similarity from the two together.
In this embodiment, the Cosine distance is the basic criterion for judging pedestrian image similarity. If the multi-view classification results of two pedestrian images agree, that is, the maximum-probability views are the same, the similarity is computed as the Cosine of the corresponding image features at that view. If the classification results differ, the image feature at the maximum-probability view of one image is first converted across views and the Cosine is then computed; because this conversion carries a certain error, the conversion result is multiplied by the conversion coefficient Alpha to reduce the error.
In the embodiment, the pedestrian image feature cross-view angle conversion is calculated in a form of conversion matrix multiplication, and the pedestrian image feature cross-view angle is obtained by multiplying the view angle conversion matrix and the original view angle pedestrian image feature vector. The view transformation matrix is obtained by extracting coefficients in a view transformation neural network, and in the embodiment, there are 3 standard views, and 6 view transformation matrices are required in total.
Although only the preferred embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art, and all changes are encompassed in the scope of the present invention.

Claims (8)

1. A pedestrian re-identification method based on multi-view image feature decomposition is characterized by comprising the following steps:
s1, selecting a standard image in the pedestrian image dataset, dividing the selected standard image into a training dataset and a testing dataset, and establishing a training dataset and a testing dataset of the multi-view image feature decomposition neural network;
s2, zooming or cutting the pedestrian image in the data set into an RGB image with the resolution of 192 multiplied by 64 to be used as a training set and a testing set;
s3, designing a capsule classification network Capsule eNet into two convolution layers and two capsule layers to obtain an improved capsule network;
s4, training the improved capsule classification network in S3 on a pedestrian image training set at a standard visual angle in S2, constructing a multi-visual-angle image feature decomposition neural network, and generating the similarity and image feature vectors of any pedestrian image at three standard visual angles;
s5, generating a cross-view characteristic transformation matrix, and calculating a cross-view characteristic transformation coefficient matrix;
s6, selecting a pedestrian image similarity measurement function to realize pedestrian re-identification;
the implementation of pedestrian re-identification in S6 specifically includes the following steps:
(1) selecting the cos function to measure the pedestrian feature distance: f(X, Y) = Alpha × cos(X, Y), where Alpha is the conversion coefficient;
(2) for same-view comparison, the pedestrian feature distance is L = f(X_i, Y_j), where X_i and Y_j are the maximum-probability multi-view decomposed features of the two pedestrian images;
(3) for cross-view comparison, the pedestrian feature distance is L = f(W · X_i, Y_j), where X_i and Y_j are the maximum-probability multi-view decomposed features and W is the conversion matrix;
(4) sorting the distance measurements over the searched pedestrian images in descending order; the pedestrian images at the front of the ranking are the re-identification result.
2. The pedestrian re-identification method based on multi-view image feature decomposition according to claim 1, wherein: the standard images in S1 are front images, side images, and back images with standard shooting angles.
3. The pedestrian re-identification method based on multi-view image feature decomposition according to claim 1, wherein: the improved capsule network in the S3 is a three-classification network.
4. The pedestrian re-identification method based on multi-view image feature decomposition according to claim 1, wherein: the first layer of the improved capsule network CapsuleNet in S3 is a convolution layer with convolution kernel 5 × 5, 32 output channels, and input dimension B × 3 × 192 × 64, where B is the hyperparameter batch size; the second layer is a convolution layer with convolution kernel 5 × 5 and 256 output channels; the third layer is a primary capsule layer with convolution kernel 5 × 5 and 8 output channels; the fourth layer is a coding capsule layer with 3 output channels, 2 routing iterations, and output dimension B × 3 × 64.
5. The pedestrian re-identification method based on multi-view image feature decomposition according to claim 1 or 4, wherein: the convolutional layer may be replaced with a single layer neural network, ResNet, or HourglassNet.
6. The pedestrian re-identification method based on multi-view image feature decomposition according to claim 1, wherein the similarity calculation formula in S4 is as follows:
$$S_{ij} = \frac{e^{\|V_{ij}\|}}{\sum_{k=1}^{3} e^{\|V_{ik}\|}}$$
where e is the natural constant, ‖·‖ denotes the length (modulus) of a feature vector, i indexes the pedestrian image, V_{ij} is the feature vector of the i-th pedestrian image at view j, and S_{ij} is the similarity of the i-th pedestrian at view j.
7. The pedestrian re-identification method based on multi-view image feature decomposition according to claim 1, wherein the step of generating the cross-view feature transformation matrix in S5 specifically comprises the following steps:
(1) establishing a data set containing feature pairs {V_i, U_i} of the same pedestrian at two different views, where V_i and U_i have the same dimension D and each represents the pedestrian feature code at one view;
(2) establishing and training a feature conversion network, a two-layer BP neural network whose input layer takes V_i (dimension 1 × D) and whose output layer produces U_i′ (dimension 1 × D), with loss function Loss = 1 − cos(U_i′, U_i), where U_i′ denotes the converted feature output by the network and U_i the pedestrian feature code at the target view;
(3) once training has reduced the loss below 0.07, extracting the conversion matrix W from the network, where U_i′ = W · V_i and W is a D × D matrix;
(4) repeating steps (2) and (3), computing the image feature conversion matrix once for each pair of different views, to obtain all cross-view feature conversion matrices.
8. The pedestrian re-identification method based on multi-view image feature decomposition according to claim 1, wherein: the cross-view feature conversion coefficient in S5 is calculated as
[conversion-coefficient formula, given as an image in the original]
where t is the average loss function value;
the conversion coefficients for the different view pairs form the conversion-coefficient matrix
$$A = \begin{pmatrix} 1 & A_{12} & A_{13} \\ A_{21} & 1 & A_{23} \\ A_{31} & A_{32} & 1 \end{pmatrix}$$
where A_{12} denotes the conversion coefficient from the pedestrian's frontal features to lateral features, A_{13} from frontal to dorsal features, A_{21} from lateral to frontal features, A_{23} from lateral to dorsal features, A_{31} from dorsal to frontal features, and A_{32} from dorsal to lateral features.
CN201811388865.4A 2018-11-21 2018-11-21 Pedestrian re-identification method based on multi-view image feature decomposition Active CN109543602B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811388865.4A CN109543602B (en) 2018-11-21 2018-11-21 Pedestrian re-identification method based on multi-view image feature decomposition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811388865.4A CN109543602B (en) 2018-11-21 2018-11-21 Pedestrian re-identification method based on multi-view image feature decomposition

Publications (2)

Publication Number Publication Date
CN109543602A CN109543602A (en) 2019-03-29
CN109543602B (en) 2020-08-14

Family

ID=65848994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811388865.4A Active CN109543602B (en) 2018-11-21 2018-11-21 Pedestrian re-identification method based on multi-view image feature decomposition

Country Status (1)

Country Link
CN (1) CN109543602B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110020624B (en) * 2019-04-08 2023-04-18 石家庄铁道大学 Image recognition method, terminal device and storage medium
CN110222589A (en) * 2019-05-16 2019-09-10 五邑大学 A kind of pedestrian recognition methods and its system, device, storage medium again
CN110263855B (en) * 2019-06-20 2021-12-14 深圳大学 Method for classifying images by utilizing common-basis capsule projection
CN110427756B (en) * 2019-06-20 2021-05-04 中国人民解放军战略支援部队信息工程大学 Capsule network-based android malicious software detection method and device
CN110765903A (en) * 2019-10-10 2020-02-07 浙江大华技术股份有限公司 Pedestrian re-identification method and device and storage medium
CN111104867B (en) * 2019-11-25 2023-08-25 北京迈格威科技有限公司 Recognition model training and vehicle re-recognition method and device based on part segmentation
CN111881716A (en) * 2020-06-05 2020-11-03 东北林业大学 Pedestrian re-identification method based on multi-view-angle generation countermeasure network
CN111667001B (en) * 2020-06-05 2023-08-04 平安科技(深圳)有限公司 Target re-identification method, device, computer equipment and storage medium
CN111860331A (en) * 2020-07-21 2020-10-30 北京北斗天巡科技有限公司 Unmanned aerial vehicle is at face identification system in unknown territory of security protection
CN112348038A (en) * 2020-11-30 2021-02-09 江苏海洋大学 Visual positioning method based on capsule network
CN112906557B (en) * 2021-02-08 2023-07-14 重庆兆光科技股份有限公司 Multi-granularity feature aggregation target re-identification method and system under multi-view angle
CN113298037B (en) * 2021-06-18 2022-06-03 重庆交通大学 Vehicle weight recognition method based on capsule network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778527A (en) * 2016-11-28 2017-05-31 中通服公众信息产业股份有限公司 A kind of improved neutral net pedestrian recognition methods again based on triple losses
CN108764308A (en) * 2018-05-16 2018-11-06 中国人民解放军陆军工程大学 A kind of recognition methods again of the pedestrian based on convolution loop network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8204292B2 (en) * 2008-05-21 2012-06-19 Riverain Medical Group, Llc Feature based neural network regression for feature suppression
JP6213950B2 (en) * 2013-06-03 2017-10-18 日本電信電話株式会社 Image processing apparatus, image processing method, and image processing program
CN106991396B (en) * 2017-04-01 2020-07-14 南京云创大数据科技股份有限公司 Target relay tracking algorithm based on intelligent street lamp partner

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106778527A (en) * 2016-11-28 2017-05-31 中通服公众信息产业股份有限公司 A kind of improved neutral net pedestrian recognition methods again based on triple losses
CN108764308A (en) * 2018-05-16 2018-11-06 中国人民解放军陆军工程大学 A kind of recognition methods again of the pedestrian based on convolution loop network

Also Published As

Publication number Publication date
CN109543602A (en) 2019-03-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Yang Xiaofeng

Inventor after: Li Haifang

Inventor after: Deng Hongxia

Inventor after: Yao Rong

Inventor after: Guo Hao

Inventor before: Li Haifang

Inventor before: Yang Xiaofeng

Inventor before: Deng Hongxia

Inventor before: Yao Rong

Inventor before: Guo Hao

GR01 Patent grant