CN110516533B - Pedestrian re-identification method based on depth measurement - Google Patents

Pedestrian re-identification method based on depth measurement

Info

Publication number
CN110516533B
Authority
CN
China
Prior art keywords
network
pedestrian
image
depth measurement
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910626883.XA
Other languages
Chinese (zh)
Other versions
CN110516533A (en)
Inventor
苗夺谦
王倩倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN201910626883.XA
Publication of CN110516533A
Application granted
Publication of CN110516533B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a pedestrian re-identification method based on depth measurement, which comprises the following steps: 1) training a ResNet-50 network on the ImageNet data set so that the ResNet-50 network has initial parameter values; 2) removing the softmax layer and the last fully connected layer of the ResNet-50 network; 3) forming a depth measurement network from a plurality of nonlinear fully connected layers, and adding a Euclidean distance calculation unit after its output; 4) connecting the depth measurement network after the adjusted ResNet-50 network to form the final network model of the invention; 5) randomly cropping the images in the pedestrian re-identification training data set to obtain a training data set of 224 × 224 images, randomly selecting P different pedestrians from it, and randomly selecting K images for each pedestrian to form a small training batch; 6) optimizing the network of 4) by minimizing the Hard Triplet Loss function on the training data obtained in 5), and repeating this step until the loss value converges; 7) inputting the pedestrian image to be identified and the images in the candidate library into the optimized model to obtain the feature vectors of the pedestrian images in the same feature space; 8) calculating the Euclidean distances between the feature vectors and sorting them, finally obtaining the matching rate between the pedestrian image to be identified and the comparison images.

Description

Pedestrian re-identification method based on depth measurement
Technical Field
The invention relates to the field of intelligent analysis of surveillance videos, in particular to a pedestrian re-identification method based on depth measurement.
Background
Pedestrian re-identification refers to the problem of matching pedestrians across different camera views in a system consisting of multiple cameras, and touches on many research hotspots such as feature selection, saliency extraction, distance metric learning, and deep learning. Pedestrian re-identification technology provides key support for analyzing pedestrian identity, tracking, and related tasks, and has become a key component of intelligent video surveillance.
The main methods in the pedestrian re-identification field can be divided into the following two categories: 1) methods based on feature representation; 2) methods based on distance metric learning.
The former aims to design or learn features that are robust to changes in illumination, viewing angle, and so on. This type of approach typically combines multiple low-level visual features, usually color features (color space, histogram, dominant color, etc.) and texture features (LBP, Gabor, co-occurrence matrix, etc.). Examples include symmetry-based cumulative feature descriptors, covariance descriptors, horizontal-stripe partition descriptors, pyramid matching descriptors, pattern matching, saliency matching, deep learning models, and so forth. Such methods alleviate the problems of illumination and viewing angle to some extent, but they can extract only low-level visual information and their feature extraction rules are fixed, so the robustness and adaptability of the features are limited.
The latter focuses on designing a similarity metric model suitable for pedestrian re-identification. Existing distance metric models are mainly divided into non-learning methods and learning methods. First-order distance, second-order distance, Bhattacharyya distance, etc. are non-learning methods, which are generally mathematically simple. However, because the extracted pedestrian features suffer from problems such as redundancy and poor robustness, their recognition results are not ideal. Learning-based metric methods generally learn discriminative information about the appearance features of the same pedestrian and of different pedestrians under different cameras, and optimize the differences and similarities between samples, so their recognition performance is relatively good. These methods mainly include RankSVM, relative distance comparison, kernel-based metric learning, Mahalanobis distance learning, deep metric learning, metric ensembles, and the like.
In general, the above methods divide the pedestrian re-identification process into two steps, feature representation and distance metric, and then optimize each step separately. This separates the feature representation from the metric, whereas in practice the effectiveness of the distance metric is closely related to the feature representation and the two cannot be fully decoupled.
Chinese patent application CN108171184A proposes a pedestrian re-identification method based on a Siamese network, which uses two identical ResNet-50 networks to form a Siamese network and optimizes the network with paired training data. Although this method uses a convolutional neural network to learn image features automatically, it requires paired inputs during training, and the training time is too long. Further, owing to factors such as illumination change, posture, viewing angle, occlusion, and image resolution, pedestrian re-identification performance in the intelligent analysis of surveillance video is still poor.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a pedestrian re-identification method based on depth measurement.
The aim of the invention can be achieved by the following technical scheme:
a pedestrian re-identification method based on depth measurement comprises the following steps:
1. Constructing the network
1) Pre-training the ResNet-50 network:
training a ResNet-50 network with the ImageNet data set as the training data set, so that the ResNet-50 network has initial parameter values;
2) Adjusting the ResNet-50 network of step 1) by removing its softmax layer and last fully connected layer; the result is provided to step 4);
3) Forming a depth measurement network from a plurality of nonlinear fully connected layers, adding a Euclidean distance calculation unit after the output of the depth measurement network, and randomly initializing the network parameters; the result is provided to step 4);
4) Constructing the pedestrian re-identification network model:
connecting the depth measurement network of step 3) after the ResNet-50 network adjusted in step 2) to form the final network model of the invention;
2. Training
5) Preprocessing the pedestrian re-identification training data set: randomly cropping the images in the training data set to obtain a training data set of 224 × 224 images, randomly selecting P different pedestrians from it, and randomly selecting K images for each pedestrian to form a small training batch;
6) Training the network model:
optimizing the network model constructed in step 4) by minimizing the Hard Triplet Loss function on the training data obtained in step 5), and repeating this step until the loss value converges;
3. Identification
7) Re-identifying pedestrians: inputting the image of the pedestrian to be identified and the images in the candidate library into the network model optimized in step 6), and obtaining their feature vectors in the same feature space;
8) Calculating the similarity between the image to be identified and every image in the candidate library, that is, the Euclidean distance between the feature vector of the image to be identified and that of each candidate library image, the feature vectors being those obtained in step 7). The images in the candidate library are then sorted by distance from small to large; the higher an image ranks, the more similar it is to the image to be identified. Here, similar means that the two images show the same pedestrian. The first-ranked image is taken to show the same pedestrian as the image to be identified.
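As an illustration of the batch construction in step 5), the following is a minimal Python sketch rather than the implementation of the invention. The function name `sample_pk_batch`, the `labels` list, and the handling of pedestrians with fewer than K images (sampling with replacement) are assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_pk_batch(labels, P, K, rng=random):
    """Return image indices for a batch of P pedestrians with K images each.

    labels[i] is the pedestrian ID of image i. Pedestrians with fewer than
    K images are sampled with replacement (an assumption; the source does
    not say how this case is handled).
    """
    by_id = defaultdict(list)
    for idx, pid in enumerate(labels):
        by_id[pid].append(idx)

    chosen_ids = rng.sample(sorted(by_id), P)  # P distinct pedestrians
    batch = []
    for pid in chosen_ids:
        pool = by_id[pid]
        if len(pool) >= K:
            batch.extend(rng.sample(pool, K))   # K distinct images
        else:
            batch.extend(rng.choices(pool, k=K))
    return batch


# Toy data set: 10 pedestrians with 5 images each.
toy_labels = [i // 5 for i in range(50)]
toy_batch = sample_pk_batch(toy_labels, P=4, K=3, rng=random.Random(0))
```

Each batch therefore contains P × K images, which is what lets the loss pick, for every image, both a same-pedestrian and a different-pedestrian partner inside the batch.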
Further, in step 1), the pre-training of the ResNet-50 network is optimized using dropout or Batch Normalization, so that the ResNet-50 network acquires image feature extraction capability.
Further, adjusting the ResNet-50 network in step 2) means deleting the softmax layer and the last fully connected layer of the ResNet-50 network, so that the final output is a 2048-dimensional vector.
Further, regarding the depth measurement network of step 3):
Step 3) is a key innovation of the present invention. The depth measurement network module takes the 2048-dimensional feature vector as input and outputs a feature vector in Euclidean space after nonlinear projection. The depth measurement network structure is specifically as follows:
A Euclidean distance calculation layer is added after a neural network consisting of M nonlinear fully connected layers. The input of the first fully connected layer has dimension 2048, the parameters of each layer are randomly initialized, and the calculation formula is as follows:
[Formula image BDA0002126210050000031: random initialization of the layer weights]
where $1 \le m \le M$, $r^{(m)}$ is the depth of the m-th layer with $r^{(0)} = 2048$, $W^{(m)}$ is the weight of the m-th layer, and the bias $b^{(m)}$ of each layer is initialized to the zero vector. M is the total number of fully connected layers in the depth measurement network and is a hyperparameter.
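The initialization formula is available only as an image in the source, so the sketch below assumes a Xavier-style uniform range for the weights purely for illustration. Only the zero-vector biases and the layer depths $r^{(0)}, \dots, r^{(M)}$ come from the text; the function name `init_metric_network` is hypothetical.

```python
import random


def init_metric_network(layer_dims, rng=None):
    """Create (W, b) for each fully connected layer.

    layer_dims = [r0, r1, ..., rM], e.g. [2048, 512, 128].
    W is stored as a list of rows with shape r_m x r_(m-1).
    """
    rng = rng or random.Random()
    params = []
    for m in range(1, len(layer_dims)):
        fan_in, fan_out = layer_dims[m - 1], layer_dims[m]
        # ASSUMPTION: Xavier-style uniform bound; the patent's own formula
        # is not recoverable from this text.
        bound = (6.0 / (fan_in + fan_out)) ** 0.5
        W = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
             for _ in range(fan_out)]
        b = [0.0] * fan_out  # biases start as the zero vector, as stated
        params.append((W, b))
    return params


# Small stand-in dimensions keep the example fast.
toy_params = init_metric_network([8, 4, 2], rng=random.Random(0))
```

The number of layers M is simply `len(layer_dims) - 1`, so treating M as a hyperparameter amounts to changing the length of that list.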
Further, the step 4) of constructing a pedestrian re-identification network model specifically includes:
connecting the ResNet-50 network adjusted in the step 2) with the depth measurement network obtained in the step 3), namely inputting the output of the ResNet-50 network into the depth measurement network, and constructing the pedestrian re-identification network model.
Further, training the network model in step 6) specifically means that, from the new training set generated in step 5), P different pedestrians are randomly selected and K images are randomly selected for each pedestrian to form a small training batch, which is input into the network for training. The loss function is Hard Triplet Loss, calculated as follows:
61) Acquire the feature of each sample in the training batch extracted by the ResNet-50 network, $r(x_a^i)$ ($1 \le i \le P$, $1 \le a \le K$), where $x_a^i$ denotes the a-th image of the i-th pedestrian in the training batch and $r(\cdot)$ denotes the output of the ResNet-50 network.
62) Acquire the output of each feature vector $r(x_a^i)$ through the depth measurement network, specifically calculated as follows:
[Formula images BDA0002126210050000044 to BDA0002126210050000046: layer-by-layer computation of the depth measurement network]
where $1 \le m \le M$, $h^{(m)}$ is the output of the m-th layer in the depth measurement network, $\varphi(\cdot)$ is the nonlinear activation function, and $f(\cdot)$ is the nonlinear mapping function parameterized by the depth measurement network. $b^{(m)}$ is the bias vector of the m-th layer and $W^{(m)}$ is the weight of the m-th layer in the depth measurement network. $r^{(m)}$ is the depth of the m-th layer of the network, with $r^{(0)} = 2048$. $\mathbb{R}^{r^{(m)}}$ denotes the set of vectors of length $r^{(m)}$ whose elements are all real values, $\mathbb{R}$ being the set of real numbers.
63) Calculate the loss function value:
[Formula images BDA0002126210050000049, BDA00021262100500000414, BDA00021262100500000410: Hard Triplet Loss]
where $x_a^i$ denotes the a-th image of the i-th pedestrian in the training batch, $r(\cdot)$ denotes the output of the ResNet-50 network, and P and K are respectively the number of different pedestrians in the batch and the number of images of each pedestrian. X denotes the input batch, $\sigma$ is the threshold, $\theta$ denotes the parameters of the network, $\varphi(\cdot)$ is the nonlinear activation function, and $f(\cdot)$ is the nonlinear mapping function parameterized by the depth measurement network. $d_f(p_1, p_2)$ denotes the depth measurement distance between $p_1$ and $p_2$, where $p_1$ and $p_2$ are both vectors.
Then, stochastic gradient descent is used to minimize the loss function, thereby updating and optimizing the corresponding parameters.
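The loss formulas are available only as images in the source. A minimal sketch, assuming the commonly used batch-hard form of the triplet loss (for each anchor image, the farthest same-pedestrian sample and the nearest different-pedestrian sample in the batch, with the threshold σ acting as the margin), might look as follows. The function names and the choice to sum rather than average over anchors are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def batch_hard_triplet_loss(embeddings, labels, margin):
    """Sum over anchors of max(0, margin + hardest_pos - hardest_neg).

    embeddings: outputs f(r(x)) for every image in the batch.
    labels:     pedestrian ID of every image in the batch.
    margin:     the threshold sigma (ASSUMED to play the role of a margin).
    """
    total = 0.0
    for a, (anchor, pid) in enumerate(zip(embeddings, labels)):
        pos = [euclidean(anchor, e)
               for i, (e, l) in enumerate(zip(embeddings, labels))
               if l == pid and i != a]
        neg = [euclidean(anchor, e)
               for e, l in zip(embeddings, labels) if l != pid]
        if not pos or not neg:
            continue  # anchor has no valid triplet in this batch
        total += max(0.0, margin + max(pos) - min(neg))
    return total


# Two pedestrians, two images each, already well separated.
toy_emb = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
toy_ids = [0, 0, 1, 1]
loss_small_margin = batch_hard_triplet_loss(toy_emb, toy_ids, margin=1.0)
loss_large_margin = batch_hard_triplet_loss(toy_emb, toy_ids, margin=10.0)
```

With the small margin every triplet is already satisfied and contributes nothing; with the large margin each of the four anchors contributes, which is the signal a gradient step would then reduce.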
Further, re-identifying the pedestrian in step 7) specifically means inputting the image of the pedestrian to be identified and the images in the candidate library into the network to obtain the output $f(r(x))$ for each image x, where x denotes any image among the images to be identified and the candidate library.
Further, in step 8), the distance between the pedestrian image to be identified and a comparison image is:
$d_f(r(x), r(y)) = d(f(r(x)), f(r(y))) = \| f(r(x)) - f(r(y)) \|_2$
where x denotes any image to be identified and y denotes any image in the candidate library. $r(\cdot)$ denotes the output of the ResNet-50 network, and $f(\cdot)$ is the nonlinear mapping function parameterized by the depth measurement network. $d_f(p_1, p_2)$ denotes the depth measurement distance between $p_1$ and $p_2$, where $p_1$ and $p_2$ are both vectors. $r(x)$ and $r(y)$ are the feature vectors of the image to be identified and the comparison image respectively, and $f(r(x))$ and $f(r(y))$ are their feature vectors in the same feature space obtained through the nonlinear mapping of the depth measurement network. $d_f(r(x), r(y))$ denotes the depth measurement distance between the image x to be identified and any image y in the candidate library.
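The distance and sorting of step 8) can be sketched as follows, assuming the mapped feature vectors $f(r(x))$ and $f(r(y))$ are already available as plain lists of floats. The function name `rank_gallery` is hypothetical.

```python
import math


def rank_gallery(query_vec, gallery_vecs):
    """Sort candidate-library images by Euclidean distance to the query.

    Returns (order, dists): order[0] is the index of the closest candidate,
    and dists[i] is the distance from the query to candidate i.
    """
    dists = [
        math.sqrt(sum((q - g) ** 2 for q, g in zip(query_vec, vec)))
        for vec in gallery_vecs
    ]
    order = sorted(range(len(gallery_vecs)), key=lambda i: dists[i])
    return order, dists


toy_order, toy_dists = rank_gallery([0.0, 0.0],
                                    [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]])
```

In this toy call the second candidate is closest to the query, so it would be reported as the rank-1 match.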
According to this technical scheme, feature extraction and metric learning are integrated in a unified framework, so that they can be optimized under a unified objective, improving the accuracy of pedestrian re-identification.
Compared with the prior art, the invention has the following advantages:
1. By taking a network model trained on a large-scale image database and fine-tuning it on a pedestrian re-identification database, image features can be learned automatically by the network model, without complex preprocessing operations during feature extraction.
2. A multi-layer nonlinear feedforward neural network is used to learn a latent nonlinear mapping function that maps the image features extracted by ResNet-50 into a low-dimensional feature space, and the Euclidean distance between the mapped features in that space serves as the similarity measure between images. Compared with the traditional Mahalanobis distance, the depth metric can capture nonlinear relationships between data points.
3. Feature extraction and metric learning are fused in one framework and optimized under a unified objective, so that the extracted features are better suited to the re-identification problem.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of the system structure of the present invention.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples.
Examples:
in order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the following examples, which are specifically illustrated in the flowcharts and block diagrams shown in fig. 1 and 2. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Step one: pre-training the ResNet-50 network, taking the ImageNet data set as a training data set, and training one ResNet-50 network to effectively initialize the parameters of the ResNet-50 network; effective initialization refers to the ability to learn certain image features;
Step two: trimming the ResNet-50 network by deleting the softmax layer and the last fully connected layer of the ResNet-50 network, so that the output of the trimmed network is a 2048-dimensional vector;
Step three: constructing the depth measurement network. Two nonlinear fully connected layers are connected to form the depth measurement network, and a Euclidean distance calculation unit is added after the output. The depths of the two fully connected layers are 512 and 128 respectively, the activation function is the tanh function, and the network parameters are randomly initialized according to the following formula:
[Formula image BDA0002126210050000061: random initialization of the layer weights]
where $1 \le m \le 2$, $r^{(0)} = 2048$, $r^{(1)} = 512$, $r^{(2)} = 128$, and the biases of the two layers are initialized to zero vectors.
Step four: the pedestrian re-identification network model is constructed, and specifically comprises the following steps:
connecting the ResNet-50 network adjusted in the step 2) with the depth measurement network obtained in the step 3), namely inputting the output of the ResNet-50 network into the depth measurement network, and constructing the pedestrian re-identification network model.
Step five: preprocessing the pedestrian re-identification training data set. All images in the training data set are randomly cropped to obtain a training data set with a uniform image size of 224 × 224, the order of the cropped training data set is shuffled, P = 25 pedestrians are randomly selected from it, and K = 4 images are randomly selected for each pedestrian to form a small training batch;
Step six: training the pedestrian re-identification network model. The Hard Triplet Loss function is evaluated on the training data obtained in step five, the network parameters are updated by stochastic gradient descent, and this step is repeated until the loss function converges. The specific calculation is as follows:
First, the feature of each sample in the training batch extracted by the ResNet-50 network, $r(x_a^i)$ ($1 \le i \le P$, $1 \le a \le K$), is obtained, where $x_a^i$ denotes the a-th image of the i-th pedestrian in the training batch and $r(\cdot)$ denotes the output of the ResNet-50 network. Then the output of each feature vector $r(x_a^i)$ through the depth measurement network is obtained, specifically calculated as follows:
[Formula images BDA0002126210050000071 to BDA0002126210050000073: layer-by-layer computation of the depth measurement network]
where $1 \le m \le M$, $h^{(m)}$ is the output of the m-th layer in the depth measurement network, $\varphi(\cdot)$ is the nonlinear activation function, and $f(\cdot)$ is the nonlinear mapping function parameterized by the depth measurement network. Finally, the loss function value is calculated:
[Formula images BDA0002126210050000075 to BDA0002126210050000077: Hard Triplet Loss]
where $x_a^i$ denotes the a-th image of the i-th pedestrian in the training batch, $r(\cdot)$ denotes the output of the ResNet-50 network, and P and K are respectively the number of different pedestrians in the batch and the number of images of each pedestrian. X denotes the input batch, $\sigma$ is the threshold, $\theta$ denotes the parameters of the network, $\varphi(\cdot)$ is the nonlinear activation function, and $f(\cdot)$ is the nonlinear mapping function parameterized by the depth measurement network. $d(f(p_1), f(p_2))$ denotes the Euclidean distance between $f(p_1)$ and $f(p_2)$, and $d_f(p_1, p_2)$ denotes the depth measurement distance between $p_1$ and $p_2$, where $p_1$ and $p_2$ are both vectors.
Then, stochastic gradient descent is used to minimize the loss function, thereby updating and optimizing the corresponding parameters.
Step seven: re-identifying pedestrians. The images to be identified and the images in the candidate library are input into the trained network, and the output of the last fully connected layer of the depth measurement network is extracted as the feature vector of each pedestrian image in the same feature space.
Step eight: calculating the Euclidean distance between the feature vector of the pedestrian image to be identified and that of each candidate library image, and sorting the distances. Higher-ranked images are those of the same type as the image to be identified, where the same type means images of the same pedestrian.
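The mapping used in steps three and seven (fully connected layers with tanh activation, whose last-layer output serves as the feature vector) can be sketched as below. This is an illustrative forward pass written from the textual definitions of $h^{(m)}$, $W^{(m)}$ and $b^{(m)}$, not the patent's own formula; the name `metric_forward` and the tiny 3 → 2 → 1 layer sizes standing in for 2048 → 512 → 128 are assumptions.

```python
import math


def metric_forward(params, feature, activation=math.tanh):
    """Pass a backbone feature through the fully connected layers.

    params:  list of (W, b) pairs, with W given as a list of rows.
    feature: the backbone output r(x) as a list of floats.
    Returns the last layer's output, used as the embedding f(r(x)).
    """
    h = list(feature)
    for W, b in params:
        # One layer: weighted sum plus bias, followed by the activation.
        h = [activation(sum(w * v for w, v in zip(row, h)) + bias)
             for row, bias in zip(W, b)]
    return h


# Toy stand-in for the 2048 -> 512 -> 128 network: 3 -> 2 -> 1.
toy_params = [
    ([[0.5, 0.0, -0.5], [0.0, 1.0, 0.0]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]
toy_embedding = metric_forward(toy_params, [1.0, 2.0, 1.0])
```

Because tanh bounds every output to the interval (-1, 1), the embeddings of all images live in the same bounded feature space, which is where the Euclidean distances of step eight are taken.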
Further described in conjunction with the drawings.
FIG. 1 is a flowchart of an algorithm implementation of the present invention, and the specific embodiment is as follows:
1. Pre-training the ResNet-50 network, with training optimized using dropout or Batch Normalization, so that the ResNet-50 network acquires image feature extraction capability;
2. Trimming the ResNet-50 network by deleting the softmax layer and the last fully connected layer of the ResNet-50 network, so that the output of the trimmed network is a 2048-dimensional vector;
3. Forming a depth measurement network from a plurality of nonlinear fully connected layers, adding a Euclidean distance calculation unit after the output of the depth measurement network, and randomly initializing the network parameters according to the following formula:
[Formula image BDA0002126210050000081: random initialization of the layer weights]
where $1 \le m \le M$, $r^{(m)}$ is the depth of the m-th layer with $r^{(0)} = 2048$, $W^{(m)}$ is the weight of the m-th layer, and the bias $b^{(m)} \in \mathbb{R}^{r^{(m)}}$ of each layer is initialized to the zero vector;
4. Constructing the pedestrian re-identification network model by connecting the depth measurement network after the adjusted ResNet-50 network to form the final network model of the invention, as shown in FIG. 2;
5. Preprocessing the pedestrian re-identification training data: randomly cropping the images in the training data set to obtain a training data set of 224 × 224 images, randomly selecting P different pedestrians from it, and randomly selecting K images for each pedestrian to form a small training batch;
6. Training the network model: optimizing the network of 4 by minimizing the Hard Triplet Loss function on the training data obtained in 5, and repeating this step until the loss value converges;
7. Re-identifying pedestrians: inputting the images of the pedestrians to be identified and the images in the candidate library into the optimized model to obtain their feature vectors in the same feature space;
8. Calculating the Euclidean distance between the feature vector of the sample to be identified and the feature vectors of the pedestrian image library;
9. Sorting the images in the candidate library by distance from small to large, the rank-1 image being the image showing the same pedestrian as the image to be identified.
Tables 1-3 compare the performance of the algorithm of this embodiment with that of other algorithms.
Table 1. Performance comparison of the algorithm of the invention with other algorithms on the VIPeR pedestrian re-identification public data set
Method      rank-1 (%)   rank-10 (%)   rank-20 (%)
Ours        56.34        90.25         98.45
DDML        46.50        87.53         96.13
XQDA        40.50        80.42         91.03
KISSME      19.73        61.20         77.01
DML         29.73        71.20         86.01
Table 2. Performance comparison of the algorithm of the invention with other algorithms on the Market-1501 pedestrian re-identification public data set
Method      rank-1 (%)   mAP (%)
Ours        73.8         89.4
DDML        32.6         57.4
DML         29.4         53.7
Gated       39.6         65.9
Pose        56.0         79.3
Scalable    68.8         82.2
Table 3. Performance comparison of the algorithm of the invention with other algorithms on the CUHK03 pedestrian re-identification public data set
Method      rank-1 (%)   rank-5 (%)    rank-10 (%)
Ours        75.5         90.6          98.4
DDML        56.8         87.3          90.2
XQDA        46.3         78.9          88.6
KISSME      11.7         33.3          48.0
DML         35.7         60.9          73.4
Re-ranking  64.0         86.4          93.7
The results of experiments on three public pedestrian re-identification data sets show that the rank-1 value of the CMC curve and the mAP value of this embodiment are better than those of the other algorithms, which indicates that this embodiment achieves good pedestrian re-identification performance by constructing a network model based on depth measurement and adopting a triplet loss function with hard sample selection.
It is apparent that the above examples are given by way of illustration only and do not limit the embodiments. Other variations or modifications in different forms will be apparent to those of ordinary skill in the art on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications derived therefrom remain within the scope of protection of the invention.
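For reference, the rank-k values of a CMC curve and the average precision underlying mAP can be computed from ranked candidate lists roughly as follows. This is a generic sketch of the standard definitions, not the evaluation code behind the tables; both function names are hypothetical.

```python
def cmc_rank_k(ranked_ids_per_query, query_ids, k):
    """Fraction of queries whose true ID appears in the top-k candidates."""
    hits = sum(1 for ranked, qid in zip(ranked_ids_per_query, query_ids)
               if qid in ranked[:k])
    return hits / len(query_ids)


def average_precision(ranked_ids, query_id):
    """Mean of the precision values at each position holding a true match."""
    hits, precisions = 0, []
    for pos, gid in enumerate(ranked_ids, start=1):
        if gid == query_id:
            hits += 1
            precisions.append(hits / pos)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Three queries; each list holds candidate IDs sorted by increasing distance.
toy_ranked = [[1, 2, 3], [3, 2, 1], [2, 1, 3]]
toy_queries = [1, 1, 3]
rank1 = cmc_rank_k(toy_ranked, toy_queries, k=1)
rank3 = cmc_rank_k(toy_ranked, toy_queries, k=3)
```

mAP is then the mean of `average_precision` over all queries; unlike rank-1, it rewards retrieving every image of the queried pedestrian early in the list.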

Claims (7)

1. A pedestrian re-identification method based on depth measurement, characterized by comprising the following steps:
I. Constructing the network
1) Pre-training the ResNet-50 network:
training a ResNet-50 network with the ImageNet data set as the training data set, so that the ResNet-50 network has initial values;
2) Adjusting the ResNet-50 network of step 1) by removing the softmax layer and the last fully connected layer of the ResNet-50 network; the result is provided to step 4);
3) Forming a depth measurement network from a plurality of nonlinear fully connected layers, adding a Euclidean distance calculation unit after the output of the depth measurement network, and initializing the network parameters with a random initialization method; the result is provided to step 4);
4) Constructing the pedestrian re-identification network model, specifically:
connecting the ResNet-50 network adjusted in step 2) to the depth measurement network obtained in step 3), that is, feeding the output of the ResNet-50 network into the depth measurement network, to construct the pedestrian re-identification network model;
II. Training
5) Preprocessing the pedestrian re-identification training data set: randomly cropping the images in the training data set to obtain a training data set of 224 × 224 images, randomly selecting P different pedestrians from this training data set, and randomly selecting K images for each pedestrian to form a small training batch;
6) Training the network model:
optimizing the network model constructed in step 4) by minimizing the Hard Triplet Loss function, inputting the training data obtained in step 5) into the network model, and repeating this step until the loss value converges;
III. Identification
7) Re-identifying pedestrians: inputting the image of the pedestrian to be identified and the images in the candidate library, respectively, into the network model optimized in step 6) to obtain their feature vectors in the same feature space;
8) Calculating the similarity between the image to be identified and every image in the candidate library, that is, calculating the Euclidean distance between the feature vector of the image to be identified and that of each candidate library image, the feature vectors being those obtained in step 7); then sorting the images in the candidate library in ascending order of distance, so that the higher an image ranks, the more similar it is to the image to be identified; here, two images are similar when they show the same pedestrian.
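The batch construction of step 5) can be sketched as follows. This is only an illustration under stated assumptions: the function name `sample_pk_batch`, the dictionary layout `images_by_id`, and the choice to skip pedestrians with fewer than K images are all hypothetical and are not specified by the claim.

```python
import random


def sample_pk_batch(images_by_id, P, K, rng=random):
    """Pick P distinct pedestrians, then K distinct images for each of them.

    images_by_id maps a pedestrian identity to a list of its images.
    Returns a list of (identity, image) pairs of length P * K.
    """
    # Assumption: identities with fewer than K images are not eligible.
    eligible = [pid for pid, imgs in images_by_id.items() if len(imgs) >= K]
    if len(eligible) < P:
        raise ValueError("not enough identities with at least K images")
    batch = []
    for pid in rng.sample(eligible, P):
        for img in rng.sample(images_by_id[pid], K):
            batch.append((pid, img))
    return batch
```

Because every identity in the batch contributes K images, each anchor is guaranteed to have both positives (same pedestrian) and negatives (other pedestrians) inside the same batch, which is what a hard-sample triplet loss needs.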
2. The pedestrian re-identification method based on depth measurement according to claim 1, wherein the pre-training of the ResNet-50 network in step 1) uses dropout or Batch Normalization to optimize training, so that the ResNet-50 network acquires image feature extraction capability.
3. The pedestrian re-identification method based on depth measurement according to claim 1, wherein adjusting the ResNet-50 network in step 2) means removing the softmax layer and the last fully connected layer of the ResNet-50 network, so that its final output is a 2048-dimensional vector.
4. The pedestrian re-identification method based on depth measurement according to claim 3, wherein the depth measurement network in step 3) takes the 2048-dimensional feature vector as input and outputs a nonlinearly projected feature vector in Euclidean space; the structure of the depth measurement network is specifically:
a Euclidean distance calculation layer is added after a neural network formed by M nonlinear fully connected layers; the depth of the first fully connected layer is 2048, the parameters of each layer are initialized by a random initialization method, and the calculation formula is:
Figure FDA0004170181940000031
wherein $1 \le m \le M$, $r^{(m)}$ is the depth of the m-th layer, and $r^{(0)} = 2048$,
Figure FDA0004170181940000032
is the weight of the m-th layer, and the bias of each layer
Figure FDA0004170181940000033
is initialized to a zero vector; M is the total number of fully connected layers in the depth measurement network and is a hyperparameter.
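A minimal sketch of such a stack of M nonlinear fully connected layers is given below. Several details are assumptions, because the claim's initialization formula exists only as an image: the Glorot-style uniform bound, the tanh activation, and the names `init_metric_network` and `forward` are illustrative choices, not the patent's. Only the zero-initialized biases and the layer-by-layer structure come from the claim.

```python
import math
import random


def init_metric_network(layer_sizes, rng):
    """Create [(W, b), ...] for layer sizes [r0, r1, ..., rM].

    The weight matrix of layer m has r(m) rows and r(m-1) columns.
    ASSUMPTION: weights are drawn uniformly within a Glorot-style bound;
    the claim only says that a random initialization is used.
    Biases start as zero vectors, as the claim states.
    """
    layers = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        bound = math.sqrt(6.0 / (fan_in + fan_out))
        W = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
             for _ in range(fan_out)]
        b = [0.0] * fan_out
        layers.append((W, b))
    return layers


def forward(layers, x, activation=math.tanh):
    """Apply h(m) = activation(W(m) h(m-1) + b(m)) with h(0) = x.

    ASSUMPTION: tanh stands in for the unspecified nonlinear activation.
    """
    h = x
    for W, b in layers:
        h = [activation(sum(w * v for w, v in zip(row, h)) + bias)
             for row, bias in zip(W, b)]
    return h
```

In the setting of the claim, `layer_sizes` would start with 2048 (the ResNet-50 feature size) and contain M further entries, for example `[2048, 1024, 512]` for M = 2; those later sizes are hypothetical.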
5. The pedestrian re-identification method based on depth measurement according to claim 1, wherein training the network model in step 6) specifically means: from the new training set generated in step 5), randomly selecting P different pedestrians and randomly selecting K images for each pedestrian to form a small training batch, inputting the training batch into the network for training, and adopting Hard Triplet Loss as the loss function, calculated as follows:
61) Acquiring the features of each sample in the training batch extracted by the ResNet-50 network
Figure FDA0004170181940000034
wherein
Figure FDA0004170181940000035
represents the a-th image of the i-th pedestrian in the training batch, and r(·) represents the output of the ResNet-50 network;
62) Acquiring the output of each feature vector
Figure FDA0004170181940000036
through the depth measurement network, specifically calculated as follows:
Figure FDA0004170181940000037
Figure FDA0004170181940000038
Figure FDA0004170181940000039
wherein $1 \le m \le M$, and $h^{(m)}$ is the output of the m-th layer of the depth measurement network;
Figure FDA00041701819400000310
is a nonlinear activation function, and f(·) is the nonlinear mapping function parameterized by the depth measurement network;
Figure FDA00041701819400000311
represents the bias vector of the m-th layer of the depth measurement network;
Figure FDA00041701819400000312
is the weight of the m-th layer of the depth measurement network; $r^{(m)}$ is the depth of the m-th layer of the depth measurement network, and $r^{(0)} = 2048$;
Figure FDA00041701819400000313
denotes a vector of size $r^{(m)}$ in which every element is a real value; $\mathbb{R}$ is the set of real numbers;
63) Calculating the loss function value:
Figure FDA0004170181940000041
Figure FDA0004170181940000042
Figure FDA0004170181940000043
wherein
Figure FDA0004170181940000044
Figure FDA0004170181940000045
represents the a-th image of the i-th pedestrian in the training batch, r(·) represents the output of the ResNet-50 network, and P and K are the number of different pedestrians in the batch and the number of images of each pedestrian, respectively; X represents the input batch, σ is the threshold, θ denotes the parameters of the network,
Figure FDA0004170181940000046
is a nonlinear activation function, and f(·) is the nonlinear mapping function parameterized by the depth measurement network; $d_f(p_1, p_2)$ represents the depth measurement distance between $p_1$ and $p_2$, where $p_1$ and $p_2$ are vectors; $L_{BH}(\theta; X)$ is the network loss value for a single training batch;
the optimal solution of the loss function is then sought by stochastic gradient descent, thereby updating and optimizing the corresponding parameters.
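The loss formulas of this claim are only available as images, so the sketch below follows the widely used batch-hard form of the triplet loss as an assumption: for every anchor in the P × K batch, the farthest positive and the nearest negative are selected, and the hinge of their difference plus the threshold σ is summed. The names `batch_hard_triplet_loss` and `margin` are hypothetical, and whether the patent sums or averages over anchors is not confirmed by the text.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def batch_hard_triplet_loss(embeddings, labels, margin):
    """Sum over anchors of max(0, margin + hardest_pos - hardest_neg).

    embeddings: metric-network outputs f(r(x)) for one training batch.
    labels: the pedestrian identity of each embedding.
    margin: the threshold (sigma in the claim).
    """
    total = 0.0
    for a, anchor in enumerate(embeddings):
        # Distances to other images of the same pedestrian (positives).
        pos = [euclidean(anchor, e)
               for j, e in enumerate(embeddings)
               if labels[j] == labels[a] and j != a]
        # Distances to images of other pedestrians (negatives).
        neg = [euclidean(anchor, e)
               for j, e in enumerate(embeddings)
               if labels[j] != labels[a]]
        if not pos or not neg:
            continue  # anchor cannot form a triplet in this batch
        total += max(0.0, margin + max(pos) - min(neg))
    return total
```

A loss of zero means every anchor is already closer to all of its positives than to any negative by at least the margin; in training, the gradient of this value with respect to the network parameters would drive the stochastic gradient descent update.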
6. The pedestrian re-identification method based on depth measurement according to claim 1, wherein re-identifying pedestrians in step 7) specifically means inputting the image of the pedestrian to be identified and the images in the candidate library into the network to obtain the output $f(r(x))$ for each image x.
7. The pedestrian re-identification method based on depth measurement according to claim 1, wherein in step 8) the distance between the pedestrian image to be identified and a comparison image is:
$d_f(r(x), r(y)) = d(f(r(x)), f(r(y))) = \lVert f(r(x)) - f(r(y)) \rVert_2$
wherein x represents any image to be identified, and y represents any image in the candidate library; r(·) represents the output of the ResNet-50 network; f(·) is the nonlinear mapping function parameterized by the depth measurement network;
$d_f(p_1, p_2)$ represents the depth measurement distance between $p_1$ and $p_2$, where $p_1$ and $p_2$ are vectors; $r(x)$ and $r(y)$ are the feature vectors of the image to be identified and of the comparison image, respectively; $f(r(x))$ and $f(r(y))$ are the feature vectors of the image to be identified and of the comparison image in the same feature space, obtained through the nonlinear mapping of the depth measurement network; and $d_f(r(x), r(y))$ represents the depth measurement distance between the image x to be identified and the image y in the candidate library.
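The distance and the ranking of step 8) can be illustrated with a short sketch. It assumes the feature vectors f(r(x)) and f(r(y)) have already been computed by the network; the function name `rank_gallery` is hypothetical.

```python
import math


def rank_gallery(query_feature, gallery_features):
    """Rank candidate-library images by Euclidean distance to the query.

    Returns (order, distances): order lists gallery indices from most to
    least similar, and distances[g] is ||f(r(x)) - f(r(y_g))||_2.
    """
    distances = [
        math.sqrt(sum((q - g) ** 2 for q, g in zip(query_feature, feature)))
        for feature in gallery_features
    ]
    # Ascending distance: the smallest distance ranks first.
    order = sorted(range(len(distances)), key=lambda g: distances[g])
    return order, distances
```

For example, with a query feature `[0, 0]` and gallery features `[[3, 4], [1, 0], [0, 2]]`, the distances are 5, 1 and 2, so the second gallery image ranks first.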
CN201910626883.XA 2019-07-11 2019-07-11 Pedestrian re-identification method based on depth measurement Active CN110516533B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910626883.XA CN110516533B (en) 2019-07-11 2019-07-11 Pedestrian re-identification method based on depth measurement

Publications (2)

Publication Number Publication Date
CN110516533A CN110516533A (en) 2019-11-29
CN110516533B true CN110516533B (en) 2023-06-02

Family

ID=68622686

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910626883.XA Active CN110516533B (en) 2019-07-11 2019-07-11 Pedestrian re-identification method based on depth measurement

Country Status (1)

Country Link
CN (1) CN110516533B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111667050B (en) * 2020-04-21 2021-11-30 佳都科技集团股份有限公司 Metric learning method, device, equipment and storage medium
CN111786999B (en) * 2020-06-30 2023-03-24 中国电子科技集团公司电子科学研究院 Intrusion behavior detection method, device, equipment and storage medium
CN111814705B (en) * 2020-07-14 2022-08-02 广西师范大学 Pedestrian re-identification method based on batch blocking shielding network
CN112329833B (en) * 2020-10-28 2022-08-12 浙江大学 Image metric learning method based on spherical surface embedding
CN112686200A (en) * 2021-01-11 2021-04-20 中山大学 Pedestrian re-identification method and system based on multi-scheme parallel attention mechanism

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416295A (en) * 2018-03-08 2018-08-17 天津师范大学 A kind of recognition methods again of the pedestrian based on locally embedding depth characteristic
US10176405B1 (en) * 2018-06-18 2019-01-08 Inception Institute Of Artificial Intelligence Vehicle re-identification techniques using neural networks for image analysis, viewpoint-aware pattern recognition, and generation of multi- view vehicle representations
CN109670528A (en) * 2018-11-14 2019-04-23 中国矿业大学 The data extending method for blocking strategy at random based on paired samples towards pedestrian's weight identification mission

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10699151B2 (en) * 2016-06-03 2020-06-30 Miovision Technologies Incorporated System and method for performing saliency detection using deep active contours
CN108009528B (en) * 2017-12-26 2020-04-07 广州广电运通金融电子股份有限公司 Triple Loss-based face authentication method and device, computer equipment and storage medium
CN108171184B (en) * 2018-01-03 2020-04-10 南京理工大学 Method for re-identifying pedestrians based on Siamese network
US10685446B2 (en) * 2018-01-12 2020-06-16 Intel Corporation Method and system of recurrent semantic segmentation for image processing
CN108491884A (en) * 2018-03-27 2018-09-04 中山大学 Pedestrian based on lightweight network identifying system and implementation method again
CN108537181A (en) * 2018-04-13 2018-09-14 盐城师范学院 A kind of gait recognition method based on the study of big spacing depth measure
CN108960127B (en) * 2018-06-29 2021-11-05 厦门大学 Shielded pedestrian re-identification method based on adaptive depth measurement learning
CN108960141B (en) * 2018-07-04 2021-04-23 国家新闻出版广电总局广播科学研究院 Pedestrian re-identification method based on enhanced deep convolutional neural network
CN109190446A (en) * 2018-07-06 2019-01-11 西北工业大学 Pedestrian's recognition methods again based on triple focused lost function
CN109034035A (en) * 2018-07-18 2018-12-18 电子科技大学 Pedestrian's recognition methods again based on conspicuousness detection and Fusion Features
CN109446898B (en) * 2018-09-20 2021-10-15 暨南大学 Pedestrian re-identification method based on transfer learning and feature fusion
CN109711281B (en) * 2018-12-10 2023-05-02 复旦大学 Pedestrian re-recognition and feature recognition fusion method based on deep learning
CN109815908A (en) * 2019-01-25 2019-05-28 同济大学 It is a kind of based on the discrimination method again of the pedestrian that measures between deep learning and overlapping image block
CN109829414B (en) * 2019-01-25 2020-11-24 华南理工大学 Pedestrian re-identification method based on label uncertainty and human body component model
CN109993070B (en) * 2019-03-13 2021-06-08 华南理工大学 Pedestrian re-identification method based on global distance scale loss function

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
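Face recognition combining improved single-scale Retinex and LBP; Duan Hongyan; He Wensi; Li Shijie; Computer Engineering and Applications (No. 23); full text *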

Also Published As

Publication number Publication date
CN110516533A (en) 2019-11-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant