CN110163110B - Pedestrian re-recognition method based on transfer learning and depth feature fusion - Google Patents

Pedestrian re-recognition method based on transfer learning and depth feature fusion

Info

Publication number
CN110163110B
Authority
CN
China
Prior art keywords: pedestrian, training, network model, global, convolutional neural
Prior art date
Legal status: Active
Application number: CN201910329733.2A
Other languages: Chinese (zh)
Other versions: CN110163110A (en)
Inventors: 丁剑飞, 王进, 阚丹会, 闫盈盈, 曹扬
Current Assignee: CETC Big Data Research Institute Co Ltd
Original Assignee: CETC Big Data Research Institute Co Ltd
Application filed by CETC Big Data Research Institute Co Ltd
Priority to CN201910329733.2A
Publication of CN110163110A
Application granted
Publication of CN110163110B
Status: Active

Classifications

    • G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/2163: Partitioning the feature space
    • G06F18/253: Fusion techniques of extracted features
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G06N3/045: Combinations of networks
    • G06N3/08: Neural network learning methods
    • G06V40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • Y02T10/40: Engine management systems

Abstract

The invention provides a pedestrian re-identification method based on transfer learning and depth feature fusion, comprising the following steps: pre-training, human posture correction and segmentation, feature vector extraction, depth feature fusion, model training, model testing, and recognition. The method extracts global and local pedestrian features with a deep convolutional neural network and deeply fuses the two kinds of features to obtain the final pedestrian feature representation. During training of the deep convolutional neural network, transfer learning is adopted to obtain a more effective pedestrian re-identification network model, so that the features extracted by this model are more discriminative, which improves the accuracy of pedestrian re-identification.

Description

Pedestrian re-recognition method based on transfer learning and depth feature fusion
Technical Field
The invention relates to a pedestrian re-identification method based on transfer learning and depth feature fusion, and belongs to the technical field of deep learning and transfer learning.
Background
Pedestrian re-identification mainly addresses the task of matching pedestrians across a multi-camera network with non-overlapping fields of view, that is, finding a target pedestrian captured by cameras at different positions and at different times.
With the development of artificial intelligence technology, pedestrian re-identification for application scenarios such as public security and image retrieval has attracted wide research attention. However, compared with traditional biometric technologies such as face recognition and gesture recognition, pedestrian re-identification suffers from low recognition accuracy because surveillance video environments are complex and uncontrollable: image resolution is low, and viewpoint, posture, and lighting vary while occlusions occur. Pedestrian re-identification therefore faces great challenges in practical application scenarios.
To improve the accuracy of pedestrian re-identification and enhance system robustness, many researchers have proposed different methods over years of study. The computer vision company Face++ has made great progress in the field of pedestrian re-identification: its paper AlignedReID proposes a new method built on dynamic alignment and mutual learning followed by re-ranking, and finds experimentally that, at test time, extracting only global pedestrian features achieves almost the same recognition accuracy as fusing global and local pedestrian features. Yi et al. propose a deep metric learning method based on a Siamese convolutional neural network, with good results. Liu et al. propose a deep nonlinear metric learning method based on neighborhood component analysis and a deep belief network, in which neighborhood component analysis maximizes, through a data transformation, the number of correctly identifiable samples of each class in the training data, and the deep belief network learns a nonlinear feature transformation to extend that data transformation. However, most existing pedestrian re-identification methods extract global feature vectors from whole-body pedestrian images during training; some extract local features but do not deeply fuse them with global features to obtain a discriminative image representation. Moreover, simply fine-tuning a pre-trained model on a pedestrian database ignores the difference in data distribution between the source-domain and target-domain datasets, so the transfer effect of the network is not ideal.
Disclosure of Invention
In order to solve the technical problems that existing methods cannot deeply fuse the global and local features of pedestrians and do not fully consider the difference in data distribution during network fine-tuning, the invention provides a pedestrian re-identification method based on transfer learning and depth feature fusion.
The invention is realized by the following technical scheme.
The invention provides a pedestrian re-identification method based on transfer learning and depth feature fusion, which comprises the following steps:
(1) Pre-training: pre-train the ImageNet-based pre-trained model on pedestrian re-identification data to obtain a pedestrian re-identification pre-trained network model;
(2) Human posture correction and segmentation: select hard-to-distinguish sample pairs from a pedestrian dataset, input them into a human skeleton key-point detection network to detect fourteen key points, correct the human posture and segment local pedestrian ROIs, and obtain data-enhanced hard sample pairs together with corrected global and local images;
(3) Feature vectors: input the corrected global and local images and the data-enhanced hard sample pairs into the pedestrian re-identification pre-trained network model to obtain local and global pedestrian feature vectors;
(4) Depth feature fusion: deeply fuse the local and global pedestrian feature vectors to obtain the final pedestrian feature vector;
(5) Model training: fine-tune the pedestrian re-identification pre-trained network model by transfer learning with the final pedestrian feature vector from step (4), and add an adaptive layer to it to obtain the pedestrian re-identification network model;
(6) Model testing: input query and target pedestrian images and extract two discriminative global pedestrian feature vectors with the pedestrian re-identification network model;
(7) Recognition result: based on the global pedestrian feature vectors from step (6), compute the similarity between the query pedestrian and each image in the target pedestrian dataset; the pedestrian with the highest similarity is considered the same pedestrian.
In the training stage, the input to the pedestrian re-identification network model adopts triplet pedestrian images.
The step (1) is divided into the following steps:
(1.1) obtaining a deep convolutional network model pre-trained on the ImageNet dataset, and training it on pedestrian re-identification data;
(1.2) when pre-training the deep convolutional neural network model on the pedestrian re-identification data, fine-tuning it using only the sample annotation information.
The step (1.2) is divided into the following steps:
(1.2.1) removing the top fully connected layer from the ResNet50 network model pre-trained on the ImageNet dataset, and adding two fully connected layers and one softmax layer after the max-pooling layer;
(1.2.2) fine-tuning the constructed deep convolutional neural network using the label information of the annotated pedestrian images, with the first three layers of the network fixed during fine-tuning;
(1.2.3) obtaining the prediction probability of the global pedestrian image from the deep convolutional neural network;
(1.2.4) defining the loss function of the deep convolutional neural network from the prediction probability.
The corrected global and local images obtained in step (2) are the hard positive-and-negative-sample triplet pedestrian images, the posture-corrected global pedestrian image, and the local ROI images.
The step (2) is divided into the following steps:
(2.1) randomly selecting pedestrians of P IDs for each training batch and K different images per pedestrian, so that each batch contains P × K pedestrian images;
(2.2) taking each image in the training batch as an anchor sample H_n and selecting the hardest positive sample H_p and the hardest negative sample H_q to form a triplet with H_n, where the hard sample pair is chosen such that the positive-pair distance d(H_n, H_p) is maximal and the negative-pair distance d(H_n, H_q) is minimal;
(2.3) inputting the hard positive-and-negative-sample triplet pedestrian images into the human skeleton key-point detection network, detecting fourteen human skeleton key points covering the head, four limbs, upper body, and lower body, and correcting the human posture using the fourteen key points as coordinates;
(2.4) dividing the global pedestrian image into three local pedestrian ROI images (head, upper body, and lower body) according to the fourteen human skeleton key points, obtaining a corrected global pedestrian image and three local pedestrian images.
In step (2.2), using the deep convolutional neural network model pre-trained in step (1.1), the lowest-scoring pedestrian image among images with the same pedestrian ID as the anchor sample H_n is selected to form the hard positive pair, and the highest-scoring pedestrian image among images with different pedestrian IDs is selected to form the hard negative pair.
The step (3) is divided into the following steps:
(3.1) taking the pre-trained deep convolutional neural network model from step (1.1) and the corrected global and local images from step (2), and removing the top softmax layer and fully connected layer of the pre-trained model;
(3.2) inputting the data-enhanced hard sample pairs and the corrected global and local images into the deep convolutional neural network model constructed in step (3.1) to obtain the global and local pedestrian feature vectors.
The step (4) is divided into the following steps:
(4.1) inputting the local and global pedestrian feature vectors from step (3) into a fully connected layer for depth feature fusion, and outputting the fused pedestrian feature vector;
(4.2) inputting the fused pedestrian feature vector and the local pedestrian feature vectors from step (3.2) into a square layer, which measures the similarity between hard sample pairs by the squared Euclidean distance.
The step (6) is divided into the following steps:
(6.1) inputting the query and target pedestrian images into the human key-point and posture-correction network to correct the human posture;
and (6.2) inputting the pedestrian image with the corrected human body posture into a pedestrian re-recognition network model to obtain a pedestrian global feature vector.
The invention has the following beneficial effects: global and local pedestrian features are extracted by a deep convolutional neural network and deeply fused to obtain the final pedestrian feature representation; during training of the deep convolutional neural network, transfer learning is adopted to obtain a more effective pedestrian re-identification network model; the features extracted by this model are therefore more discriminative, which improves the accuracy of pedestrian re-identification.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a network block diagram of a global feature and local feature depth fusion in accordance with an embodiment of the present invention;
fig. 3 is a network structure diagram of a deep feature fusion and local feature learning model based on a deep convolutional neural network according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention is further described below, but the scope of the claimed invention is not limited to this description.
As shown in fig. 1, a pedestrian re-recognition method based on transfer learning and depth feature fusion comprises the following steps:
(1) Pre-training: pre-train the ImageNet-based pre-trained model on pedestrian re-identification data to obtain a pedestrian re-identification pre-trained network model;
(2) Human posture correction and segmentation: select hard-to-distinguish sample pairs from a pedestrian dataset, input them into a human skeleton key-point detection network to detect fourteen key points, correct the human posture and segment local pedestrian ROIs, and obtain data-enhanced hard sample pairs together with corrected global and local images;
(3) Feature vectors: input the corrected global and local images and the data-enhanced hard sample pairs into the pedestrian re-identification pre-trained network model to obtain local and global pedestrian feature vectors;
(4) Depth feature fusion: deeply fuse the local and global pedestrian feature vectors to obtain the final pedestrian feature vector;
(5) Model training: fine-tune the pedestrian re-identification pre-trained network model by transfer learning with the final pedestrian feature vector from step (4), and add an adaptive layer to it to obtain the pedestrian re-identification network model;
(6) Model testing: input query and target pedestrian images and extract two discriminative global pedestrian feature vectors with the pedestrian re-identification network model;
(7) Recognition result: based on the global pedestrian feature vectors from step (6), compute the similarity between the query pedestrian and each image in the target pedestrian dataset; the pedestrian with the highest similarity is considered the same pedestrian.
In the training stage, the input to the pedestrian re-identification network model adopts triplet pedestrian images.
The step (1) is divided into the following steps:
(1.1) obtaining a deep convolutional network model pre-trained on the ImageNet dataset, and training it on pedestrian re-identification data;
(1.2) when pre-training the deep convolutional neural network model on the pedestrian re-identification data, fine-tuning it using only the sample annotation information.
The step (1.2) is divided into the following steps:
(1.2.1) removing the top fully connected layer from the ResNet50 network model pre-trained on the ImageNet dataset, and adding two fully connected layers and one softmax layer after the max-pooling layer;
(1.2.2) fine-tuning the constructed deep convolutional neural network using the label information of the annotated pedestrian images, with the first three layers of the network fixed during fine-tuning;
(1.2.3) obtaining the prediction probability of the global pedestrian image from the deep convolutional neural network;
(1.2.4) defining the loss function of the deep convolutional neural network from the prediction probability.
The corrected global and local images obtained in step (2) are the hard positive-and-negative-sample triplet pedestrian images, the posture-corrected global pedestrian image, and the local ROI images.
The step (2) is divided into the following steps:
(2.1) randomly selecting pedestrians of P IDs for each training batch and K different images per pedestrian, so that each batch contains P × K pedestrian images;
(2.2) taking each image in the training batch as an anchor sample H_n and selecting the hardest positive sample H_p and the hardest negative sample H_q to form a triplet with H_n, where the hard sample pair is chosen such that the positive-pair distance d(H_n, H_p) is maximal and the negative-pair distance d(H_n, H_q) is minimal;
(2.3) inputting the hard positive-and-negative-sample triplet pedestrian images into the human skeleton key-point detection network, detecting fourteen human skeleton key points covering the head, four limbs, upper body, and lower body, and correcting the human posture using the fourteen key points as coordinates;
(2.4) dividing the global pedestrian image into three local pedestrian ROI images (head, upper body, and lower body) according to the fourteen human skeleton key points, obtaining a corrected global pedestrian image and three local pedestrian images.
In step (2.2), using the deep convolutional neural network model pre-trained in step (1.1), the lowest-scoring pedestrian image among images with the same pedestrian ID as the anchor sample H_n is selected to form the hard positive pair, and the highest-scoring pedestrian image among images with different pedestrian IDs is selected to form the hard negative pair.
The step (3) is divided into the following steps:
(3.1) taking the pre-trained deep convolutional neural network model from step (1.1) and the corrected global and local images from step (2), and removing the top softmax layer and fully connected layer of the pre-trained model;
(3.2) inputting the data-enhanced hard sample pairs and the corrected global and local images into the deep convolutional neural network model constructed in step (3.1) to obtain the global and local pedestrian feature vectors.
The step (4) is divided into the following steps:
(4.1) inputting the local and global pedestrian feature vectors from step (3) into a fully connected layer for depth feature fusion, and outputting the fused pedestrian feature vector;
(4.2) inputting the fused pedestrian feature vector and the local pedestrian feature vectors from step (3.2) into a square layer, which measures the similarity between hard sample pairs by the squared Euclidean distance.
The step (6) is divided into the following steps:
(6.1) inputting the query and target pedestrian images into the human key-point and posture-correction network to correct the human posture;
and (6.2) inputting the pedestrian image with the corrected human body posture into a pedestrian re-recognition network model to obtain a pedestrian global feature vector.
In summary, the invention exploits transfer learning and the adaptive-learning strengths of deep learning, and fuses the local and whole-body features of pedestrian images to obtain a network model that attends to local pedestrian features, thereby improving the accuracy of pedestrian re-identification.
Example 1
As described above, the pedestrian re-recognition method based on transfer learning and depth feature fusion comprises the following steps:
(1) pre-training: pre-training the pre-training model based on the ImageNet on the pedestrian re-recognition data to obtain a pedestrian re-recognition pre-training network model; the method comprises the following steps:
(1.1) acquiring a depth convolution network model trained in advance on an ImageNet data set, and training the depth convolution network model on pedestrian re-identification data;
(1.2) when the deep convolutional neural network model is pre-trained on the pedestrian re-recognition data, only sample marking information is utilized to finely tune the deep convolutional neural network model;
(1.2.1) removing the top full connection layer from the pre-trained ResNet50 network model on the ImageNet dataset, and adding two full connection layers and one softmax layer after the maximum pooling layer;
(1.2.2) fine-tuning the constructed deep convolutional neural network using the label information of the annotated pedestrian images, with the first three layers fixed during fine-tuning, since the features extracted by the first three layers of a deep convolutional neural network are usually textures, edges, and the like, and therefore have a degree of generality;
(1.2.3) obtaining the prediction probability y_i of the global pedestrian image from the deep convolutional neural network, expressed as:

$$y_i = \frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}}$$

where y_i is the probability that sample x belongs to the i-th class, z_i is the network output for class i, the denominator $\sum_{j=1}^{C} e^{z_j}$ is the normalization term, and C is the total number of classes;
(1.2.4) defining the loss function L_I of the deep convolutional neural network from the prediction probability, expressed as:

$$L_I = -\sum_{j=1}^{C} q_j \log y_j$$

where q_j is the label probability and C is the total number of classes.
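For illustration only (not part of the original disclosure), steps (1.2.1)-(1.2.4) can be sketched roughly as follows in PyTorch; the layer sizes, the 751-class setting, and all names are assumptions for a Market-1501-style configuration:

```python
import torch
import torch.nn as nn
from torchvision import models

class ReIDPretrainNet(nn.Module):
    """ResNet50 with the top FC layer removed and two FC layers added (step 1.2.1);
    the softmax is folded into the cross-entropy loss below."""
    def __init__(self, num_classes=751):  # assumed: Market-1501 training identities
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop original FC
        self.fc1 = nn.Linear(2048, 2048)
        self.fc2 = nn.Linear(2048, num_classes)

    def forward(self, x):
        f = self.features(x).flatten(1)           # pooled 2048-d global feature
        return self.fc2(torch.relu(self.fc1(f)))  # class logits

net = ReIDPretrainNet()

# Step 1.2.2: freeze the first three children (their features, textures and edges,
# are generic) and fine-tune the rest on the labelled pedestrian images.
for child in list(net.features.children())[:3]:
    for p in child.parameters():
        p.requires_grad = False

# Step 1.2.4: L_I = -sum_j q_j log y_j, i.e. cross-entropy over the softmax output.
criterion = nn.CrossEntropyLoss()
```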
(2) Human posture correction and segmentation: select hard-to-distinguish sample pairs from the pedestrian dataset, input them into the human skeleton key-point detection network to detect fourteen key points, then correct the human posture and segment local pedestrian ROIs, obtaining data-enhanced hard sample pairs and corrected global and local images; the corrected global and local images are the hard positive-and-negative-sample triplet pedestrian images, the posture-corrected global pedestrian image, and the local ROI images; the method specifically comprises the following steps:
(2.1) randomly selecting pedestrians of P IDs for each training batch and K different images per pedestrian, so that each batch contains P × K pedestrian images;
(2.2) taking each image in the training batch as an anchor sample H_n and selecting the hardest positive sample H_p and the hardest negative sample H_q to form a triplet with H_n, where the hard sample pair is chosen such that the positive-pair distance d(H_n, H_p) is maximal and the negative-pair distance d(H_n, H_q) is minimal;
Specifically, using the deep convolutional neural network model pre-trained in step (1.1), the lowest-scoring pedestrian image among images with the same pedestrian ID as the anchor sample H_n is selected to form the hard positive pair, and the highest-scoring pedestrian image among images with different pedestrian IDs is selected to form the hard negative pair (illustrated in the sketch following step (2.4));
(2.3) inputting the hard positive-and-negative-sample triplet pedestrian images into the human skeleton key-point detection network, detecting fourteen human skeleton key points covering the head, four limbs, upper body, and lower body, and correcting the human posture using the fourteen key points as coordinates;
(2.4) dividing the global pedestrian image into three local pedestrian ROI images (head, upper body, and lower body) according to the fourteen human skeleton key points, obtaining a corrected global pedestrian image and three local pedestrian images.
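For illustration only, the P × K sampling and hard-pair selection of steps (2.1)-(2.2) can be sketched as follows; the distance measure and all names are assumptions:

```python
import torch

def batch_hard_triplets(feats: torch.Tensor, pids: torch.Tensor):
    """feats: (P*K, d) feature vectors; pids: (P*K,) identity labels.
    Per anchor, returns the hardest positive index (maximal same-ID distance)
    and the hardest negative index (minimal different-ID distance)."""
    dist = torch.cdist(feats, feats).pow(2)        # squared Euclidean distances
    same = pids.unsqueeze(0) == pids.unsqueeze(1)  # same-ID mask
    pos = dist.masked_fill(~same, float('-inf'))   # keep only same-ID candidates
    pos.fill_diagonal_(float('-inf'))              # exclude the anchor itself
    neg = dist.masked_fill(same, float('inf'))     # keep only different-ID candidates
    return pos.argmax(dim=1), neg.argmin(dim=1)

# Example with an assumed P=4, K=16 batch of 128-d features:
feats = torch.randn(64, 128)
pids = torch.arange(4).repeat_interleave(16)
hard_pos, hard_neg = batch_hard_triplets(feats, pids)
```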
(3) Feature vector: inputting corrected global and local images and difficult-to-separate sample pairs with enhanced data into a pedestrian re-identification pre-training network model to obtain pedestrian local and global feature vectors; the method comprises the following steps:
(3.1) obtaining the pre-training depth convolutional neural network model in the step (1.1) and the global and local images corrected in the step (2), and removing a softmax layer and a full connection layer on the top layer of the pre-training depth convolutional neural network model;
(3.2) respectively inputting the data-enhanced hard sample pairs and the corrected global and local images into the deep convolutional neural network model constructed in step (3.1) to obtain the global pedestrian feature vector A and the local pedestrian feature vectors B_1, B_2, B_3, where B_1 is the head-region feature vector, B_2 is the upper-body-region feature vector, and B_3 is the lower-body feature vector.
Further, when the data-enhanced hard sample pairs are input into the deep convolutional neural networks in parallel, all the network branches propagate simultaneously and share weights.
(4) Depth feature fusion: depth feature fusion is carried out on the local feature vector and the global feature vector of the pedestrian, and a final pedestrian feature vector is obtained; the method comprises the following steps:
(4.1) inputting the pedestrian local and global feature vectors in the step (3) into a full-connection layer, and carrying out depth feature fusion to obtain an output fused pedestrian feature vector C;
(4.2) inputting the fused pedestrian feature vector C and the local pedestrian feature vectors B_1, B_2, B_3 from step (3.2) into a square layer, which measures the similarity between hard sample pairs by the squared Euclidean distance, expressed as:

$$d_{a,p} = \lVert f(a) - f(p) \rVert_2^2$$
$$d_{a,n} = \lVert f(a) - f(n) \rVert_2^2$$

where a is the anchor sample, p is the hardest positive sample, n is the hardest negative sample, f(·) denotes the extracted feature vector, d_{a,p} is the distance of the hard positive pair, and d_{a,n} is the distance of the hard negative pair.
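As an illustrative sketch of steps (4.1)-(4.2) (the dimensions and the use of concatenation before the fully connected layer are assumptions not stated in the text):

```python
import torch
import torch.nn as nn

class DepthFeatureFusion(nn.Module):
    """One FC layer that fuses the global vector A with local vectors B1-B3 (step 4.1)."""
    def __init__(self, global_dim=2048, local_dim=2048, out_dim=2048):
        super().__init__()
        self.fuse = nn.Linear(global_dim + 3 * local_dim, out_dim)

    def forward(self, A, B1, B2, B3):
        return self.fuse(torch.cat([A, B1, B2, B3], dim=1))  # fused vector C

def square_layer(x, y):
    """Squared Euclidean distance between paired feature vectors (step 4.2)."""
    return (x - y).pow(2).sum(dim=1)

fusion = DepthFeatureFusion()
A, B1, B2, B3 = (torch.randn(8, 2048) for _ in range(4))
C = fusion(A, B1, B2, B3)
d_ap = square_layer(C[:4], C[4:])  # e.g. four anchor-positive pair distances
```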
Preferably, so that the deep convolutional neural network extracts more discriminative pedestrian features and fully exploits the annotation information of the pedestrian samples, cross-entropy loss and triplet loss are both used during training: the deep convolutional neural network that fuses global and local features uses both loss functions, while the deep convolutional neural networks that extract the head, upper body, and lower body use only the triplet loss function;
Further, the deep convolutional neural network that fuses global and local features uses the cross-entropy loss and the TriHard loss, the latter expressed as:

$$L_{th} = \frac{1}{P \times K} \sum_{a} \left[ \max_{p \in A} d_{a,p} - \min_{n \in B} d_{a,n} + \alpha \right]_+$$

where A is the set of samples with the same ID as the anchor sample a, B is the set of remaining samples with different IDs, L_th is the TriHard loss, L_I is the cross-entropy loss, α in L_th is a manually set margin parameter, q_j is the label probability, and C is the total number of classes;
Further, the deep convolutional neural networks that extract the head, upper body, and lower body of the pedestrian use the TriHard loss function and, by sharing weight parameters, make the deep convolutional neural network that extracts global pedestrian features pay more attention to discriminative local features, where the loss function L_th is:

$$L_{th} = \frac{1}{P \times K} \sum_{a} \left[ \max_{p \in A} d_{a,p} - \min_{n \in B} d_{a,n} + \alpha \right]_+$$

where A is the set of samples with the same ID as the anchor sample a, B is the set of remaining samples with different IDs, and α is a manually set margin parameter;
Finally, the losses of the extracted depth-fused features and of the local features are weighted by their corresponding weights to form the total loss, and back-propagation over the whole network updates the network parameters;
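For illustration, the TriHard loss above can be written compactly as follows (the margin value is an assumption):

```python
import torch

def trihard_loss(feats, pids, margin=0.3):
    """Batch-hard triplet (TriHard) loss: for each anchor a, the maximal
    same-ID distance minus the minimal different-ID distance, plus the
    margin alpha, clamped at zero and averaged over the batch."""
    dist = torch.cdist(feats, feats).pow(2)
    same = pids.unsqueeze(0) == pids.unsqueeze(1)
    d_ap = dist.masked_fill(~same, float('-inf')).amax(dim=1)  # hardest positive
    d_an = dist.masked_fill(same, float('inf')).amin(dim=1)    # hardest negative
    return torch.clamp(d_ap - d_an + margin, min=0).mean()
```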
(5) Model training: fine-tune the pedestrian re-identification pre-trained network model by transfer learning with the final pedestrian feature vector from step (4), and add an adaptive layer to it to obtain the pedestrian re-identification network model; the adaptive layer is added to obtain a better transfer-learning effect by bringing the data distributions of the source and target domains closer, so that the pedestrian re-identification network model performs better;
Specifically, parameter learning of the multi-kernel MMD (MK-MMD) metric is added to the training of the deep convolutional neural network to measure the discrepancy between the source domain and the target domain, where the multi-kernel of the MK-MMD metric is expressed as:

$$K \triangleq \Big\{ k = \sum_{u=1}^{m} \beta_u k_u : \beta_u \ge 0, \ \forall u \Big\}$$

The distribution distance between the source domain and the target domain is expressed as:

$$d_k^2(p, q) = \big\lVert \mathbf{E}_p[\phi(x^s)] - \mathbf{E}_q[\phi(x^t)] \big\rVert_H^2$$

where φ(·) is a mapping of the original variables into a reproducing kernel Hilbert space, and the subscript H indicates that the distance is measured in the reproducing kernel Hilbert space (RKHS) into which φ(·) maps the data;
The optimization objective of the adaptive layer consists of a loss term and an adaptation term, expressed as:

$$\min_{\Theta} \ \frac{1}{n_a} \sum_{i=1}^{n_a} J\big(\theta(x_i^a), y_i^a\big) + \lambda \sum_{l=l_1}^{l_2} d_k^2\big(D_s^l, D_t^l\big)$$

where Θ denotes all weight and bias parameters of the network, which are the target parameters to be learned; l_1 to l_2 are the first and last layers of network adaptation (layers before l_1 are not adapted); n_a denotes the amount of all annotated data in the source and target domains; λ is a trade-off weight; and J(·) is the loss function;
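A minimal sketch of the MK-MMD term follows (here with fixed equal kernel weights β_u and assumed Gaussian bandwidths; the method additionally learns these parameters):

```python
import torch

def mk_mmd(source, target, sigmas=(1.0, 2.0, 4.0, 8.0)):
    """Multi-kernel MMD^2 estimate between source- and target-domain batches,
    using a sum of Gaussian kernels k_u with equal weights beta_u."""
    n_s = source.size(0)
    x = torch.cat([source, target], dim=0)
    d2 = torch.cdist(x, x).pow(2)
    k = sum(torch.exp(-d2 / (2 * s ** 2)) for s in sigmas) / len(sigmas)
    k_ss = k[:n_s, :n_s].mean()    # source-source kernel mean
    k_tt = k[n_s:, n_s:].mean()    # target-target kernel mean
    k_st = k[:n_s, n_s:].mean()    # cross-domain kernel mean
    return k_ss + k_tt - 2 * k_st  # estimate of ||E_p[phi(x^s)] - E_q[phi(x^t)]||_H^2

# Added to the task loss of the adapted layers with a trade-off weight lambda:
# total = task_loss + lam * mk_mmd(source_feats, target_feats)
```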
Specifically, after removing the top softmax layer from the obtained pre-trained deep convolutional neural network, a pedestrian image is selected as input, a trained classifier is used to compute scores for the several topmost convolutional layers of the network, the layers before the highest-scoring layer are fixed, and the highest-scoring layer and the layers after it are fine-tuned;
(6) Model testing: using only the pedestrian re-identification network model, input the query and target pedestrian images to obtain two discriminative pedestrian feature vectors, from which the global pedestrian feature vectors are extracted; the method specifically comprises the following steps:
(6.1) inputting the query and target pedestrian images into the human key-point and posture-correction network to correct the human posture;
and (6.2) inputting the pedestrian image with the corrected human body posture into a pedestrian re-recognition network model to obtain a pedestrian global feature vector.
(7) Recognition result: and (3) calculating the similarity between the query pedestrian and any image in the target pedestrian data set based on the pedestrian global feature vector in the step (6), wherein the pedestrian with the highest similarity is considered to be the same pedestrian.
Further, the input of the pedestrian re-recognition network model in the training stage adopts a triplet (triplet) pedestrian image.
Example 2
As described above, the pedestrian re-recognition method based on transfer learning and depth feature fusion comprises the following steps:
step S1, pre-training a pre-training model based on ImageNet on pedestrian re-recognition data to obtain a pedestrian re-recognition pre-training network model;
step S11, a depth convolution network model trained in advance on an ImageNet data set is obtained, and training is carried out on pedestrian re-identification data;
step S12, when the deep convolutional neural network model is pre-trained on the pedestrian re-identification data, only sample labeling information is used for fine tuning the network model;
step S121, removing the full connection layer of the top layer from a pre-trained ResNet50 network model on an ImageNet data set, and adding two full connection layers and one softmax layer after the maximum pooling layer;
Further, the two added fully connected layers have parameters 1×1×2048 and 1×1×751 respectively, and the input images are 224×224; when pre-training ResNet50, gradient descent is used for iterative optimization, with the number of iterations set to 75, the learning rate initialized to 0.1, the weight-decay value set to 0.001, and 64 pedestrian samples input per batch;
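These hyperparameters correspond to a plain gradient-descent loop, sketched below for illustration (the optimizer variant and the random stand-in data are assumptions):

```python
import torch
from torch import nn, optim
from torchvision import models

model = models.resnet50(num_classes=751)  # stand-in for the modified network of step S121
optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=0.001)
criterion = nn.CrossEntropyLoss()

# One illustrative batch of 64 pedestrian samples at 224x224 (random stand-in data).
images = torch.randn(64, 3, 224, 224)
labels = torch.randint(0, 751, (64,))

for it in range(75):  # number of iterations set to 75
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```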
and step S122, performing fine adjustment on the constructed deep convolutional neural network by using label information marked by the pedestrian image, and fixing the first three layers of the network in the fine adjustment process. Because the characteristics extracted from the first three layers of the convolutional neural network are usually textures, edges and the like, the characteristics have certain universality;
step S123, obtaining the prediction probability of the pedestrian global image according to the convolutional neural network, wherein the prediction probability is expressed as follows:
Figure GDA0004088048170000151
wherein y is i Representing the probability that sample x belongs to the i-th class,
Figure GDA0004088048170000152
is a normalization term, C is the total number of categories;
Preferably, C = 751 when training and testing on the Market-1501 database.
Step S124, setting the loss function of the convolutional neural network to L_I according to the prediction probability, expressed as:

$$L_I = -\sum_{j=1}^{C} q_j \log y_j$$

where q_j is the label probability and C is 751;
Step S2, in the training stage, triplet pedestrian images are adopted as the input of the network model. First, hard-to-distinguish sample pairs are selected from the pedestrian dataset and input into the human skeleton key-point detection network to detect fourteen key points; the human posture is then corrected and local pedestrian ROIs are segmented, yielding corrected global and local images;
Step S21, randomly select pedestrians of P IDs for each training batch and K different images per pedestrian, so that each batch contains P × K pedestrian images;
Specifically, in this embodiment 6 pedestrian IDs are selected, 16 different images are randomly chosen for each ID, and each batch contains 64 pedestrian images;
Step S22, take each image in the training batch as an anchor sample H_n and select the hardest positive sample H_p and the hardest negative sample H_q to form a triplet with H_n, where the hard sample pair is chosen such that the positive-pair distance d(H_n, H_p) is maximal and the negative-pair distance d(H_n, H_q) is minimal;
Further, using the convolutional neural network model pre-trained in step S1, the lowest-scoring pedestrian image among images with the same pedestrian ID as the anchor sample H_n is selected to form the hard positive pair, and the highest-scoring pedestrian image among images with different pedestrian IDs is selected to form the hard negative pair;
Step S23, inputting the triplet pedestrian images into the human skeleton key-point detection network, detecting fourteen human skeleton key points covering the head, four limbs, upper body, and lower body, and correcting the human posture using the fourteen key points as coordinates;
step S24, dividing the pedestrian global image into three pedestrian local ROI images of the head, the upper body and the lower body according to fourteen human skeleton key points, and further obtaining a corrected pedestrian global image and three pedestrian local images;
Step S3, inputting the hard sample pairs, after human posture correction and data enhancement, into the pre-trained network to obtain the local and global pedestrian feature vectors;
step S31, obtaining a deep convolutional neural network model pre-trained in step S1, obtaining a pedestrian global image and a pedestrian local image based on a refractory sample pair in step S2, and removing a softmax layer and a full-connection layer on the top layer of the pre-trained deep convolutional neural network;
In this embodiment, the added fully connected layer has parameters 1×1×751 and the input images are 224×224; gradient descent is used for iterative optimization when pre-training ResNet50, with the number of iterations set to 60, the learning rate initialized to 0.01 for the first 20 iterations and 0.001 for the remaining 40, the weight-decay value set to 0.0001, and 64 pedestrian samples input per batch;
Step S32, input the obtained hard sample pairs, each comprising a global pedestrian image and local pedestrian images, into the deep convolutional neural network constructed in step S31 to obtain the global pedestrian feature vector A and the local pedestrian feature vectors B_1, B_2, B_3, where B_1 is the head-region feature vector, B_2 is the upper-body-region feature vector, and B_3 is the lower-body feature vector;
Step S33, when the hard sample pairs are input into the deep convolutional neural networks in parallel, all the network branches propagate simultaneously and share weights;
Step S4, deeply fuse the obtained local and global pedestrian feature vectors to obtain the final pedestrian feature vector;
Step S41, obtain the global pedestrian feature vector A and the local pedestrian feature vectors B_1, B_2, B_3 through step S3, then input them into one fully connected layer for depth feature fusion and output the fused pedestrian feature vector C, as shown in FIG. 2;
Step S42, input the fused pedestrian feature vector C and the local feature vectors B_1, B_2, B_3 into a square layer, which measures the similarity between hard sample pairs by the squared Euclidean distance, expressed as:

$$d_{a,p} = \lVert f(a) - f(p) \rVert_2^2$$
$$d_{a,n} = \lVert f(a) - f(n) \rVert_2^2$$

where a is the anchor sample, p is the hardest positive sample, n is the hardest negative sample, f(·) denotes the extracted feature vector, d_{a,p} is the distance of the hard positive pair, and d_{a,n} is the distance of the hard negative pair;
Step S43, so that the deep convolutional neural network extracts more discriminative pedestrian features and fully exploits the annotation information of the pedestrian samples, cross-entropy loss and triplet loss are both used during training: the deep convolutional neural network that fuses global and local features uses both loss functions, while the deep convolutional neural networks that extract head, upper-body, and lower-body features use only the triplet loss function;
Step S431, the deep convolutional neural network that fuses global and local features uses the cross-entropy loss and the TriHard loss, the latter expressed as:

$$L_{th} = \frac{1}{P \times K} \sum_{a} \left[ \max_{p \in A} d_{a,p} - \min_{n \in B} d_{a,n} + \alpha \right]_+$$

where A is the set of samples with the same ID as the anchor sample a, B is the set of remaining samples with different IDs, L_th is the TriHard loss, L_I is the cross-entropy loss, α in L_th is a manually set margin parameter, q_j is the label probability, and C is 751;
Step S432, the deep convolutional neural networks that extract the head, upper body, and lower body of the pedestrian use the TriHard loss function and, by sharing weight parameters, make the deep convolutional neural network that extracts global pedestrian features pay more attention to discriminative local features, where the loss function L_th is:

$$L_{th} = \frac{1}{P \times K} \sum_{a} \left[ \max_{p \in A} d_{a,p} - \min_{n \in B} d_{a,n} + \alpha \right]_+$$

where A is the set of samples with the same ID as the anchor sample a, B is the set of remaining samples with different IDs, and α is a manually set margin parameter;
Step S433, finally, the losses of the extracted depth-fused features and of the local features are weighted by their corresponding weights to form the total loss, and back-propagation over the whole network updates the network parameters, as shown in FIG. 3;
Further, the per-branch network losses are weighted and combined as follows:

$$L_{total} = \alpha_1 p_c + \alpha_2 p_t + \alpha_3 p_{t_1} + \alpha_4 p_{t_2} + \alpha_5 p_{t_3}$$

where p_c is the cross-entropy loss of the extracted depth-fused features, and p_t, p_{t_1}, p_{t_2}, p_{t_3} are the TriHard losses of the extracted depth-fused feature, head feature, upper-body feature, and lower-body feature respectively; the weight factors α_1, α_2, α_3, α_4, α_5 are each set to 0.2;
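In code, this weighted combination is simply the following (loss values shown as random stand-ins; in training they come from the corresponding network branches):

```python
import torch

# p_c: cross-entropy loss of the depth-fused feature;
# p_t, p_t1, p_t2, p_t3: TriHard losses of the fused, head,
# upper-body, and lower-body features.
p_c, p_t, p_t1, p_t2, p_t3 = (torch.rand(1, requires_grad=True) for _ in range(5))

alphas = (0.2, 0.2, 0.2, 0.2, 0.2)  # all weight factors set to 0.2
total = sum(a * l for a, l in zip(alphas, (p_c, p_t, p_t1, p_t2, p_t3)))
total.backward()  # back-propagate over the whole network
```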
step S5, in the whole process of training the pedestrian re-identification network model, fine tuning is carried out on the pre-training network in a transfer learning mode, and a self-adaptive layer is added in the network;
Step S51, after removing the top softmax layer of the obtained pre-trained convolutional neural network, a pedestrian image is selected and input into the network, a trained classifier is used to compute scores for the several topmost convolutional layers of the network, the layers before the highest-scoring layer are fixed, and the highest-scoring layer and the subsequent network layers are fine-tuned;
Step S52, an adaptive layer is added among the fine-tuned network layers to obtain a better transfer-learning effect by bringing the data distributions of the source and target domains closer, so that the pedestrian re-identification network performs better. Parameter learning of the multi-kernel MMD (MK-MMD) metric is added to the training of the deep convolutional neural network to measure the discrepancy between the source domain and the target domain, where the multi-kernel of the MK-MMD metric is expressed as:

$$K \triangleq \Big\{ k = \sum_{u=1}^{m} \beta_u k_u : \beta_u \ge 0, \ \forall u \Big\}$$

The distribution distance between the source domain and the target domain is expressed as:

$$d_k^2(p, q) = \big\lVert \mathbf{E}_p[\phi(x^s)] - \mathbf{E}_q[\phi(x^t)] \big\rVert_H^2$$

where φ(·) is a mapping of the original variables into a reproducing kernel Hilbert space, and the subscript H indicates that the distance is measured in the reproducing kernel Hilbert space (RKHS) into which φ(·) maps the data;
Step S53, the optimization objective of the adaptive layer consists of a loss term and an adaptation term, expressed as:

$$\min_{\Theta} \ \frac{1}{n_a} \sum_{i=1}^{n_a} J\big(\theta(x_i^a), y_i^a\big) + \lambda \sum_{l=l_1}^{l_2} d_k^2\big(D_s^l, D_t^l\big)$$

where Θ denotes all weight and bias parameters of the network, which are the target parameters to be learned; l_1 to l_2 are the first and last layers of network adaptation (layers before l_1 are not adapted); n_a denotes the amount of all annotated data in the source and target domains; λ is a trade-off weight; and J(·) is the loss function;
Step S6, in the test stage, only the trained network model is used to extract global pedestrian feature vectors. Based on the network model trained in the above steps, the query and target pedestrian images are input to obtain two highly discriminative pedestrian feature vectors;
Step S61, the global pedestrian feature vector extracted by the deep convolutional neural network trained in the above steps is highly discriminative, so only the global pedestrian feature vector is extracted in the test stage of the model;
Step S62, input the query and target pedestrian images into the human key-point and posture-correction network to correct the human posture;
Step S7, based on the global pedestrian feature vectors, compute the similarity between the query pedestrian image and each image in the target pedestrian image dataset; the pedestrian with the highest similarity is considered the same pedestrian, giving the pedestrian re-identification result.
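Step S7 amounts to nearest-neighbour retrieval over the global feature vectors; a minimal sketch follows (Euclidean distance as the similarity measure is an assumption, consistent with the square layer above):

```python
import torch

def rank_gallery(query_feat: torch.Tensor, gallery_feats: torch.Tensor):
    """Return gallery indices sorted from most to least similar to the query.
    query_feat: (d,); gallery_feats: (N, d) global feature vectors."""
    dists = torch.cdist(query_feat.unsqueeze(0), gallery_feats).squeeze(0)
    return torch.argsort(dists)  # smallest distance = highest similarity

query = torch.randn(2048)
gallery = torch.randn(1000, 2048)
ranking = rank_gallery(query, gallery)
best_match = ranking[0]  # rank-1 result: considered the same pedestrian
```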
Specifically, with the Market-1501 pedestrian database as the training and test sets, the pedestrian re-identification method based on transfer learning and depth feature fusion reaches a rank-1 accuracy of 85% and an mAP of 60%. Because the method adopts transfer learning and the deep fusion of global and local pedestrian features during training, it greatly improves the accuracy of pedestrian re-identification, which demonstrates its effectiveness.
In summary, a network model that deeply fuses local and global pedestrian features is trained by transfer learning. In the training stage, hard positive and negative sample pairs are selected from the pedestrian dataset and input into the human skeleton key-point detection network; fourteen key points are detected, the pedestrian posture is corrected, and the image is divided by the fourteen key points into three pedestrian sub-regions. The pedestrian training images containing the hard positive and negative sample pairs are input into the pre-trained network, with each input sample expanded into a whole-body pedestrian image and three sub-region images, to obtain the local and global pedestrian feature vectors. The three local pedestrian feature vectors are input together with the global pedestrian feature vector into one fully connected layer to obtain the depth-fused pedestrian feature vector. During training of the pedestrian re-identification network, the pre-trained network is fine-tuned by transfer learning, and an adaptive layer is added at its top to adapt the source and target domains, bringing their data distributions closer and improving the pedestrian re-identification network. In the test stage, only the global feature network model is used: the query and target pedestrian images are input into the global feature extraction network model to obtain two global feature vectors, and the similarity between the query and target pedestrians is then computed to obtain the recognition result.

Claims (6)

1. A pedestrian re-identification method based on transfer learning and depth feature fusion, characterized by comprising the following steps:
(1) pre-training: pre-training the pre-training model based on the ImageNet on the pedestrian re-recognition data to obtain a pedestrian re-recognition pre-training network model;
(2) human posture correction and segmentation: selecting hard-to-distinguish sample pairs from a pedestrian dataset, inputting them into a human skeleton key-point detection network, detecting fourteen key points, correcting the human posture and segmenting local pedestrian ROIs, and obtaining data-enhanced hard sample pairs and corrected global and local images;
(3) feature vector: inputting corrected global and local images and difficult-to-separate sample pairs with enhanced data into a pedestrian re-identification pre-training network model to obtain pedestrian local and global feature vectors;
(4) depth feature fusion: depth feature fusion is carried out on the local feature vector and the global feature vector of the pedestrian, and a final pedestrian feature vector is obtained;
(5) training a model: fine tuning the pedestrian re-recognition pre-training network model by adopting a transfer learning mode and the final pedestrian feature vector in the step (4), and adding a self-adaptive layer into the pedestrian re-recognition pre-training network model to obtain the pedestrian re-recognition network model;
(6) test model: inputting inquiry pedestrian and target pedestrian images, and extracting two distinguishable pedestrian global feature vectors by using a pedestrian re-recognition network model;
(7) recognition result: based on the pedestrian global feature vector in the step (6), calculating the similarity between the query pedestrian and any one image in the target pedestrian data set, wherein the pedestrian with the highest similarity is considered to be the same pedestrian;
the step (1) is divided into the following steps:
(1.1) acquiring a depth convolution network model trained in advance on an ImageNet data set, and training the depth convolution network model on pedestrian re-identification data;
(1.2) when the deep convolutional neural network model is pre-trained on the pedestrian re-recognition data, only sample marking information is utilized to finely tune the deep convolutional neural network model;
the step (1.2) is divided into the following steps:
(1.2.1) removing the top full connection layer from the pre-trained ResNet50 network model on the ImageNet dataset, and adding two full connection layers and one softmax layer after the maximum pooling layer;
(1.2.2) finely adjusting the constructed deep convolutional neural network by using label information marked by pedestrian images, and fixing the first three layers of the deep convolutional neural network in the fine adjustment process;
(1.2.3) obtaining the prediction probability of the pedestrian global image according to the deep convolutional neural network;
(1.2.4) defining a loss function in the deep convolutional neural network according to the predictive probability;
the step (2) is divided into the following steps:
(2.1) randomly selecting P ID pedestrians from each training batch, randomly selecting K different images from each pedestrian, wherein each batch contains P multiplied by K pedestrian images;
(2.2) taking each image in the training batch as an anchor sample H_n and selecting the hardest positive sample H_p and the hardest negative sample H_q to form a triplet with H_n, wherein the hard sample pair is chosen such that the positive-pair distance d(H_n, H_p) is maximal and the negative-pair distance d(H_n, H_q) is minimal;
(2.3) inputting the pedestrian images of the positive and negative sample triplets which are difficult to separate into a human skeleton key point detection network, respectively detecting fourteen human skeleton key points, including a head, four limbs, an upper half body and a lower half body, and correcting the human body posture by taking the fourteen key points as coordinates;
(2.4) dividing the pedestrian global image into three pedestrian local ROI images of the head, the upper body and the lower body according to fourteen human skeleton key points, and obtaining a corrected pedestrian global image and three pedestrian local images;
the step (6) is divided into the following steps:
(6.1) inputting the query and target pedestrian images into the human key point and posture correction network to correct the human body posture;
(6.2) inputting the posture-corrected pedestrian images into the pedestrian re-recognition network model to obtain the global pedestrian feature vectors;
in the step (5), parameter learning of the multi-kernel MMD (maximum mean discrepancy) metric is added to the training of the deep convolutional neural network to measure the difference between the source domain and the target domain, wherein the multi-kernel of the MK-MMD metric is expressed as:

$$\mathcal{K} \triangleq \Big\{ k = \sum_{u=1}^{m} \beta_u k_u : \beta_u \ge 0,\ \forall u \Big\}$$

the distribution distance between the source domain and the target domain is expressed as:

$$d_k^2(p, q) = \big\| \mathbf{E}_p[\phi(x^s)] - \mathbf{E}_q[\phi(x^t)] \big\|_{\mathcal{H}}^2$$

wherein φ(·) is the mapping that sends the original variables into the reproducing kernel Hilbert space, and the subscript H indicates that the distance is measured after the data are mapped by φ(·) into the RKHS;

the optimization objective of the adaptive layer consists of the loss function and the adaptation loss, expressed as:

$$\min_{\Theta}\ \frac{1}{n_a} \sum_{i=1}^{n_a} J\big(\theta(x_i^a), y_i^a\big) + \lambda \sum_{l=l_1}^{l_2} d_k^2\big(\mathcal{D}_s^l, \mathcal{D}_t^l\big)$$

wherein Θ denotes all the weight and bias parameters of the network, which are the parameters to be learned; l_1 to l_2 are the first and last adapted layers of the network (layers before l_1 are not adapted); λ is the penalty weight of the adaptation term; n_a denotes the number of annotated samples from the source and target domains; and J(·) is the loss function (a multi-kernel MMD sketch follows).
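A hedged sketch of the multi-kernel MMD penalty described above, using a fixed family of Gaussian kernels with equal weights β_u; the bandwidth family is an assumption, the β_u are not learned here (the patent calls for parameter learning of the metric), and the biased MMD² estimator is used for brevity.

```python
import torch

def mk_mmd(source: torch.Tensor, target: torch.Tensor,
           bandwidths=(1.0, 2.0, 4.0, 8.0, 16.0)) -> torch.Tensor:
    """Biased MMD^2 estimate between source (n, d) and target (m, d) feature
    batches under an equal-weight combination of Gaussian kernels."""
    def kernel(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        d2 = torch.cdist(a, b).pow(2)  # squared Euclidean distances
        # k(a, b) = sum_u beta_u * exp(-d2 / (2 * sigma_u^2)) with beta_u = 1/m
        return sum(torch.exp(-d2 / (2.0 * s * s)) for s in bandwidths) / len(bandwidths)

    return (kernel(source, source).mean()
            + kernel(target, target).mean()
            - 2.0 * kernel(source, target).mean())
```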
2. The pedestrian re-recognition method based on transfer learning and depth feature fusion as claimed in claim 1, wherein: in the training stage, the input to the pedestrian re-recognition network model is the triplet pedestrian images.
3. The pedestrian re-recognition method based on transfer learning and depth feature fusion as claimed in claim 1, wherein: the corrected global and local images obtained in the step (2) are the pedestrian images of the hard positive and negative sample triplets, namely the posture-corrected global pedestrian images and the local ROI images.
4. The pedestrian re-recognition method based on transfer learning and depth feature fusion as claimed in claim 1, wherein: in the step (2.2), the deep convolutional neural network model pre-trained in step (1.1) scores the candidate samples: the pedestrian image with the lowest score among images of the same pedestrian ID as the anchor sample H_n is selected to form the hard positive pair, and the pedestrian image with the highest score among images of different pedestrian IDs is selected to form the hard negative pair with the anchor sample H_n.
5. The pedestrian re-recognition method based on transfer learning and depth feature fusion as claimed in claim 1, wherein: the step (3) is divided into the following steps:
(3.1) acquiring the pre-trained deep convolutional neural network model from step (1.1) and the corrected global and local images from step (2), and removing the softmax layer and the fully connected layer at the top of the pre-trained model;
(3.2) inputting the data-enhanced hard sample pairs and the corrected global and local images into the deep convolutional neural network model constructed in step (3.1) to obtain the global and local pedestrian feature vectors (as sketched below).
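Continuing the hypothetical build_pretrain_model helper from the earlier sketch, step (3.1) amounts to keeping only the backbone and discarding the FC/softmax head; the input resolution and batch size below are placeholders.

```python
import torch

model = build_pretrain_model()   # hypothetical helper from the earlier sketch
feature_extractor = model[0]     # (3.1): drop the FC + softmax head, keep the backbone
feature_extractor.eval()

with torch.no_grad():            # (3.2): images -> 2048-d feature vectors
    feats = feature_extractor(torch.randn(8, 3, 256, 128))  # dummy batch, shape (8, 2048)
```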
6. The pedestrian re-recognition method based on transfer learning and depth feature fusion as claimed in claim 1, wherein: the step (4) is divided into the following steps:
(4.1) inputting the local and global pedestrian feature vectors from step (3) into a fully connected layer for depth feature fusion, and outputting the fused pedestrian feature vector;
(4.2) inputting the fused pedestrian feature vector and the local pedestrian feature vectors from step (3.2) into a square layer, which measures the similarity between the hard sample pairs by the squared Euclidean distance (see the sketch below).
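An illustrative sketch of steps (4.1)-(4.2): the local and global feature vectors are fused through a fully connected layer, and a "square layer" compares a hard sample pair by squared Euclidean distance. All dimensions, and the exact pairing of inputs to the square layer, are assumptions.

```python
import torch
import torch.nn as nn

class FusionWithSquareLayer(nn.Module):
    def __init__(self, local_dim: int = 3 * 2048, global_dim: int = 2048,
                 fused_dim: int = 2048):
        super().__init__()
        # (4.1): depth feature fusion of concatenated local + global features
        self.fuse = nn.Linear(local_dim + global_dim, fused_dim)

    def forward(self, local_feats, global_feat, paired_feat):
        fused = self.fuse(torch.cat([local_feats, global_feat], dim=1))
        # (4.2): square layer - squared Euclidean distance to the paired sample
        sq_dist = (fused - paired_feat).pow(2).sum(dim=1)
        return fused, sq_dist
```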
CN201910329733.2A 2019-04-23 2019-04-23 Pedestrian re-recognition method based on transfer learning and depth feature fusion Active CN110163110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910329733.2A CN110163110B (en) 2019-04-23 2019-04-23 Pedestrian re-recognition method based on transfer learning and depth feature fusion

Publications (2)

Publication Number Publication Date
CN110163110A CN110163110A (en) 2019-08-23
CN110163110B (en) 2023-06-06

Family

ID=67639792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910329733.2A Active CN110163110B (en) 2019-04-23 2019-04-23 Pedestrian re-recognition method based on transfer learning and depth feature fusion

Country Status (1)

Country Link
CN (1) CN110163110B (en)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110569779B (en) * 2019-08-28 2022-10-04 西北工业大学 Pedestrian attribute identification method based on pedestrian local and overall attribute joint learning
CN110533184B (en) * 2019-08-31 2023-01-06 南京人工智能高等研究院有限公司 Network model training method and device
CN110555420B (en) * 2019-09-09 2022-04-12 电子科技大学 Fusion model network and method based on pedestrian regional feature extraction and re-identification
CN110674881B (en) * 2019-09-27 2022-02-11 长城计算机软件与系统有限公司 Trademark image retrieval model training method, system, storage medium and computer equipment
CN110688976A (en) * 2019-10-09 2020-01-14 创新奇智(北京)科技有限公司 Store comparison method based on image identification
CN110705499B (en) * 2019-10-12 2020-06-02 成都考拉悠然科技有限公司 Crowd counting method based on transfer learning
CN110795580B (en) * 2019-10-23 2023-12-08 武汉理工大学 Vehicle weight identification method based on space-time constraint model optimization
CN112784630A (en) * 2019-11-06 2021-05-11 广东毓秀科技有限公司 Method for re-identifying pedestrians based on local features of physical segmentation
CN111126198B (en) * 2019-12-11 2023-05-09 中山大学 Pedestrian re-identification method based on deep representation learning and dynamic matching
CN112990424A (en) * 2019-12-17 2021-06-18 杭州海康威视数字技术股份有限公司 Method and device for training neural network model
CN111126599B (en) * 2019-12-20 2023-09-05 复旦大学 Neural network weight initialization method based on transfer learning
CN111126275B (en) * 2019-12-24 2023-05-05 广东省智能制造研究所 Pedestrian re-identification method and device based on multi-granularity feature fusion
CN111160295B (en) * 2019-12-31 2023-05-12 广州视声智能科技有限公司 Video pedestrian re-recognition method based on region guidance and space-time attention
CN111274922B (en) * 2020-01-17 2022-11-29 山东师范大学 Pedestrian re-identification method and system based on multi-level deep learning network
CN111401265B (en) * 2020-03-19 2020-12-25 重庆紫光华山智安科技有限公司 Pedestrian re-identification method and device, electronic equipment and computer-readable storage medium
CN111428650B (en) * 2020-03-26 2024-04-02 北京工业大学 Pedestrian re-recognition method based on SP-PGGAN style migration
CN111539257B (en) * 2020-03-31 2022-07-26 苏州科达科技股份有限公司 Person re-identification method, device and storage medium
CN111428675A (en) * 2020-04-02 2020-07-17 南开大学 Pedestrian re-recognition method integrated with pedestrian posture features
CN111582154A (en) * 2020-05-07 2020-08-25 浙江工商大学 Pedestrian re-identification method based on multitask skeleton posture division component
CN111696056B (en) * 2020-05-25 2023-05-02 五邑大学 Digital archive image correction method based on multitasking transfer learning
CN111881842A (en) * 2020-07-30 2020-11-03 深圳力维智联技术有限公司 Pedestrian re-identification method and device, electronic equipment and storage medium
CN111967389B (en) * 2020-08-18 2022-02-18 厦门理工学院 Face attribute recognition method and system based on deep double-path learning network
CN112132200A (en) * 2020-09-17 2020-12-25 山东大学 Lithology identification method and system based on multi-dimensional rock image deep learning
CN112183438B (en) * 2020-10-13 2022-11-04 深圳龙岗智能视听研究院 Image identification method for illegal behaviors based on small sample learning neural network
CN112989911A (en) * 2020-12-10 2021-06-18 奥比中光科技集团股份有限公司 Pedestrian re-identification method and system
CN112990144B (en) * 2021-04-30 2021-08-17 德鲁动力科技(成都)有限公司 Data enhancement method and system for pedestrian re-identification
CN113221770A (en) * 2021-05-18 2021-08-06 青岛根尖智能科技有限公司 Cross-domain pedestrian re-identification method and system based on multi-feature hybrid learning
CN113379627B (en) * 2021-06-07 2023-06-27 北京百度网讯科技有限公司 Training method of image enhancement model and method for enhancing image
CN113377991B (en) * 2021-06-10 2022-04-15 电子科技大学 Image retrieval method based on most difficult positive and negative samples
CN113378729A (en) * 2021-06-16 2021-09-10 西安理工大学 Pose embedding-based multi-scale convolution feature fusion pedestrian re-identification method
CN113591864B (en) * 2021-07-28 2023-04-07 北京百度网讯科技有限公司 Training method, device and system for text recognition model framework
CN114863138B (en) * 2022-07-08 2022-09-06 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3532993A4 (en) * 2016-10-25 2020-09-30 Deep North, Inc. Point to set similarity comparison and deep feature learning for visual recognition

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766791A (en) * 2017-09-06 2018-03-06 北京大学 A kind of pedestrian based on global characteristics and coarseness local feature recognition methods and device again
CN107832672A (en) * 2017-10-12 2018-03-23 北京航空航天大学 A kind of pedestrian's recognition methods again that more loss functions are designed using attitude information
CN108334849A (en) * 2018-01-31 2018-07-27 中山大学 A kind of recognition methods again of the pedestrian based on Riemann manifold
CN109002761A (en) * 2018-06-13 2018-12-14 中山大学新华学院 A kind of pedestrian's weight identification monitoring system based on depth convolutional neural networks
CN109271870A (en) * 2018-08-21 2019-01-25 平安科技(深圳)有限公司 Pedestrian recognition methods, device, computer equipment and storage medium again
CN109446898A (en) * 2018-09-20 2019-03-08 暨南大学 A kind of recognition methods again of the pedestrian based on transfer learning and Fusion Features

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"GLAD: Global-Local-Alignment Descriptor for Scalable Person Re-Identification";Longhui Wei;《IEEE Transactions on Multimedia》;20180914;986-999页 *
"基于深度学习的行人再识别研究综述";徐梦洋;《计算机科学》;20181031;119-122页 *

Similar Documents

Publication Publication Date Title
CN110163110B (en) Pedestrian re-recognition method based on transfer learning and depth feature fusion
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN108960140B (en) Pedestrian re-identification method based on multi-region feature extraction and fusion
US20200285896A1 (en) Method for person re-identification based on deep model with multi-loss fusion training strategy
CN107862705B (en) Unmanned aerial vehicle small target detection method based on motion characteristics and deep learning characteristics
CN109598268A (en) A kind of RGB-D well-marked target detection method based on single flow depth degree network
CN110276264B (en) Crowd density estimation method based on foreground segmentation graph
CN109190446A (en) Pedestrian's recognition methods again based on triple focused lost function
CN113033520B (en) Tree nematode disease wood identification method and system based on deep learning
CN109284767B (en) Pedestrian retrieval method based on augmented sample and multi-flow layer
CN112347888A (en) Remote sensing image scene classification method based on bidirectional feature iterative fusion
CN110390308B (en) Video behavior identification method based on space-time confrontation generation network
CN108960142B (en) Pedestrian re-identification method based on global feature loss function
CN109635726B (en) Landslide identification method based on combination of symmetric deep network and multi-scale pooling
CN112966647A (en) Pedestrian re-identification method based on layer-by-layer clustering and enhanced discrimination
JP2022082493A (en) Pedestrian re-identification method for random shielding recovery based on noise channel
CN112966740A (en) Small sample hyperspectral image classification method based on core sample adaptive expansion
CN112070010B (en) Pedestrian re-recognition method for enhancing local feature learning by combining multiple-loss dynamic training strategies
CN112084895B (en) Pedestrian re-identification method based on deep learning
CN115830643B (en) Light pedestrian re-recognition method based on posture guiding alignment
CN112446305A (en) Pedestrian re-identification method based on classification weight equidistant distribution loss model
CN113762009A (en) Crowd counting method based on multi-scale feature fusion and double-attention machine mechanism
CN113763417B (en) Target tracking method based on twin network and residual error structure
CN115376159A (en) Cross-appearance pedestrian re-recognition method based on multi-mode information
Liu et al. A novel deep transfer learning method for sar and optical fusion imagery semantic segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant