CN111027421A - Graph-based direct-push type semi-supervised pedestrian re-identification method - Google Patents
Graph-based direct-push type semi-supervised pedestrian re-identification method
- Publication number
- CN111027421A (application number CN201911173132.3A)
- Authority
- CN
- China
- Prior art keywords
- model
- pedestrian
- data
- loss function
- graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Abstract
The invention discloses a graph-based direct-push (transductive) semi-supervised pedestrian re-identification method, belonging to the technical field of computer-vision pedestrian re-identification. First, a two-channel model is trained on labeled pedestrian data to obtain a base model. The base model extracts features from the unlabeled pedestrian data, a graph model is built on the extracted features, pseudo labels are assigned to the unlabeled pedestrian data according to the graph model, and positive and negative sample pairs are constructed from the labeled pedestrian data and the pseudo-labeled unlabeled pedestrian data. The graph model is then used to assign confidences to the positive and negative sample pairs, which are used jointly to fine-tune the base model. The difficulty and confidence of the positive and negative sample pairs are gradually increased, and the base model is trained to full convergence with a course (curriculum) learning method. The resulting final model performs feature extraction and feature matching on the verification set data, and pedestrian re-identification is completed according to the matching result. The method reduces the negative influence of false pseudo labels, improves the robustness of the model, and thereby improves the accuracy of pedestrian re-identification.
Description
Technical Field
The invention belongs to the technical field of computer vision pedestrian re-identification, and particularly relates to a direct-push type semi-supervised pedestrian re-identification method based on a graph.
Background
Pedestrian re-identification is a technique that uses computer vision to determine whether a particular pedestrian is present in an image or video sequence. It is widely regarded as a sub-problem of image retrieval: given an image of a monitored pedestrian, images of the same pedestrian are retrieved across devices. It aims to compensate for the limited field of view of existing fixed cameras, can be combined with pedestrian detection and pedestrian tracking, and is widely applicable to fields such as intelligent video surveillance and intelligent security. Pedestrian re-identification requires the machine to identify all images of a particular person taken by different cameras. Specifically, it is a person-matching technique based on the overall appearance of pedestrians: given one or more pictures of a person (query images), it finds the pictures belonging to that person among many pictures (gallery images). The technology has high application value in public security and criminal investigation, image retrieval, and other scenarios. It can also help mobile phone users cluster photo albums and help retailers and supermarket operators obtain useful customer trajectories and mine their commercial value. However, the accuracy of existing pedestrian re-identification technology is not high, and much of the work still depends on a large amount of manual labor.
Pedestrian re-identification is an important and challenging task. Because the time and place of image capture are random, the lighting, viewing angle, and pose vary, and pedestrian images are also easily affected by factors such as detection accuracy and occlusion, research on pedestrian re-identification faces considerable challenges in practical applications. Most pedestrian re-identification algorithms use fully supervised convolutional neural networks. However, a neural network with good generalization performance often requires tens of thousands of labeled training samples. Whereas classification and face recognition datasets have grown to millions of samples, most pedestrian re-identification datasets contain fewer than 2000 people, with tens of images per person. Because labeled pedestrian samples are expensive to obtain, pedestrian re-identification based on semi-supervised learning is more valuable in practical applications.
Research in the field of pedestrian re-identification mainly follows two directions. Feature representation methods study the pedestrian object in order to extract more robust discriminative features to represent pedestrians. Distance metric learning methods learn a discriminative distance metric function so that the distance between images of the same person is smaller than the distance between images of different pedestrians. The core goal of image-based pedestrian re-identification is, for a specified pedestrian image, to find the most similar pedestrian images in a candidate set of N pedestrian images. To distinguish pedestrians with different identities, pedestrian re-identification needs to extract discriminative pedestrian feature descriptors. In daily life, humans usually judge whether two images show the same pedestrian by clothing, whereas in intelligent multi-camera surveillance systems pedestrian appearance often changes dramatically with lighting, walking pose, and camera view. Extracting robust descriptors under such severe appearance changes is a technical difficulty that still needs to be solved, and it greatly limits existing methods in practical applications.
Disclosure of Invention
In order to solve the above problems, an object of the present invention is to provide a graph-based direct-push semi-supervised pedestrian re-identification method, which reduces negative effects caused by false labels, improves robustness of a model, and further improves accuracy of pedestrian re-identification.
The invention is realized by the following technical scheme:
A graph-based direct-push semi-supervised pedestrian re-identification method comprises the following steps:
Step 1: train a two-channel model on the labeled pedestrian data to obtain a base model;
Step 2: extract features from the unlabeled pedestrian data with the base model, build a graph model on the extracted features, assign pseudo labels to the unlabeled pedestrian data according to the graph model, and construct positive and negative sample pairs from the labeled pedestrian data and the pseudo-labeled unlabeled pedestrian data;
Step 3: use the graph model to assign confidences to the positive and negative sample pairs, and fine-tune the base model obtained in step 1 with the confidence-weighted positive and negative sample pairs;
Step 4: repeat steps 2 and 3, gradually increasing the difficulty and confidence of the positive and negative sample pairs, and train the base model with a course (curriculum) learning method until it fully converges, giving the final model;
Step 5: perform feature extraction and feature matching on the verification set data with the final model, and complete pedestrian re-identification according to the matching result.
Preferably, step 1 is specifically as follows. The labeled pedestrian data is denoted $X_L$ and the unlabeled pedestrian data $X_U$. Sample pairs that share a label are defined as positive sample pairs, and sample pairs that do not share a label are defined as negative sample pairs. In the two-channel model, one channel is a ResNet50 whose parameters are obtained by learning; it is the "student" model. The other channel is also a ResNet50, whose parameters are computed from the "student" model by an exponential moving average; it is the "teacher" model. The calculation formula is:
$\theta'_t = \alpha\,\theta'_{t-1} + (1-\alpha)\,\theta_t$
where $\theta_t$ denotes the "student" model parameters, $\theta'_t$ the "teacher" model parameters, and $\alpha$ the smoothing coefficient. The loss function used by the two-channel model has three parts: a feature-based consistency loss function, a triplet loss function, and a cross-entropy loss function.
In the consistency loss function $L_{CL}$, $N$ is the number of samples, $\|\cdot\|^2$ is the squared L2 norm, and $\eta_i$ and $\eta'_i$ are two different noises.
In the triplet loss function, $N$ is the number of triplets, $f_\theta(\cdot)$ is the feature that the "student" model extracts from a pedestrian image, $\theta$ is the parameter of the "student" model, and $\alpha$ is the margin (boundary) parameter of the triplet loss function.
In the cross-entropy loss function, $\sigma$ is the standard softmax cross-entropy loss function, $N$ is the number of labeled pedestrian data in the current training batch, $y_i$ is the label of the labeled pedestrian data in the current training batch, and $\omega$ is the parameter of the last fully connected layer of the "student" model.
A hyperparameter $\lambda_0$ combines the triplet loss function and the cross-entropy loss function into the fully supervised loss function $L_{SL}$.
A hyperparameter $\lambda_1$ combines the fully supervised loss function $L_{SL}$ and the consistency learning loss function $L_{CL}$ into the final loss function for the labeled pedestrian data:
$L_{SL\text{-}CL} = L_{SL} + \lambda_1 L_{CL}$
With this final loss function as the constraint, the two-channel model is trained on the labeled pedestrian data to obtain the base model.
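The exponential moving average that produces the "teacher" parameters can be illustrated with a minimal sketch. Parameters are represented here as flat lists of floats, and the value `alpha=0.9` is an assumption chosen only for the example; the text does not state the smoothing coefficient.

```python
def ema_update(teacher, student, alpha):
    """Return new teacher parameters: alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


# Hypothetical two-parameter model; alpha = 0.9 is an assumed value.
teacher = [0.0, 1.0]
student = [1.0, 3.0]
teacher = ema_update(teacher, student, alpha=0.9)
```

In a real training loop this update would run after each optimizer step on the "student", so the "teacher" follows the "student" smoothly and is never updated by gradients itself.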
Preferably, step 2 is specifically as follows. Features are extracted from the unlabeled pedestrian data with the base model, and a directed KNN graph $G = (V, E)$ is constructed from these features, where:
$V = \{ v_i = f_\theta(x_i) \mid x_i \in X_U \}$
$E = \{ e_{ij} = P(v_i, v_j) \mid v_j \in N_k(v_i) \}$
Here $N_k(v_i)$ is the set of $k$ nearest neighbors of vertex $v_i$, and $P(v_i, v_j)$ is the directed edge from $v_i$ to $v_j$. Within each closed loop formed by several edges of the KNN graph, all pairwise combinations of the unlabeled pedestrian data in the loop are selected as positive sample pairs $C_t$, where $t$ is the number of edges of the closed loop. For an anchor sample in the labeled pedestrian data, positive samples are taken from labeled data with the same label, and negative samples are taken from labeled data with a different label and from the unlabeled data. For an anchor sample in the unlabeled pedestrian data, positive sample pairs are taken from $C_t$ and negative sample pairs are taken from the labeled pedestrian data. The difficulty of the mined negative samples is then raised by a hard-negative selection rule.
In this rule, $\min(\cdot)$ selects the sample pair with the smallest Euclidean distance and $D(\cdot)$ computes the Euclidean distance. The rule involves the $i$-th unlabeled pedestrian sample in $C_t$, the labeled pedestrian data selected from $X_L$, and the unlabeled pedestrian data selected from $C_t$ within the current training batch. For an anchor sample belonging to $C_t$, $N_i$ is the negative sample selected from $X_L$, a further negative sample is selected from $C_t$, and $c$ is a constant used to control the confidence of the negative sample pair.
Preferably, the difficulty of mining negative samples is boosted by gradually decreasing the constant c.
Preferably, step 3 is specifically as follows. After the positive and negative sample pairs are obtained, a triplet loss is constructed from them.
In this loss, $N$ is the number of triplets in the current training batch, and each batch contains equal numbers of unlabeled and labeled data. Triplets of unlabeled data are sampled from $C_t$, $N_i$, and the negative samples selected from $C_t$; triplets of labeled data are sampled with a standard triplet sampling strategy. The graph model assigns the triplet confidences, with $s_i$ the confidence of the $i$-th triplet. For triplets of labeled data, $s_i$ is set to the constant 1. For triplets of unlabeled data, $D_i$ denotes the number of times the $i$-th sample pair is repeated in the graph model, and the largest repetition count in the graph model is $D_{\max} = \max(\{D_i\})$. A negative sample pair drawn from the negative samples selected from $C_t$ uses the constant $c$, which controls the confidence of negative sample pairs, as its confidence; a negative sample pair drawn from $N_i$ uses $c$ as its confidence. The final triplet confidence for unlabeled data is defined in terms of these quantities and $\alpha$, where $\alpha$ is the lowest confidence of a positive sample pair of unlabeled data selected from the graph model. The direct-push metric learning loss function $L_{TSML}$ is defined on the triplets weighted by these confidences.
A hyperparameter $\lambda_1$ combines the direct-push metric learning loss function with the consistency learning loss function, giving the final loss function:
$L_{TSML\text{-}CL} = L_{TSML} + \lambda_1 L_{CL}$
Based on this loss function, the base model obtained in step 1 is jointly fine-tuned with the confidence-weighted positive and negative sample pairs, improving the performance of the model.
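How per-triplet confidences can enter a triplet loss is sketched below. This is an illustration under stated assumptions, not the patent's formula: the loss is taken as the mean of `s_i * max(0, D(a, p) - D(a, n) + margin)` with unsquared Euclidean distance, and the margin value 0.3 and the function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def weighted_triplet_loss(triplets, confidences, margin):
    """Mean over triplets of s_i * max(0, D(a, p) - D(a, n) + margin)."""
    total = 0.0
    for (anchor, pos, neg), s in zip(triplets, confidences):
        hinge = max(0.0, euclidean(anchor, pos) - euclidean(anchor, neg) + margin)
        total += s * hinge
    return total / len(triplets)


# One labeled triplet (confidence 1) and one pseudo-labeled triplet (confidence 0.5).
triplets = [
    ([0.0, 0.0], [3.0, 4.0], [6.0, 8.0]),  # negative already far enough: no loss
    ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0]),  # positive and negative equally far
]
loss = weighted_triplet_loss(triplets, confidences=[1.0, 0.5], margin=0.3)
```

A low-confidence triplet contributes proportionally less gradient, which is how a wrong pseudo label is kept from dominating the fine-tuning.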
Further preferably, the value range of the constant c is 0.5-1.
Compared with the prior art, the invention has the following beneficial technical effects:
The graph-based direct-push semi-supervised pedestrian re-identification method of the invention combines deep convolutional neural network technology with a graph model. A graph model is first constructed for the unlabeled data; hard sample mining is performed on the unlabeled data using the graph model; the graph model also assigns a confidence to each hard sample; and the model is optimized using the pseudo labels together with their confidences. Compared with traditional algorithms such as KNN and K-means, the graph-model-based method can obtain more accurate pseudo labels. Introducing confidence also makes the pseudo labels more stable in use: for example, the influence of a low-confidence sample on the model optimization process is limited. The hard sample mining method can mine harder positive and negative sample pairs, which maximizes the feature representation capability of the model when used in metric learning. Traditional methods are only suitable for mining labeled hard samples, whereas the hard sample mining method provided by the invention is also suitable for unlabeled data. It improves on traditional hard sample mining by using prior knowledge of the data distribution, so that the positive and negative sample pairs are more sufficient and the performance of the model can be further improved.
In addition, consistency learning is introduced and improved: the model is optimized under the assumption that the same data with different added noise should have consistent features. For the unlabeled data, this feature-consistency assumption constrains the optimization process, so that the "teacher" model and the "student" model learn from each other to optimize their parameters, convergence is accelerated, and the performance of both models improves. A course (curriculum) learning method is also introduced, in which simple knowledge (samples) is learned first and harder knowledge (samples) is added gradually. First, the base model is fine-tuned using the high-confidence unlabeled data and the labeled data. Once the performance of the model has improved, the confidence of the unlabeled data is updated; the updated confidence information is more reliable than before. The model is then fine-tuned again using the labeled data and the unlabeled data with updated confidence, and this is repeated until the model converges, giving the final model. A meaningful ordering of the training data maximizes the improvement in model performance. The method reduces the negative influence of false pseudo labels, improves the robustness of the model, and thereby improves the accuracy of pedestrian re-identification.
Drawings
FIG. 1 is a flow chart of the present invention;
fig. 2 is a schematic diagram of a direct-push metric learning network structure according to the present invention.
Detailed Description
The invention will be described in further detail with reference to the following drawings and examples, which are given by way of illustration and not by way of limitation.
Fig. 1 is a logic block diagram of the flow of the present invention. The graph-based direct-push semi-supervised pedestrian re-identification method of the present invention includes the following steps:
Step 1: construct triplets from the labeled pedestrian data;
Step 2: train the two-channel model shown in Fig. 2 to obtain a base model;
Step 3: extract features from the unlabeled pedestrian data with the base model;
Step 4: build a graph model on the extracted features of the unlabeled pedestrian data;
Step 5: assign pseudo labels with confidences to the unlabeled pedestrian data according to the graph model, and construct triplets for the unlabeled pedestrian data;
Step 6: jointly fine-tune the model with the labeled pedestrian data triplets and the pseudo-labeled unlabeled pedestrian data triplets;
Step 7: repeat steps 2, 3, 4, 5, and 6, training the model until it fully converges, to obtain the final model;
Step 8: perform feature extraction and feature matching on the verification set data with the final model, and complete pedestrian re-identification according to the matching result.
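The feature matching in the last step can be pictured as ranking gallery features by their distance to a query feature. The sketch below assumes Euclidean distance, which the text uses for sample mining but does not explicitly name for matching; the function name `rank_gallery` and the toy features are hypothetical.

```python
import math


def rank_gallery(query, gallery):
    """Return gallery indices sorted by ascending Euclidean distance to the query."""
    def distance(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    dists = [distance(query, g) for g in gallery]
    return sorted(range(len(gallery)), key=lambda i: dists[i])


# Toy 2-D features; real features would be the 2048-D outputs of the final model.
ranking = rank_gallery([0.0, 0.0], [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]])
```

The first index in the ranking is the gallery image judged most likely to show the same pedestrian as the query.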
Specifically, the labeled pedestrian data is denoted $X_L$ and the unlabeled pedestrian data $X_U$. Sample pairs that share a label are defined as positive sample pairs, and sample pairs that do not share a label are defined as negative sample pairs. In the two-channel model, one channel is a ResNet50 whose parameters are obtained by learning; it is the "student" model. The other channel is also a ResNet50, whose parameters are computed from the "student" model by an exponential moving average (EMA); it is the "teacher" model. The calculation formula is:
$\theta'_t = \alpha\,\theta'_{t-1} + (1-\alpha)\,\theta_t$
where $\theta_t$ denotes the "student" model parameters, $\theta'_t$ the "teacher" model parameters, and $\alpha$ the smoothing coefficient; the model framework is shown in Fig. 2. The loss function used by the two-channel model has three parts: a feature-based consistency loss function, a triplet loss function, and a cross-entropy loss function, shown in Fig. 2 as Consistency Loss, Triplet Loss, and Class Loss respectively.
In the consistency loss function $L_{CL}$, $N$ is the number of samples, $\|\cdot\|^2$ is the squared L2 norm, and $\eta_i$ and $\eta'_i$ are two different noises.
In the triplet loss function, $N$ is the number of triplets, $f_\theta(\cdot)$ is the feature that the "student" model extracts from a pedestrian image, $\theta$ is the parameter of the "student" model, and $\alpha$ is the margin (boundary) parameter of the triplet loss function.
In the cross-entropy loss function, $\sigma$ is the standard softmax cross-entropy loss function, $N$ is the number of labeled pedestrian data in the current training batch, $y_i$ is the label of the labeled pedestrian data in the current training batch, and $\omega$ is the parameter of the last fully connected layer of the "student" model.
A hyperparameter $\lambda_0$, set to 0.1 during model training, combines the triplet loss function and the cross-entropy loss function into the fully supervised loss function $L_{SL}$.
A hyperparameter $\lambda_1$ combines the fully supervised loss function $L_{SL}$ and the consistency learning loss function $L_{CL}$ into the final loss function for the labeled pedestrian data:
$L_{SL\text{-}CL} = L_{SL} + \lambda_1 L_{CL}$
With this final loss function as the constraint, the two-channel model is trained on the labeled pedestrian data to obtain the base model. This corresponds to the labeled-data triplet construction and model training blocks in Fig. 1.
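The consistency term and its combination with the supervised loss can be sketched numerically. The sketch assumes the consistency loss is the mean squared L2 distance between the "student" and "teacher" features of the same images under two different noises, which matches the symbol definitions above but is not confirmed by a formula in the text; the value `lambda_1=0.5` and the feature values are hypothetical.

```python
def consistency_loss(student_feats, teacher_feats):
    """Mean squared L2 distance between paired student and teacher features."""
    total = 0.0
    for sf, tf in zip(student_feats, teacher_feats):
        total += sum((s - t) ** 2 for s, t in zip(sf, tf))
    return total / len(student_feats)


def labeled_loss(l_sl, l_cl, lambda_1):
    """L_SL-CL = L_SL + lambda_1 * L_CL, as given in the text."""
    return l_sl + lambda_1 * l_cl


# Features of the same two images seen by the student and the teacher,
# each under its own noise (values are made up for the example).
student_feats = [[1.0, 2.0], [0.0, 0.0]]
teacher_feats = [[1.0, 0.0], [3.0, 4.0]]
l_cl = consistency_loss(student_feats, teacher_feats)
total = labeled_loss(l_sl=2.0, l_cl=l_cl, lambda_1=0.5)
```

Because the consistency term needs no labels, the same function can be applied unchanged to unlabeled batches during the later fine-tuning stage.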
Features are extracted from the unlabeled pedestrian data with the base model, and a directed KNN graph $G = (V, E)$ is constructed from these features, using $K = 4$ as the number of neighbors. In $G$:
$V = \{ v_i = f_\theta(x_i) \mid x_i \in X_U \}$
$E = \{ e_{ij} = P(v_i, v_j) \mid v_j \in N_k(v_i) \}$
Here $N_k(v_i)$ is the set of $k$ nearest neighbors of vertex $v_i$, and $P(v_i, v_j)$ is the directed edge from $v_i$ to $v_j$. If a vertex can be connected back to itself through $t$ directed edges, for example $e_{ij} \to e_{jk} \to e_{kl} \to e_{li}$, the vertices on that path are said to form a "ring", and $t$ is the order of the ring. Any pair of samples in a ring is a positive sample pair generated by the ring. Samples in a ring obtained from the KNN graph are more likely to share the same label. A purely neighbor-based method can only find positive sample pairs with small intra-class variance, whereas the positive sample pairs provided by a ring have larger intra-class variance. Within each closed loop formed by several edges of the KNN graph, all pairwise combinations of the unlabeled pedestrian data in the loop are selected as positive sample pairs $C_t$, where $t$ is the number of edges of the closed loop. For an anchor sample in the labeled pedestrian data, positive samples are taken from labeled data with the same label, and negative samples are taken from labeled data with a different label and from the unlabeled data. For an anchor sample in the unlabeled pedestrian data, positive sample pairs are taken from $C_t$ and negative sample pairs are taken from the labeled pedestrian data. The difficulty of the mined negative samples is then raised by a hard-negative selection rule.
In this rule, $\min(\cdot)$ selects the sample pair with the smallest Euclidean distance and $D(\cdot)$ computes the Euclidean distance. The rule involves the $i$-th unlabeled pedestrian sample in $C_t$, the labeled pedestrian data selected from $X_L$, and the unlabeled pedestrian data selected from $C_t$ within the current training batch. For an anchor sample belonging to $C_t$, $N_i$ is the negative sample selected from $X_L$, a further negative sample is selected from $C_t$, and $c$ is a constant used to control the confidence of the negative sample pair. The values $c = 0.7$, $c = 0.8$, and $c = 0.9$ are typically used to create more difficult negative sample pairs.
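The "ring" idea can be sketched on toy data: build a directed KNN graph, enumerate closed loops with exactly `t` edges, and emit every pair of vertices in each loop as a positive pair. This is a brute-force illustration, not the patent's procedure; the function names are hypothetical, and the example uses `k=2` on six 1-D features rather than the $K = 4$ and 2048-D features mentioned in the text.

```python
import math
from itertools import combinations


def knn_graph(features, k):
    """Directed KNN graph: vertex i points to its k nearest other vertices."""
    graph = {}
    for i, fi in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: math.dist(fi, features[j]))
        graph[i] = others[:k]
    return graph


def rings(graph, t):
    """All simple directed cycles with exactly t edges, as sorted vertex tuples."""
    found = set()

    def walk(start, node, path):
        for nxt in graph[node]:
            if nxt == start and len(path) == t:
                found.add(tuple(sorted(path)))
            elif nxt not in path and len(path) < t:
                walk(start, nxt, path + [nxt])

    for start in graph:
        walk(start, start, [start])
    return found


def positive_pairs(graph, t):
    """Every pairwise combination of vertices within each ring of order t."""
    pairs = set()
    for ring in rings(graph, t):
        pairs.update(combinations(ring, 2))
    return pairs


# Two well-separated clusters of three samples each.
features = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
graph = knn_graph(features, k=2)
pairs = positive_pairs(graph, t=3)
```

Here each cluster closes into one ring of order 3, so positive pairs are produced only within a cluster. The exhaustive search is exponential in `t` and is meant only to make the definition concrete.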
After the positive and negative sample pairs are obtained, a triplet loss is constructed from them.
In this loss, $N$ is the number of triplets in the current training batch, and each batch contains equal numbers of unlabeled and labeled data. Triplets of unlabeled data are sampled from $C_t$, $N_i$, and the negative samples selected from $C_t$; triplets of labeled data are sampled with a standard triplet sampling strategy. The graph model assigns the triplet confidences, with $s_i$ the confidence of the $i$-th triplet. For triplets of labeled data, $s_i$ is set to the constant 1. For triplets of unlabeled data, $D_i$ denotes the number of times the $i$-th sample pair is repeated in the graph model, and the largest repetition count in the graph model is $D_{\max} = \max(\{D_i\})$. A negative sample pair drawn from the negative samples selected from $C_t$ uses the constant $c$, which controls the confidence of negative sample pairs, as its confidence; a negative sample pair drawn from $N_i$ uses $c = 1$ as its confidence. The final triplet confidence for unlabeled data is defined in terms of these quantities and $\alpha$, where $\alpha$ is the lowest confidence of a positive sample pair of unlabeled data selected from the graph model and is set to 0.8. The direct-push metric learning loss function $L_{TSML}$ is defined on the triplets weighted by these confidences.
A hyperparameter $\lambda_1$ combines the direct-push metric learning loss function with the consistency learning loss function, giving the final loss function:
$L_{TSML\text{-}CL} = L_{TSML} + \lambda_1 L_{CL}$
Based on this loss function, the base model obtained in step 1 is jointly fine-tuned with the confidence-weighted positive and negative sample pairs, improving the performance of the model.
After the performance of the model has improved, feature extraction and graph model construction are repeated, the positive and negative sample pairs and their confidences are updated, and the base model is trained to full convergence with the course (curriculum) learning method to obtain the final model. The final model then performs feature extraction and feature matching on the verification set data, and pedestrian re-identification is completed according to the matching result.
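The repeated extract, build graph, mine, fine-tune cycle can be outlined as a loop over a schedule for the constant c. The sketch assumes one fine-tuning round per value of c, ordered from 0.9 down to 0.7 in line with the statement that decreasing c raises negative-sample difficulty. The four callables are hypothetical placeholders for the stages described in the text, not functions of any real library.

```python
def curriculum_schedule(c_values=(0.9, 0.8, 0.7)):
    """Yield (round_index, c) in order of decreasing c, i.e. increasing difficulty."""
    for round_index, c in enumerate(sorted(c_values, reverse=True)):
        yield round_index, c


def run_curriculum(model, schedule, extract, build_graph, mine_pairs, fine_tune):
    """Repeat extract -> graph -> mine -> fine-tune once per scheduled value of c."""
    for _, c in schedule:
        feats = extract(model)        # features of the unlabeled data
        graph = build_graph(feats)    # e.g. a directed KNN graph
        pairs = mine_pairs(graph, c)  # pseudo-labeled pairs with confidences
        model = fine_tune(model, pairs)
    return model


# Stub stages that only count rounds, to show the control flow.
rounds_done = run_curriculum(
    model=0,
    schedule=curriculum_schedule(),
    extract=lambda m: m,
    build_graph=lambda f: f,
    mine_pairs=lambda g, c: c,
    fine_tune=lambda m, p: m + 1,
)
```

A real implementation would stop on a convergence criterion rather than after a fixed number of rounds; the fixed three-round schedule here is only for illustration.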
The invention is further illustrated by the following specific examples:
The method is implemented with a ResNet50 convolutional neural network model. The input image size is 128 x 384 and the feature layer has 2048 dimensions. The PyTorch framework is used with the Adam optimizer, an initial learning rate of 0.0002, and a weight decay of 0.0005. Each batch contains 32 pedestrian IDs (identities) with 4 images per pedestrian ID, and the model is pre-trained on the ImageNet dataset.
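The batch composition above gives 32 x 4 = 128 images per batch. A minimal sampler that produces such a batch might look as follows; sampling without replacement, the fixed seed, and the name `sample_pk_batch` are assumptions, and every identity is assumed to have at least 4 images.

```python
import random


def sample_pk_batch(images_by_id, num_ids=32, images_per_id=4, seed=0):
    """Pick num_ids identities, then images_per_id images for each identity."""
    rng = random.Random(seed)
    chosen_ids = rng.sample(sorted(images_by_id), num_ids)
    batch = []
    for pid in chosen_ids:
        for img in rng.sample(images_by_id[pid], images_per_id):
            batch.append((pid, img))
    return batch


# Toy dataset: 40 identities with 6 image names each.
dataset = {pid: [f"id{pid}_img{n}.jpg" for n in range(6)] for pid in range(40)}
batch = sample_pk_batch(dataset)
```

Grouping several images per identity in one batch guarantees that every anchor has in-batch positives, which the triplet loss requires.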
Table 1 compares the classification accuracy of the pedestrian re-identification method of the present invention with that of other methods on public pedestrian re-identification datasets (Market-1501, DukeMTMC-reID). The model obtained by the method achieves higher accuracy.
TABLE 1
[1] X. Xin, J. Wang, R. Xie, S. Zhou, W. Huang, N. Zheng, Semi-supervised person re-identification using multi-view clustering, Pattern Recognition 88 (2019) 285–297.
[2] X. Xin, X. Wu, Y. Wang, J. Wang, Deep self-paced learning for semi-supervised person re-identification using multi-view self-paced clustering, in: 2019 IEEE International Conference on Image Processing (ICIP), 2019, pp. 2631–2635. doi:10.1109/ICIP.2019.8803290.
It should be noted that the above description is only a part of the embodiments of the present invention, and all equivalent changes made according to the present invention are included in the protection scope of the present invention. Those skilled in the art to which the invention relates may substitute similar embodiments for the specific examples described, all falling within the scope of the invention, without thereby departing from the invention or exceeding the scope of the claims defined thereby.
Claims (6)
1. A graph-based direct-push semi-supervised pedestrian re-identification method is characterized by comprising the following steps:
Step 1: train a two-channel model on the labeled pedestrian data to obtain a base model;
Step 2: extract features from the unlabeled pedestrian data with the base model, build a graph model on the extracted features, assign pseudo labels to the unlabeled pedestrian data according to the graph model, and construct positive and negative sample pairs from the labeled pedestrian data and the pseudo-labeled unlabeled pedestrian data;
Step 3: use the graph model to assign confidences to the positive and negative sample pairs, and fine-tune the base model obtained in step 1 with the confidence-weighted positive and negative sample pairs;
Step 4: repeat steps 2 and 3, gradually increasing the difficulty and confidence of the positive and negative sample pairs, and train the base model with a course (curriculum) learning method until it fully converges, giving the final model;
Step 5: perform feature extraction and feature matching on the verification set data with the final model, and complete pedestrian re-identification according to the matching result.
2. The graph-based direct-push semi-supervised pedestrian re-identification method according to claim 1, wherein step 1 is specifically as follows: the labeled pedestrian data is denoted $X_L$ and the unlabeled pedestrian data is denoted $X_U$; sample pairs sharing the same label are defined as positive sample pairs, and sample pairs with different labels are defined as negative sample pairs; in the two-channel model, one channel is a ResNet50 whose parameters are obtained by learning, and it is designated the "student" model; the other channel is also a ResNet50, whose parameters are computed from the "student" model by an exponential moving average, and it is designated the "teacher" model, according to:
$\theta'_t = \alpha\,\theta'_{t-1} + (1-\alpha)\,\theta_t$
where $\theta_t$ denotes the "student" model parameters, $\theta'_t$ the "teacher" model parameters, and $\alpha$ the smoothing coefficient; the loss function used by the two-channel model comprises three parts, namely a feature-based consistency loss function, a triplet loss function, and a cross-entropy loss function, wherein:
where $L_{CL}$ is the consistency loss function, $N$ is the number of samples, $\|\cdot\|_2^2$ is the squared L2 norm, and $\eta_i$ and $\eta'_i$ are two different noises;
for the triplet loss function, $N$ is the number of triplets, $f_\theta(\cdot)$ is the feature that the student model extracts from a pedestrian image, $\theta$ denotes the parameters of the student model, each triplet consists of an anchor sample, a positive sample, and a negative sample, and $\alpha$ is the margin parameter of the triplet loss function;
for the cross-entropy loss function, $\sigma$ is the standard softmax cross-entropy loss function, $N$ is the number of labeled pedestrian data in the current training batch, $y_i$ is the label of the $i$-th labeled pedestrian sample in the current training batch, and $\omega$ denotes the parameters of the last fully connected layer of the student model;
a hyperparameter $\lambda_0$ combines the triplet loss function and the cross-entropy loss function into the fully supervised loss function $L_{SL}$;
a hyperparameter $\lambda_1$ combines the fully supervised loss function $L_{SL}$ with the consistency learning loss function $L_{CL}$ to give the final loss function for the labeled pedestrian data:
$L_{SL\text{-}CL} = L_{SL} + \lambda_1 L_{CL}$
and, with this final loss function as the constraint, the two-channel model is trained on the labeled pedestrian data to obtain the base model.
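The teacher update and the combination of loss terms in claim 2 can be sketched in plain Python. This is a minimal illustration, not the patented implementation: only the moving-average rule and the relation between L_SL-CL, L_SL and L_CL are given in the text, so the per-term formulas below (mean squared feature distance for consistency, a hinge on Euclidean distances for the triplet term, and the weighting `l_tri + lambda0 * l_ce`) are assumptions, and all function names are hypothetical.

```python
import math


def ema_update(teacher, student, alpha):
    """theta'_t = alpha * theta'_{t-1} + (1 - alpha) * theta_t, element-wise."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def sq_dist(u, v):
    """Squared L2 distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def consistency_loss(student_feats, teacher_feats):
    """Assumed form: mean squared L2 distance between student and teacher
    features of the same images under two different noises."""
    pairs = list(zip(student_feats, teacher_feats))
    return sum(sq_dist(s, t) for s, t in pairs) / len(pairs)


def triplet_loss(triplets, margin):
    """Assumed form: mean hinge over (anchor, positive, negative) features."""
    total = 0.0
    for a, p, n in triplets:
        total += max(0.0, math.dist(a, p) - math.dist(a, n) + margin)
    return total / len(triplets)


def cross_entropy_loss(logits_batch, labels):
    """Mean softmax cross-entropy; logits stand for the output of the last
    fully connected layer applied to the student features."""
    total = 0.0
    for logits, y in zip(logits_batch, labels):
        m = max(logits)
        log_z = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_z - logits[y]
    return total / len(labels)


def labeled_loss(l_tri, l_ce, l_cl, lambda0, lambda1):
    """L_SL-CL = L_SL + lambda1 * L_CL, with L_SL assumed to be
    l_tri + lambda0 * l_ce (the text does not state which term is weighted)."""
    l_sl = l_tri + lambda0 * l_ce
    return l_sl + lambda1 * l_cl
```

With `alpha` close to 1 the teacher changes slowly, which is the usual reason for using an averaged copy as the consistency target.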
3. The graph-based direct-push semi-supervised pedestrian re-identification method according to claim 1, wherein step 2 is specifically as follows: features are extracted from the unlabeled pedestrian data with the base model, and a directed KNN graph $G(V, E)$ is constructed from those features, where in $G$:
$V = \{ v_i = f_\theta(x_i) \mid x_i \in X_U \}$
$E = \{ e_{ij} = P(v_i, v_j) \mid v_j \in N_k(v_i) \}$
where $N_k(v_i)$ is the set of $k$ nearest neighbors of vertex $v_i$, and $P(v_i, v_j)$ is the directed edge from $v_i$ to $v_j$; within each closed loop formed by several edges of the KNN graph, all pairwise combinations of the unlabeled pedestrian data are selected as positive sample pairs $C_t$, where $t$ is the number of edges in the closed loop; for an anchor sample in the labeled pedestrian data, positive samples are taken from pedestrian data with the same label, and negative samples are taken from pedestrian data with different labels and from the unlabeled pedestrian data; for an anchor sample in the unlabeled pedestrian data, positive sample pairs are taken from $C_t$ and negative sample pairs are taken from the labeled pedestrian data; the difficulty of the mined negative samples is then raised:
where $\min(\cdot)$ selects the sample pair with the smallest Euclidean distance and $D(\cdot)$ computes the Euclidean distance; the samples involved are the $i$-th unlabeled pedestrian sample in $C_t$, a labeled pedestrian sample selected from $X_L$, and an unlabeled pedestrian sample selected from $C_t$ within the current training batch; for an anchor sample belonging to $C_t$, $N_i$ is the negative sample selected from $X_L$, a further negative sample is selected from $C_t$, and $c$ is a constant that controls the confidence of the negative sample pairs.
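The graph construction in claim 3 can be illustrated with a small pure-Python sketch. It builds the directed k-nearest-neighbor graph, enumerates closed loops with exactly t edges, and turns every loop into pairwise positive pairs. The function names are hypothetical, the brute-force search is only suitable for toy inputs, and the hardest-negative helper reflects one reading of the min(·) selection (closest candidate by Euclidean distance), since the mining formula itself is not reproduced in the text.

```python
import math
from itertools import combinations


def knn_graph(features, k):
    """Directed KNN graph: vertex i has an edge to each of its k nearest
    neighbors under Euclidean distance."""
    graph = {}
    for i, fi in enumerate(features):
        ranked = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        graph[i] = [j for _, j in ranked[:k]]
    return graph


def closed_loops(graph, t):
    """Simple directed cycles with exactly t edges, each reported once as a
    tuple that starts at the cycle's smallest vertex."""
    loops = []

    def extend(path):
        for nxt in graph[path[-1]]:
            if len(path) == t:
                if nxt == path[0]:
                    loops.append(tuple(path))
            elif nxt > path[0] and nxt not in path:
                extend(path + [nxt])

    for start in graph:
        extend([start])
    return loops


def positive_pairs(graph, t):
    """All pairwise combinations of vertices lying on a common t-edge loop."""
    pairs = set()
    for loop in closed_loops(graph, t):
        pairs.update(combinations(sorted(loop), 2))
    return pairs


def hardest_negative(anchor, candidates):
    """Index of the candidate closest to the anchor (assumed mining rule)."""
    return min(range(len(candidates)), key=lambda j: math.dist(anchor, candidates[j]))
```

For example, with `t = 2` the loops are mutual nearest neighbors, which is the strictest agreement the directed graph can express; larger `t` admits longer chains of neighbors as evidence that samples share an identity.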
4. The graph-based direct-push semi-supervised pedestrian re-identification method of claim 1, wherein the difficulty of the mined negative samples is raised by gradually decreasing the constant c.
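One way to picture the gradual decrease of c is a per-round schedule. The sketch below assumes a linear decay across the repeated rounds of steps 2 and 3; the claims only say that c decreases gradually and, in claim 6, that it lies between 0.5 and 1, so the linear shape, the function name, and the default endpoints are illustrative assumptions.

```python
def confidence_constant(round_index, num_rounds, c_start=1.0, c_end=0.5):
    """Linearly interpolate c from c_start (first round) to c_end (last round).

    The linear shape is an assumption; any monotonically decreasing schedule
    inside the claimed range would fit the description equally well.
    """
    if num_rounds <= 1:
        return c_start
    clamped = min(max(round_index, 0), num_rounds - 1)
    frac = clamped / (num_rounds - 1)
    return c_start + (c_end - c_start) * frac
```

Calling `confidence_constant(r, num_rounds)` at the start of round `r` would give the value of c used while mining negatives in that round.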
5. The graph-based direct-push semi-supervised pedestrian re-identification method according to claim 1, wherein step 3 is specifically as follows: after the positive and negative sample pairs are obtained, a triplet loss is constructed from them:
where $N$ is the number of triplets in the current training batch, and each batch contains equal numbers of unlabeled and labeled data; triplets of unlabeled data are sampled from $C_t$, $N_i$, and the negative samples selected from $C_t$, while triplets of labeled data are sampled with a standard triplet sampling strategy; the graph model assigns the triplet confidences, $s_i$ being the confidence of the $i$-th triplet; for triplets of labeled data, $s_i$ is set to the constant 1; for triplets of unlabeled data, $D_i$ denotes the number of times the $i$-th sample pair is repeated in the graph model, and the largest repetition count in the graph model is $D_{max} = \max(\{D_i\})$; a negative sample pair drawn from the negative samples selected from $C_t$ takes the constant $c$, which controls negative-pair confidence, as its confidence; a negative sample pair drawn from $N_i$ likewise takes $c$ as its confidence; the final triplet confidence for unlabeled data is defined as:
where $\alpha$ is the lowest confidence of the positive sample pairs of unlabeled data selected from the graph model; the direct-push metric learning loss function $L_{TSML}$ is defined as:
a hyperparameter $\lambda_1$ combines the direct-push metric learning loss function with the consistency learning loss function, giving the final loss function:
$L_{TSML\text{-}CL} = L_{TSML} + \lambda_1 L_{CL}$,
and, based on this loss function, the base model obtained in step 1 is jointly fine-tuned with the positive and negative sample pairs carrying confidences, improving the performance of the model.
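The role of the per-triplet confidence in claim 5 can be sketched as a weighted triplet loss. This is a hedged illustration only: the text does not reproduce the formulas for the confidence or for the metric learning loss, so the code simply takes the confidences as given inputs and assumes that each confidence multiplies a standard hinge term. The function names are hypothetical.

```python
import math


def weighted_triplet_loss(triplets, confidences, margin):
    """Assumed form: mean over triplets of s_i * hinge_i, where hinge_i is
    max(0, D(anchor, positive) - D(anchor, negative) + margin)."""
    total = 0.0
    for (a, p, n), s in zip(triplets, confidences):
        hinge = max(0.0, math.dist(a, p) - math.dist(a, n) + margin)
        total += s * hinge
    return total / len(triplets)


def fine_tune_loss(l_tsml, l_cl, lambda1):
    """L_TSML-CL = L_TSML + lambda1 * L_CL, as stated in the claim."""
    return l_tsml + lambda1 * l_cl
```

Under this reading, labeled triplets would be passed with confidence 1, while triplets built from pseudo-labeled data contribute less to the gradient when the graph gives them lower confidence.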
6. The graph-based direct-push semi-supervised pedestrian re-identification method as claimed in any one of claims 3 to 5, wherein the constant c has a value in a range of 0.5 to 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911173132.3A (published as CN111027421A) | 2019-11-26 | 2019-11-26 | Graph-based direct-push type semi-supervised pedestrian re-identification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111027421A (en) | 2020-04-17 |
Family
ID=70206813
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911173132.3A (Pending; published as CN111027421A) | Graph-based direct-push type semi-supervised pedestrian re-identification method | 2019-11-26 | 2019-11-26 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111027421A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109154973A (en) * | 2016-05-20 | 2019-01-04 | 奇跃公司 | Execute the method and system of convolved image transformation estimation |
CN108446666A (en) * | 2018-04-04 | 2018-08-24 | 平安科技(深圳)有限公司 | The training of binary channels neural network model and face comparison method, terminal and medium |
WO2019192121A1 (en) * | 2018-04-04 | 2019-10-10 | 平安科技(深圳)有限公司 | Dual-channel neural network model training and human face comparison method, and terminal and medium |
CN110135295A (en) * | 2019-04-29 | 2019-08-16 | 华南理工大学 | A kind of unsupervised pedestrian recognition methods again based on transfer learning |
Non-Patent Citations (2)
Title |
---|
安浩南; 赵明; 潘胜达; 林长青: "Infrared target fusion detection algorithm based on pseudo-modality conversion" *
贾迪; 朱宁丹; 杨宁华; 吴思; 李玉秀; 赵明远: "A survey of image matching methods" *
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111724867B (en) * | 2020-06-24 | 2022-09-09 | 中国科学技术大学 | Molecular property measurement method, molecular property measurement device, electronic apparatus, and storage medium |
CN111724867A (en) * | 2020-06-24 | 2020-09-29 | 中国科学技术大学 | Molecular property measurement method, molecular property measurement device, electronic apparatus, and storage medium |
CN111783646A (en) * | 2020-06-30 | 2020-10-16 | 北京百度网讯科技有限公司 | Training method, device, equipment and storage medium of pedestrian re-identification model |
CN111783646B (en) * | 2020-06-30 | 2024-01-23 | 北京百度网讯科技有限公司 | Training method, device, equipment and storage medium of pedestrian re-identification model |
CN112101217B (en) * | 2020-09-15 | 2024-04-26 | 镇江启迪数字天下科技有限公司 | Pedestrian re-identification method based on semi-supervised learning |
CN112101217A (en) * | 2020-09-15 | 2020-12-18 | 镇江启迪数字天下科技有限公司 | Pedestrian re-identification method based on semi-supervised learning |
CN112418331A (en) * | 2020-11-26 | 2021-02-26 | 国网甘肃省电力公司电力科学研究院 | Clustering fusion-based semi-supervised learning pseudo label assignment method |
CN112614150A (en) * | 2020-12-18 | 2021-04-06 | 中山大学 | Off-line pedestrian tracking method, system and storage medium based on dual-model interactive semi-supervised learning |
CN112686912A (en) * | 2021-01-05 | 2021-04-20 | 南开大学 | Acute stroke lesion segmentation method based on gradual learning and mixed samples |
CN112686912B (en) * | 2021-01-05 | 2022-06-10 | 南开大学 | Acute stroke lesion segmentation method based on gradual learning and mixed samples |
CN112949384A (en) * | 2021-01-23 | 2021-06-11 | 西北工业大学 | Remote sensing image scene classification method based on antagonistic feature extraction |
CN112949384B (en) * | 2021-01-23 | 2024-03-08 | 西北工业大学 | Remote sensing image scene classification method based on antagonistic feature extraction |
CN113158554A (en) * | 2021-03-25 | 2021-07-23 | 腾讯科技(深圳)有限公司 | Model optimization method and device, computer equipment and storage medium |
CN113158554B (en) * | 2021-03-25 | 2023-02-14 | 腾讯科技(深圳)有限公司 | Model optimization method and device, computer equipment and storage medium |
CN112861825B (en) * | 2021-04-07 | 2023-07-04 | 北京百度网讯科技有限公司 | Model training method, pedestrian re-recognition method, device and electronic equipment |
WO2022213717A1 (en) * | 2021-04-07 | 2022-10-13 | 北京百度网讯科技有限公司 | Model training method and apparatus, person re-identification method and apparatus, and electronic device |
CN112861825A (en) * | 2021-04-07 | 2021-05-28 | 北京百度网讯科技有限公司 | Model training method, pedestrian re-identification method, device and electronic equipment |
CN113792574B (en) * | 2021-07-14 | 2023-12-19 | 哈尔滨工程大学 | Cross-dataset expression recognition method based on metric learning and teacher student model |
CN113792574A (en) * | 2021-07-14 | 2021-12-14 | 哈尔滨工程大学 | Cross-data-set expression recognition method based on metric learning and teacher student model |
CN113657267B (en) * | 2021-08-17 | 2024-01-12 | 中国科学院长春光学精密机械与物理研究所 | Semi-supervised pedestrian re-identification method and device |
CN113657267A (en) * | 2021-08-17 | 2021-11-16 | 中国科学院长春光学精密机械与物理研究所 | Semi-supervised pedestrian re-identification model, method and device |
CN113920574B (en) * | 2021-12-15 | 2022-03-18 | 深圳市视美泰技术股份有限公司 | Training method and device for picture quality evaluation model, computer equipment and medium |
CN113920574A (en) * | 2021-12-15 | 2022-01-11 | 深圳市视美泰技术股份有限公司 | Training method and device for picture quality evaluation model, computer equipment and medium |
CN114548259A (en) * | 2022-02-18 | 2022-05-27 | 东北大学 | PISA fault identification method based on Semi-supervised Semi-KNN model |
CN114548259B (en) * | 2022-02-18 | 2023-10-10 | 东北大学 | PISA fault identification method based on Semi-supervised Semi-KNN model |
Similar Documents
Publication | Title |
---|---|
CN111027421A (en) | Graph-based direct-push type semi-supervised pedestrian re-identification method |
CN111126360B (en) | Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model |
CN109740413B (en) | Pedestrian re-identification method, device, computer equipment and computer storage medium |
CN109344787B (en) | Specific target tracking method based on face recognition and pedestrian re-recognition |
CN110414368B (en) | Unsupervised pedestrian re-identification method based on knowledge distillation |
CN109934117B (en) | Pedestrian re-identification detection method based on generation of countermeasure network |
CN108960080B (en) | Face recognition method based on active defense image anti-attack |
US11263435B2 (en) | Method for recognizing face from monitoring video data |
CN105701467B (en) | A kind of more people's abnormal behaviour recognition methods based on human figure feature |
CN110717411A (en) | Pedestrian re-identification method based on deep layer feature fusion |
CN109711366B (en) | Pedestrian re-identification method based on group information loss function |
CN109299707A (en) | A kind of unsupervised pedestrian recognition methods again based on fuzzy depth cluster |
CN111639564B (en) | Video pedestrian re-identification method based on multi-attention heterogeneous network |
CN110728216A (en) | Unsupervised pedestrian re-identification method based on pedestrian attribute adaptive learning |
CN111723645A (en) | Multi-camera high-precision pedestrian re-identification method for in-phase built-in supervised scene |
CN111310662B (en) | Flame detection and identification method and system based on integrated deep network |
CN109919073B (en) | Pedestrian re-identification method with illumination robustness |
CN108345900B (en) | Pedestrian re-identification method and system based on color texture distribution characteristics |
CN111950372A (en) | Unsupervised pedestrian re-identification method based on graph convolution network |
CN112819065A (en) | Unsupervised pedestrian sample mining method and unsupervised pedestrian sample mining system based on multi-clustering information |
CN111488760A (en) | Few-sample pedestrian re-identification method based on deep multi-example learning |
CN112115780A (en) | Semi-supervised pedestrian re-identification method based on deep multi-model cooperation |
CN113065409A (en) | Unsupervised pedestrian re-identification method based on camera distribution difference alignment constraint |
CN111695531A (en) | Cross-domain pedestrian re-identification method based on heterogeneous convolutional network |
CN115049894A (en) | Target re-identification method of global structure information embedded network based on graph learning |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
AD01 | Patent right deemed abandoned | Effective date of abandoning: 2024-02-02 |