CN111027421A - Graph-based direct-push type semi-supervised pedestrian re-identification method - Google Patents

Graph-based direct-push type semi-supervised pedestrian re-identification method

Info

Publication number
CN111027421A
CN111027421A (application CN201911173132.3A)
Authority
CN
China
Prior art keywords
model
pedestrian
data
loss function
graph
Prior art date
Legal status
Pending
Application number
CN201911173132.3A
Other languages
Chinese (zh)
Inventor
常新远
龚怡宏
魏星
洪晓鹏
马智恒
Current Assignee
Xi'an Honggui Electronic Technology Co Ltd
Original Assignee
Xi'an Honggui Electronic Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Xi'an Honggui Electronic Technology Co Ltd filed Critical Xi'an Honggui Electronic Technology Co Ltd
Priority to CN201911173132.3A
Publication of CN111027421A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Abstract

The invention discloses a graph-based direct-push (transductive) semi-supervised pedestrian re-identification method, belonging to the technical field of computer-vision pedestrian re-identification. First, a two-channel model is trained with labeled pedestrian data to obtain a base model. Features are then extracted from the unlabeled pedestrian data, a graph model is built on these features, pseudo labels are assigned to the unlabeled data according to the graph model, and positive and negative sample pairs are constructed from the labeled data and the pseudo-labeled unlabeled data. The graph model is used to assign confidences to the positive and negative sample pairs, which are then used to jointly fine-tune the base model. The difficulty and confidence of the positive and negative sample pairs are gradually increased, and the base model is trained to full convergence with a curriculum learning method. The final model is used for feature extraction and feature matching on the validation-set data, and pedestrian re-identification is completed according to the matching results. The method reduces the negative influence of erroneous pseudo labels, improves the robustness of the model, and thereby improves the accuracy of pedestrian re-identification.

Description

Graph-based direct-push type semi-supervised pedestrian re-identification method
Technical Field
The invention belongs to the technical field of computer vision pedestrian re-identification, and particularly relates to a direct-push type semi-supervised pedestrian re-identification method based on a graph.
Background
Pedestrian re-identification is a technique that uses computer vision to determine whether a particular pedestrian appears in an image or video sequence, and it is widely regarded as a sub-problem of image retrieval: given a surveillance image of a pedestrian, the same pedestrian is retrieved across cameras. It is intended to compensate for the visual limitations of fixed cameras, can be combined with pedestrian detection and pedestrian tracking, and can be widely applied to intelligent video surveillance, intelligent security and related fields. Pedestrian re-identification requires the machine to identify all images of a particular person captured by different cameras. Specifically, it is a person-matching technique based on the overall appearance of pedestrians: given a query image of a person, the one or more images of that person are found among a set of gallery images. The technology has high application value in public-security criminal investigation, image retrieval and other scenarios. In addition, it can help mobile-phone users cluster photo albums, and help retail and supermarket operators obtain effective customer trajectories and mine commercial value. However, the accuracy of existing pedestrian re-identification techniques is still limited, and much of the work still depends on a large amount of manual effort.
Pedestrian re-identification is an important and challenging task. Because images are captured at arbitrary times and places, lighting, viewpoint and pose vary widely, and pedestrians are further affected by factors such as detection accuracy and occlusion, re-identification research faces considerable difficulties in practical applications. Most pedestrian re-identification algorithms employ fully supervised convolutional neural networks. However, a neural network with good generalization performance often requires tens of thousands of labeled training samples. Unlike classification or face-recognition datasets, whose sizes have grown to millions of images, most pedestrian re-identification datasets contain fewer than 2000 identities, with only tens of images per identity. Since obtaining labeled pedestrian samples is expensive, pedestrian re-identification techniques based on semi-supervised learning are more valuable in practical applications.
Research in pedestrian re-identification mainly follows two lines: feature-representation methods, which extract discriminative and robust features to represent pedestrians, and distance-metric learning methods, which learn a discriminative distance function so that the distance between images of the same person is smaller than the distance between images of different people. The core goal of image-based pedestrian re-identification is, for a specified pedestrian image, to find the most similar image in a candidate set of N pedestrian images. To distinguish pedestrians with different identities, re-identification needs a discriminative pedestrian feature descriptor. In daily life, humans usually judge whether two images show the same pedestrian by clothing, whereas in intelligent multi-camera surveillance systems pedestrian appearance often changes dramatically with lighting, walking pose and camera viewpoint. Extracting descriptors that are robust to such severe appearance changes is a technical difficulty that remains to be solved, which greatly limits practical application.
Disclosure of Invention
In order to solve the above problems, an object of the present invention is to provide a graph-based direct-push semi-supervised pedestrian re-identification method, which reduces the negative effects of erroneous pseudo labels, improves the robustness of the model, and thereby improves the accuracy of pedestrian re-identification.
The invention is realized by the following technical scheme:
a graph-based direct-push type semi-supervised pedestrian re-identification method comprises the following steps:
Step 1: training a two-channel model using the labeled pedestrian data to obtain a base model;
Step 2: performing feature extraction on the unlabeled pedestrian data with the base model, building a graph model on the extracted unlabeled features, assigning pseudo labels to the unlabeled data according to the graph model, and constructing positive and negative sample pairs from the labeled data and the pseudo-labeled unlabeled data;
Step 3: using the graph model to assign confidences to the positive and negative sample pairs, and fine-tuning the base model obtained in Step 1 with the confidence-weighted positive and negative sample pairs;
Step 4: repeating Steps 2 and 3, gradually increasing the difficulty and confidence of the positive and negative sample pairs, and training the base model with a curriculum learning method until it fully converges to obtain the final model;
Step 5: performing feature extraction and feature matching on the validation-set data with the final model, and completing pedestrian re-identification according to the matching results.
Preferably, step 1 is specifically: let the labeled pedestrian data be X_L and the unlabeled pedestrian data be X_U; sample pairs that share the same label are defined as positive sample pairs, and sample pairs that do not share the same label are defined as negative sample pairs. In the two-channel model, one channel is a ResNet50 whose parameters are obtained by learning; it is set as the "student" model. The other channel is also a ResNet50 whose parameters are obtained from the "student" model by exponential moving average; it is set as the "teacher" model. The update formula is:

θ'_t = α·θ'_(t-1) + (1 - α)·θ_t

where θ_t are the "student" model parameters, θ'_t are the "teacher" model parameters, and α is a smoothing coefficient. The loss function used by the two-channel model consists of three parts: a feature-based consistency loss, a triplet loss, and a cross-entropy loss, where:

L_CL = (1/N) Σ_{i=1}^{N} ||f_θ(x_i, η_i) - f_θ'(x_i, η'_i)||_2^2

where L_CL is the consistency loss, N is the number of samples, ||·||_2^2 is the squared L2 norm, and η_i and η'_i are two different noises;

L_Tri = (1/N) Σ_{i=1}^{N} [ ||f_θ(x_i^a) - f_θ(x_i^p)||_2 - ||f_θ(x_i^a) - f_θ(x_i^n)||_2 + α ]_+

where L_Tri is the triplet loss, N is the number of triplets, f_θ(·) is the feature extracted from a pedestrian image by the "student" model, θ are the parameters of the "student" model, x_i^a, x_i^p and x_i^n are the anchor, positive and negative samples of the i-th triplet, [·]_+ denotes max(·, 0), and α is the margin (boundary) parameter of the triplet loss;

L_CE = (1/N) Σ_{i=1}^{N} σ(Ω·f_θ(x_i), y_i)

where L_CE is the cross-entropy loss, σ is the standard softmax cross-entropy loss function, N is the number of labeled pedestrian samples in the current training batch, y_i is the label of the i-th labeled sample in the batch, and Ω is the parameter of the last fully connected layer of the "student" model.

The triplet loss and the cross-entropy loss are combined with a hyperparameter λ_0 to obtain the fully supervised loss L_SL:

L_SL = L_Tri + λ_0·L_CE

The fully supervised loss L_SL and the consistency learning loss L_CL are combined with a hyperparameter λ_1 to obtain the final loss for the labeled pedestrian data:

L_SL-CL = L_SL + λ_1·L_CL

Using this final loss as the constraint, the two-channel model is trained on the labeled pedestrian data to obtain the base model.
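For illustration only, the following is a minimal PyTorch sketch of the two-channel ("student"/"teacher") update and the combined labeled-data loss described above. The small fully connected backbone, the 751-class classifier, the noise magnitude, and the values of alpha, lambda0, lambda1 and the margin are assumptions made for the sketch; the patent itself uses ResNet50 for both channels.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in feature extractor; the patent uses ResNet50 for both channels.
def make_backbone(feat_dim=128):
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim))

student = make_backbone()
teacher = make_backbone()
teacher.load_state_dict(student.state_dict())
for p in teacher.parameters():
    p.requires_grad_(False)            # the teacher is updated only by EMA

classifier = nn.Linear(128, 751)       # last FC layer Omega; 751 classes is an assumption

@torch.no_grad()
def ema_update(student, teacher, alpha=0.999):
    # theta'_t = alpha * theta'_(t-1) + (1 - alpha) * theta_t
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(alpha).add_(s_p, alpha=1 - alpha)

def labeled_loss(x, y, anchors, positives, negatives,
                 lambda0=0.1, lambda1=1.0, margin=0.3):
    # Two differently perturbed views of the same images for the consistency term.
    noisy1 = x + 0.01 * torch.randn_like(x)
    noisy2 = x + 0.01 * torch.randn_like(x)
    f_s, f_t = student(noisy1), teacher(noisy2)
    l_cl = F.mse_loss(f_s, f_t)                                       # consistency loss L_CL
    l_tri = F.triplet_margin_loss(student(anchors), student(positives),
                                  student(negatives), margin=margin)  # triplet loss L_Tri
    l_ce = F.cross_entropy(classifier(student(x)), y)                 # cross-entropy loss L_CE
    l_sl = l_tri + lambda0 * l_ce                                     # fully supervised loss L_SL
    return l_sl + lambda1 * l_cl                                      # combined loss L_SL-CL

# Toy usage with random tensors standing in for pedestrian images.
x = torch.randn(8, 3, 32, 32); y = torch.randint(0, 751, (8,))
a, p, n = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)
loss = labeled_loss(x, y, a, p, n)
loss.backward()
ema_update(student, teacher)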
Preferably, step 2 is specifically: feature extraction is performed on the unlabeled pedestrian data with the base model, and a directed KNN graph G(V, E) is constructed from the unlabeled features, where in G:

V = {v_i = f_θ(x_i) | x_i ∈ X_U}

E = {e_ij = P(v_i, v_j) | v_j ∈ N_k(v_i)}

where N_k(v_i) is the set of k nearest neighbors of vertex v_i and P(v_i, v_j) is a directed edge from v_i to v_j. Within a closed loop formed by several edges of the KNN graph, all pairwise combinations of the unlabeled pedestrian data on the loop are selected as positive sample pairs C_t, where t is the number of edges of the closed loop. For an anchor sample from the labeled pedestrian data, positive samples are taken from data with the same label, and negative samples are taken from data with different labels and from the unlabeled data. For an anchor sample from the unlabeled pedestrian data, positive pairs are taken from C_t and negative pairs from the labeled pedestrian data. To raise the difficulty of the mined negative samples, the following hold (the three mining formulas are rendered as images in the original and are not reproduced here): min(·) selects the sample pair with the smallest Euclidean distance, x_i^{C_t} is the i-th unlabeled pedestrian sample in C_t, x_j^L is a labeled pedestrian sample selected from X_L, x_k^{C_t} is an unlabeled pedestrian sample selected from C_t in the current training batch, and D(·) is the Euclidean distance. For an anchor sample belonging to C_t, N_i is the negative sample selected from X_L, N_i^{C_t} is the negative sample selected from C_t, and c is a constant used to control the confidence of the negative sample pair.
Preferably, the difficulty of mining negative samples is boosted by gradually decreasing the constant c.
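The ring-based positive-pair selection of step 2 can be sketched as follows. This is only an illustrative reading of the step described above: scikit-learn is used for the k-nearest-neighbor search, and a bounded depth-first search collects closed loops of at most t edges; k = 4, t = 3 and the random features (standing in for features extracted by the base model) are example values, not taken from the patent.

import numpy as np
from itertools import combinations
from sklearn.neighbors import NearestNeighbors

def knn_graph(features, k=4):
    """Directed KNN graph: adj[i] is the set of the k nearest neighbors of vertex i."""
    nn_index = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn_index.kneighbors(features)
    return {i: set(idx[i, 1:]) for i in range(len(features))}   # drop the self-match

def ring_positive_pairs(adj, t=3):
    """Collect positive pairs from closed loops (rings) with at most t directed edges."""
    pairs = set()
    def dfs(start, node, path):
        if len(path) > t:
            return
        for nxt in adj[node]:
            if nxt == start and len(path) >= 2:                 # closed loop found
                for a, b in combinations(sorted(path), 2):
                    pairs.add((a, b))                           # all pairwise combinations on the ring
            elif nxt not in path:
                dfs(start, nxt, path + [nxt])
    for v in adj:
        dfs(v, v, [v])
    return pairs

# Toy usage on random unlabeled features.
features = np.random.rand(50, 16)
adj = knn_graph(features, k=4)
positive_pairs = ring_positive_pairs(adj, t=3)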
Preferably, step 3 is specifically: after the positive and negative sample pairs are obtained, a triplet loss is constructed from them (the formula is rendered as an image in the original and is not reproduced here), where N denotes the number of triplets in the current training batch, and each batch contains equal numbers of unlabeled and labeled data. Triplets of unlabeled data are sampled from C_t, N_i and N_i^{C_t}; triplets of labeled data are sampled with the standard triplet sampling strategy. The graph model is used to assign a confidence to each triplet, with s_i denoting the confidence of the i-th triplet. For triplets of labeled data, the confidence s_i is set to the constant 1. For triplets of unlabeled data, D_i denotes the number of times the i-th sample pair is repeated in the graph model, and the most-repeated sample pair gives D_max = max({D_i}). For negative pairs sampled from N_i^{C_t}, the constant c that controls the negative-pair confidence is used as their confidence; for negative pairs sampled from N_i, c is used as their confidence. The final triplet confidence for unlabeled data is defined by a formula (rendered as an image in the original) in which α denotes the lowest confidence of the positive sample pairs of unlabeled data selected from the graph model. The direct-push metric learning loss L_TSML is then defined as the confidence-weighted triplet loss over these triplets (formula rendered as an image in the original). Using the hyperparameter λ_1, the direct-push metric learning loss is combined with the consistency learning loss to give the final loss:

L_TSML-CL = L_TSML + λ_1·L_CL

Based on this loss function, the confidence-weighted positive and negative sample pairs are used to jointly fine-tune the base model obtained in step 1, improving its performance.
Further preferably, the value range of the constant c is 0.5-1.
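The confidence-weighted (direct-push) triplet term of step 3 can be sketched as follows. Since the exact formulas are rendered as images in the original, this is only one plausible reading: each triplet's hinge term is scaled by its confidence s_i before averaging. The margin value and the toy tensors are assumptions.

import torch
import torch.nn.functional as F

def weighted_triplet_loss(anchor, positive, negative, confidence, margin=0.3):
    """Triplet loss in which each triplet is scaled by its confidence s_i."""
    d_ap = F.pairwise_distance(anchor, positive)        # Euclidean distance to the positive
    d_an = F.pairwise_distance(anchor, negative)        # Euclidean distance to the negative
    per_triplet = torch.clamp(d_ap - d_an + margin, min=0.0)
    return (confidence * per_triplet).mean()

# Toy usage: labeled triplets get confidence 1, unlabeled triplets get graph-derived s_i.
feat = torch.randn(6, 128)
anchor, positive, negative = feat[0:2], feat[2:4], feat[4:6]
confidence = torch.tensor([1.0, 0.8])                   # e.g. one labeled, one unlabeled triplet
loss = weighted_triplet_loss(anchor, positive, negative, confidence)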
Compared with the prior art, the invention has the following beneficial technical effects:
the invention relates to a direct-push type semi-supervised pedestrian re-identification method based on a graph, which is combined with a deep convolutional neural network technology, firstly constructs a graph model aiming at non-tag data, performs difficult sample mining on the non-tag data by utilizing the graph model, simultaneously endows the difficult sample confidence coefficient by utilizing the graph model, and optimizes the model by utilizing a pseudo tag with the confidence coefficient. Compared with traditional algorithms such as Knn, K-means and the like, the method based on the graph model can obtain more accurate pseudo labels. The introduction of confidence also makes the pseudo-label more stable when used, for example, samples with lower confidence will control its influence on the optimization model process. The method for mining the difficult samples can mine more difficult positive and negative sample pairs, and can maximize the feature expression capability of the model when used in metric learning. The traditional method is only suitable for mining the difficult samples with the labels, the difficult sample mining method provided by the invention is also suitable for non-label data, and the traditional difficult sample mining method is improved aiming at the prior knowledge of data distribution, so that the positive and negative sample pairs are more sufficient, and the performance of the model can be further greatly improved.
At the same time, consistency learning is introduced and improved: the assumption that the same data with different added noise should have consistent features is used to optimize the model. For the unlabeled data, this feature-consistency assumption constrains the optimization process so that the teacher and student models learn from each other, which accelerates convergence and improves the performance of both models. A curriculum learning method is also introduced, in which simple knowledge (samples) is learned first and harder knowledge (samples) is gradually added. The base model is first fine-tuned with the high-confidence unlabeled data together with the labeled data; after the model improves, the confidences of the unlabeled data are updated, and this confidence information is more reliable than before the update. The model is then fine-tuned again with the labeled data and the unlabeled data carrying the updated confidences, and this process is repeated until convergence to obtain the final model. A meaningful ordering of the training data maximizes the improvement in model performance. The method reduces the negative influence of erroneous pseudo labels, improves the robustness of the model, and thereby improves the accuracy of pedestrian re-identification.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of the direct-push metric learning network structure according to the present invention.
Detailed Description
The invention will be described in further detail with reference to the following drawings and examples, which are given by way of illustration and not by way of limitation.
FIG. 1 is a logic block diagram of the flow of the present invention. The graph-based direct-push type semi-supervised pedestrian re-identification method of the present invention includes the following steps:
Step 1: constructing triplets using the labeled pedestrian data;
Step 2: training the two-channel model, as shown in FIG. 2, to obtain a base model;
Step 3: performing feature extraction on the unlabeled pedestrian data with the base model;
Step 4: building a graph model on the extracted unlabeled pedestrian features;
Step 5: assigning pseudo labels with confidences to the unlabeled pedestrian data according to the graph model, and constructing triplets for the unlabeled data;
Step 6: jointly fine-tuning the model with the labeled-data triplets and the pseudo-labeled unlabeled-data triplets;
Step 7: repeating Steps 2, 3, 4, 5 and 6 and training the model until it fully converges to obtain the final model;
Step 8: performing feature extraction and feature matching on the validation-set data with the final model, and completing pedestrian re-identification according to the matching results.
Specifically, the method is as follows: let the labeled pedestrian data be X_L and the unlabeled pedestrian data be X_U; sample pairs that share the same label are defined as positive sample pairs, and sample pairs that do not share the same label are defined as negative sample pairs. In the two-channel model, one channel is a ResNet50 whose parameters are obtained by learning; it is set as the "student" model. The other channel is also a ResNet50 whose parameters are computed from the "student" model by Exponential Moving Average; it is set as the "teacher" model. The update formula is:

θ'_t = α·θ'_(t-1) + (1 - α)·θ_t

where θ_t are the "student" model parameters and θ'_t are the "teacher" model parameters; the model framework is shown in FIG. 2, and α is a smoothing coefficient. The loss function used by the two-channel model consists of three parts: a feature-based consistency loss, a triplet loss and a cross-entropy loss, shown in FIG. 2 as Consistency Loss, Triplet Loss and Classification Loss respectively, where:

L_CL = (1/N) Σ_{i=1}^{N} ||f_θ(x_i, η_i) - f_θ'(x_i, η'_i)||_2^2

where L_CL is the consistency loss, N is the number of samples, ||·||_2^2 is the squared L2 norm, and η_i and η'_i are two different noises;

L_Tri = (1/N) Σ_{i=1}^{N} [ ||f_θ(x_i^a) - f_θ(x_i^p)||_2 - ||f_θ(x_i^a) - f_θ(x_i^n)||_2 + α ]_+

where L_Tri is the triplet loss, N is the number of triplets, f_θ(·) is the feature extracted from a pedestrian image by the "student" model, θ are the parameters of the "student" model, x_i^a, x_i^p and x_i^n are the anchor, positive and negative samples of the i-th triplet, [·]_+ denotes max(·, 0), and α is the margin (boundary) parameter of the triplet loss;

L_CE = (1/N) Σ_{i=1}^{N} σ(Ω·f_θ(x_i), y_i)

where L_CE is the cross-entropy loss, σ is the standard softmax cross-entropy loss function, N is the number of labeled pedestrian samples in the current training batch, y_i is the label of the i-th labeled sample in the batch, and Ω is the parameter of the last fully connected layer of the "student" model.

The triplet loss and the cross-entropy loss are combined with a hyperparameter λ_0, which is set to 0.1 during model training, to obtain the fully supervised loss L_SL:

L_SL = L_Tri + λ_0·L_CE

The fully supervised loss L_SL and the consistency learning loss L_CL are combined with a hyperparameter λ_1 to obtain the final loss for the labeled pedestrian data:

L_SL-CL = L_SL + λ_1·L_CL

Using this final loss as the constraint, the two-channel model is trained on the labeled pedestrian data to obtain the base model, corresponding to the labeled-data triplet construction and model training shown in FIG. 1.
Feature extraction is performed on the unlabeled pedestrian data with the base model, and a directed KNN graph G(V, E) is constructed from the unlabeled features, using k = 4 as the number of neighbors. In G:

V = {v_i = f_θ(x_i) | x_i ∈ X_U}

E = {e_ij = P(v_i, v_j) | v_j ∈ N_k(v_i)}

where N_k(v_i) is the set of k nearest neighbors of vertex v_i and P(v_i, v_j) is a directed edge from v_i to v_j. If, through t directed edges, a vertex can be connected back to itself, e.g. e_ij → e_jk → e_kl → e_li, such vertices are said to form a "ring", and t is the order of the ring. Any pairwise combination of the samples in a ring is referred to as a positive sample pair generated by that ring. A ring obtained from the KNN graph has a high probability of containing samples of the same label. Compared with purely neighbor-based methods, which can only find positive sample pairs with small intra-class variance, the positive sample pairs provided by rings have larger intra-class variance. Within a closed loop formed by several edges of the KNN graph, all pairwise combinations of the unlabeled pedestrian data on the loop are selected as positive sample pairs C_t, where t is the number of edges of the closed loop. For an anchor sample from the labeled pedestrian data, positive samples are taken from data with the same label, and negative samples are taken from data with different labels and from the unlabeled data. For an anchor sample from the unlabeled pedestrian data, positive pairs are taken from C_t and negative pairs from the labeled pedestrian data. To raise the difficulty of the mined negative samples, the following hold (the three mining formulas are rendered as images in the original and are not reproduced here): min(·) selects the sample pair with the smallest Euclidean distance, x_i^{C_t} is the i-th unlabeled pedestrian sample in C_t, x_j^L is a labeled pedestrian sample selected from X_L, x_k^{C_t} is an unlabeled pedestrian sample selected from C_t in the current training batch, and D(·) is the Euclidean distance. For an anchor sample belonging to C_t, N_i is the negative sample selected from X_L, N_i^{C_t} is the negative sample selected from C_t, and c is a constant used to control the confidence of the negative sample pair; c = 0.7, c = 0.8 or c = 0.9 is typically used to create harder negative sample pairs.
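Because the negative-mining formulas are rendered as images in the original, the sketch below is only one plausible reading of the surrounding text: for each unlabeled anchor on a ring, the nearest labeled sample (by Euclidean distance) is taken as the hard negative N_i, the nearest unlabeled sample from a different ring as N_i^{C_t}, and the constant c is attached as the confidence of that negative pair. The ring assignments and feature sizes are toy values.

import numpy as np

def mine_hard_negatives(unlabeled_feats, labeled_feats, ring_ids, c=0.8):
    """For each unlabeled anchor, return (index of nearest labeled negative,
    index of nearest unlabeled negative from a different ring, pair confidence)."""
    # Euclidean distance matrices: unlabeled-to-labeled and unlabeled-to-unlabeled.
    d_ul = np.linalg.norm(unlabeled_feats[:, None, :] - labeled_feats[None, :, :], axis=2)
    d_uu = np.linalg.norm(unlabeled_feats[:, None, :] - unlabeled_feats[None, :, :], axis=2)
    mined = []
    for i in range(len(unlabeled_feats)):
        n_labeled = int(np.argmin(d_ul[i]))                # N_i: closest labeled sample
        d_row = d_uu[i].copy()
        d_row[ring_ids == ring_ids[i]] = np.inf            # exclude the anchor's own ring (and itself)
        n_unlabeled = int(np.argmin(d_row))                # N_i^{C_t}: closest other-ring sample
        mined.append((n_labeled, n_unlabeled, c))          # c is the negative-pair confidence
    return mined

# Toy usage.
unlabeled_feats = np.random.rand(20, 16)
labeled_feats = np.random.rand(30, 16)
ring_ids = np.random.randint(0, 5, size=20)                # ring membership from the KNN graph
hard_negatives = mine_hard_negatives(unlabeled_feats, labeled_feats, ring_ids, c=0.8)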
After the positive and negative sample pairs are obtained, a triplet loss is constructed from them (the formula is rendered as an image in the original and is not reproduced here), where N denotes the number of triplets in the current training batch, and each batch contains equal numbers of unlabeled and labeled data. Triplets of unlabeled data are sampled from C_t, N_i and N_i^{C_t}; triplets of labeled data are sampled with the standard triplet sampling strategy. The graph model is used to assign a confidence to each triplet, with s_i denoting the confidence of the i-th triplet. For triplets of labeled data, the confidence s_i is set to the constant 1. For triplets of unlabeled data, D_i denotes the number of times the i-th sample pair is repeated in the graph model, and the most-repeated sample pair gives D_max = max({D_i}). For negative pairs sampled from N_i^{C_t}, the constant c that controls the negative-pair confidence is used as their confidence; for negative pairs sampled from N_i, c = 1 is used as their confidence. The final triplet confidence for unlabeled data is defined by a formula (rendered as an image in the original) in which α denotes the lowest confidence of the positive sample pairs of unlabeled data selected from the graph model; α is set to 0.8. The direct-push metric learning loss L_TSML is then defined as the confidence-weighted triplet loss over these triplets (formula rendered as an image in the original). Using the hyperparameter λ_1, the direct-push metric learning loss is combined with the consistency learning loss to give the final loss:

L_TSML-CL = L_TSML + λ_1·L_CL

Based on this loss function, the confidence-weighted positive and negative sample pairs are used to jointly fine-tune the base model obtained in step 1, improving its performance.
After the model has been optimized, feature extraction, graph-model construction and the updating of the positive and negative sample pairs and their confidences are repeated, and the base model is trained to full convergence with the curriculum learning method to obtain the final model. The final model is used for feature extraction and feature matching on the validation-set data, and pedestrian re-identification is completed according to the matching results.
The invention is further illustrated by the following specific examples:
the method is realized by adopting a ResNet50 convolutional neural network model, the size of an input image is 128 x 384, the dimension of a feature layer is 2048, a PyTorch framework is used, an Adam optimizer is used, the initial learning rate is 0.0002, the weight decade is 0.0005, 32 pedestrian IDs (identity) are in total in each batch, each pedestrian ID is provided with 4 images, and the model is pre-trained in an ImageNet data set.
Table 1 compares the classification accuracy of the pedestrian re-identification method of the present invention with that of other methods on public pedestrian re-identification datasets (Market-1501, DukeMTMC-ReID); it can be seen that the model obtained by the proposed method achieves higher accuracy.
TABLE 1
[Table 1 is rendered as an image in the original and is not reproduced here.]
[1] X. Xin, J. Wang, R. Xie, S. Zhou, W. Huang, N. Zheng, Semi-supervised person re-identification using multi-view clustering, Pattern Recognition 88 (2019) 285–297.
[2] X. Xin, X. Wu, Y. Wang, J. Wang, Deep self-paced learning for semi-supervised person re-identification using multi-view self-paced clustering, in: 2019 IEEE International Conference on Image Processing (ICIP), 2019, pp. 2631–2635. doi:10.1109/ICIP.2019.8803290.
It should be noted that the above description covers only some embodiments of the present invention, and all equivalent changes made according to the present invention fall within the protection scope of the present invention. Those skilled in the art may substitute similar embodiments for the specific examples described without departing from the invention or exceeding the scope defined by the claims.

Claims (6)

1. A graph-based direct-push semi-supervised pedestrian re-identification method is characterized by comprising the following steps:
Step 1: training a two-channel model using the labeled pedestrian data to obtain a base model;
Step 2: performing feature extraction on the unlabeled pedestrian data with the base model, building a graph model on the extracted unlabeled features, assigning pseudo labels to the unlabeled data according to the graph model, and constructing positive and negative sample pairs from the labeled data and the pseudo-labeled unlabeled data;
Step 3: using the graph model to assign confidences to the positive and negative sample pairs, and fine-tuning the base model obtained in Step 1 with the confidence-weighted positive and negative sample pairs;
Step 4: repeating Steps 2 and 3, gradually increasing the difficulty and confidence of the positive and negative sample pairs, and training the base model with a curriculum learning method until it fully converges to obtain the final model;
Step 5: performing feature extraction and feature matching on the validation-set data with the final model, and completing pedestrian re-identification according to the matching results.
2. The graph-based direct-push semi-supervised pedestrian re-identification method according to claim 1, wherein step 1 is specifically: let the labeled pedestrian data be X_L and the unlabeled pedestrian data be X_U; sample pairs that share the same label are defined as positive sample pairs, and sample pairs that do not share the same label are defined as negative sample pairs; in the two-channel model, one channel is a ResNet50 whose parameters are obtained by learning and which is set as the "student" model; the other channel is also a ResNet50 whose parameters are obtained from the "student" model by exponential moving average and which is set as the "teacher" model, with the update formula:

θ'_t = α·θ'_(t-1) + (1 - α)·θ_t

where θ_t are the "student" model parameters, θ'_t are the "teacher" model parameters, and α is a smoothing coefficient; the loss function used by the two-channel model consists of three parts, namely a feature-based consistency loss, a triplet loss and a cross-entropy loss, wherein:

L_CL = (1/N) Σ_{i=1}^{N} ||f_θ(x_i, η_i) - f_θ'(x_i, η'_i)||_2^2

where L_CL is the consistency loss, N is the number of samples, ||·||_2^2 is the squared L2 norm, and η_i and η'_i are two different noises;

L_Tri = (1/N) Σ_{i=1}^{N} [ ||f_θ(x_i^a) - f_θ(x_i^p)||_2 - ||f_θ(x_i^a) - f_θ(x_i^n)||_2 + α ]_+

where L_Tri is the triplet loss, N is the number of triplets, f_θ(·) is the feature extracted from a pedestrian image by the "student" model, θ are the parameters of the "student" model, x_i^a, x_i^p and x_i^n are the anchor, positive and negative samples of the i-th triplet, [·]_+ denotes max(·, 0), and α is the margin (boundary) parameter of the triplet loss;

L_CE = (1/N) Σ_{i=1}^{N} σ(Ω·f_θ(x_i), y_i)

where L_CE is the cross-entropy loss, σ is the standard softmax cross-entropy loss function, N is the number of labeled pedestrian samples in the current training batch, y_i is the label of the i-th labeled sample in the batch, and Ω is the parameter of the last fully connected layer of the "student" model;

the triplet loss and the cross-entropy loss are combined with a hyperparameter λ_0 to obtain the fully supervised loss L_SL:

L_SL = L_Tri + λ_0·L_CE

the fully supervised loss L_SL and the consistency learning loss L_CL are combined with a hyperparameter λ_1 to obtain the final loss for the labeled pedestrian data:

L_SL-CL = L_SL + λ_1·L_CL

and, using this final loss as the constraint, the two-channel model is trained on the labeled pedestrian data to obtain the base model.
3. The graph-based direct-push semi-supervised pedestrian re-identification method according to claim 1, wherein step 2 is specifically: feature extraction is performed on the unlabeled pedestrian data with the base model, and a directed KNN graph G(V, E) is constructed from the unlabeled features, where in G:

V = {v_i = f_θ(x_i) | x_i ∈ X_U}

E = {e_ij = P(v_i, v_j) | v_j ∈ N_k(v_i)}

where N_k(v_i) is the set of k nearest neighbors of vertex v_i and P(v_i, v_j) is a directed edge from v_i to v_j; within a closed loop formed by several edges of the KNN graph, all pairwise combinations of the unlabeled pedestrian data on the loop are selected as positive sample pairs C_t, where t is the number of edges of the closed loop; for an anchor sample from the labeled pedestrian data, positive samples are taken from data with the same label, and negative samples are taken from data with different labels and from the unlabeled data; for an anchor sample from the unlabeled pedestrian data, positive pairs are taken from C_t and negative pairs from the labeled pedestrian data; to raise the difficulty of the mined negative samples, the following hold (the three mining formulas are rendered as images in the original and are not reproduced here): min(·) selects the sample pair with the smallest Euclidean distance, x_i^{C_t} is the i-th unlabeled pedestrian sample in C_t, x_j^L is a labeled pedestrian sample selected from X_L, x_k^{C_t} is an unlabeled pedestrian sample selected from C_t in the current training batch, and D(·) is the Euclidean distance; for an anchor sample belonging to C_t, N_i is the negative sample selected from X_L, N_i^{C_t} is the negative sample selected from C_t, and c is a constant used to control the confidence of the negative sample pair.
4. The graph-based direct-push semi-supervised pedestrian re-identification method of claim 1, wherein the difficulty of mining negative examples is boosted by gradually decreasing the constant c.
5. The graph-based direct-push semi-supervised pedestrian re-identification method according to claim 1, wherein step 3 is specifically: after the positive and negative sample pairs are obtained, a triplet loss is constructed from them (the formula is rendered as an image in the original and is not reproduced here), where N denotes the number of triplets in the current training batch and each batch contains equal numbers of unlabeled and labeled data; triplets of unlabeled data are sampled from C_t, N_i and N_i^{C_t}, while triplets of labeled data are sampled with the standard triplet sampling strategy; the graph model is used to assign a confidence to each triplet, s_i being the confidence of the i-th triplet; for triplets of labeled data, the confidence s_i is set to the constant 1; for triplets of unlabeled data, D_i denotes the number of times the i-th sample pair is repeated in the graph model, and the most-repeated sample pair gives D_max = max({D_i}); for negative pairs sampled from N_i^{C_t}, the constant c that controls the negative-pair confidence is used as their confidence, and for negative pairs sampled from N_i, c is used as their confidence; the final triplet confidence for unlabeled data is defined by a formula (rendered as an image in the original) in which α denotes the lowest confidence of the positive sample pairs of unlabeled data selected from the graph model, and the direct-push metric learning loss L_TSML is defined as the confidence-weighted triplet loss over these triplets (formula rendered as an image in the original); using the hyperparameter λ_1, the direct-push metric learning loss is combined with the consistency learning loss to give the final loss:

L_TSML-CL = L_TSML + λ_1·L_CL

and, based on this loss function, the confidence-weighted positive and negative sample pairs are used to jointly fine-tune the base model obtained in step 1, improving its performance.
6. The graph-based direct-push semi-supervised pedestrian re-identification method as claimed in any one of claims 3 to 5, wherein the constant c has a value in a range of 0.5 to 1.
CN201911173132.3A 2019-11-26 2019-11-26 Graph-based direct-push type semi-supervised pedestrian re-identification method Pending CN111027421A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911173132.3A CN111027421A (en) 2019-11-26 2019-11-26 Graph-based direct-push type semi-supervised pedestrian re-identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911173132.3A CN111027421A (en) 2019-11-26 2019-11-26 Graph-based direct-push type semi-supervised pedestrian re-identification method

Publications (1)

Publication Number Publication Date
CN111027421A true CN111027421A (en) 2020-04-17

Family

ID=70206813

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911173132.3A Pending CN111027421A (en) 2019-11-26 2019-11-26 Graph-based direct-push type semi-supervised pedestrian re-identification method

Country Status (1)

Country Link
CN (1) CN111027421A (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724867A (en) * 2020-06-24 2020-09-29 中国科学技术大学 Molecular property measurement method, molecular property measurement device, electronic apparatus, and storage medium
CN111783646A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Training method, device, equipment and storage medium of pedestrian re-identification model
CN112101217A (en) * 2020-09-15 2020-12-18 镇江启迪数字天下科技有限公司 Pedestrian re-identification method based on semi-supervised learning
CN112418331A (en) * 2020-11-26 2021-02-26 国网甘肃省电力公司电力科学研究院 Clustering fusion-based semi-supervised learning pseudo label assignment method
CN112614150A (en) * 2020-12-18 2021-04-06 中山大学 Off-line pedestrian tracking method, system and storage medium based on dual-model interactive semi-supervised learning
CN112686912A (en) * 2021-01-05 2021-04-20 南开大学 Acute stroke lesion segmentation method based on gradual learning and mixed samples
CN112861825A (en) * 2021-04-07 2021-05-28 北京百度网讯科技有限公司 Model training method, pedestrian re-identification method, device and electronic equipment
CN112949384A (en) * 2021-01-23 2021-06-11 西北工业大学 Remote sensing image scene classification method based on antagonistic feature extraction
CN113158554A (en) * 2021-03-25 2021-07-23 腾讯科技(深圳)有限公司 Model optimization method and device, computer equipment and storage medium
CN113657267A (en) * 2021-08-17 2021-11-16 中国科学院长春光学精密机械与物理研究所 Semi-supervised pedestrian re-identification model, method and device
CN113792574A (en) * 2021-07-14 2021-12-14 哈尔滨工程大学 Cross-data-set expression recognition method based on metric learning and teacher student model
CN113920574A (en) * 2021-12-15 2022-01-11 深圳市视美泰技术股份有限公司 Training method and device for picture quality evaluation model, computer equipment and medium
CN114548259A (en) * 2022-02-18 2022-05-27 东北大学 PISA fault identification method based on Semi-supervised Semi-KNN model
CN112101217B (en) * 2020-09-15 2024-04-26 镇江启迪数字天下科技有限公司 Pedestrian re-identification method based on semi-supervised learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446666A (en) * 2018-04-04 2018-08-24 平安科技(深圳)有限公司 The training of binary channels neural network model and face comparison method, terminal and medium
CN109154973A (en) * 2016-05-20 2019-01-04 奇跃公司 Execute the method and system of convolved image transformation estimation
CN110135295A (en) * 2019-04-29 2019-08-16 华南理工大学 A kind of unsupervised pedestrian recognition methods again based on transfer learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109154973A (en) * 2016-05-20 2019-01-04 奇跃公司 Execute the method and system of convolved image transformation estimation
CN108446666A (en) * 2018-04-04 2018-08-24 平安科技(深圳)有限公司 The training of binary channels neural network model and face comparison method, terminal and medium
WO2019192121A1 (en) * 2018-04-04 2019-10-10 平安科技(深圳)有限公司 Dual-channel neural network model training and human face comparison method, and terminal and medium
CN110135295A (en) * 2019-04-29 2019-08-16 华南理工大学 A kind of unsupervised pedestrian recognition methods again based on transfer learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
安浩南; 赵明; 潘胜达; 林长青: "Infrared target fusion detection algorithm based on pseudo-modality conversion" *
贾迪; 朱宁丹; 杨宁华; 吴思; 李玉秀; 赵明远: "A survey of image matching methods" *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724867B (en) * 2020-06-24 2022-09-09 中国科学技术大学 Molecular property measurement method, molecular property measurement device, electronic apparatus, and storage medium
CN111724867A (en) * 2020-06-24 2020-09-29 中国科学技术大学 Molecular property measurement method, molecular property measurement device, electronic apparatus, and storage medium
CN111783646A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Training method, device, equipment and storage medium of pedestrian re-identification model
CN111783646B (en) * 2020-06-30 2024-01-23 北京百度网讯科技有限公司 Training method, device, equipment and storage medium of pedestrian re-identification model
CN112101217B (en) * 2020-09-15 2024-04-26 镇江启迪数字天下科技有限公司 Pedestrian re-identification method based on semi-supervised learning
CN112101217A (en) * 2020-09-15 2020-12-18 镇江启迪数字天下科技有限公司 Pedestrian re-identification method based on semi-supervised learning
CN112418331A (en) * 2020-11-26 2021-02-26 国网甘肃省电力公司电力科学研究院 Clustering fusion-based semi-supervised learning pseudo label assignment method
CN112614150A (en) * 2020-12-18 2021-04-06 中山大学 Off-line pedestrian tracking method, system and storage medium based on dual-model interactive semi-supervised learning
CN112686912A (en) * 2021-01-05 2021-04-20 南开大学 Acute stroke lesion segmentation method based on gradual learning and mixed samples
CN112686912B (en) * 2021-01-05 2022-06-10 南开大学 Acute stroke lesion segmentation method based on gradual learning and mixed samples
CN112949384A (en) * 2021-01-23 2021-06-11 西北工业大学 Remote sensing image scene classification method based on antagonistic feature extraction
CN112949384B (en) * 2021-01-23 2024-03-08 西北工业大学 Remote sensing image scene classification method based on antagonistic feature extraction
CN113158554A (en) * 2021-03-25 2021-07-23 腾讯科技(深圳)有限公司 Model optimization method and device, computer equipment and storage medium
CN113158554B (en) * 2021-03-25 2023-02-14 腾讯科技(深圳)有限公司 Model optimization method and device, computer equipment and storage medium
CN112861825B (en) * 2021-04-07 2023-07-04 北京百度网讯科技有限公司 Model training method, pedestrian re-recognition method, device and electronic equipment
WO2022213717A1 (en) * 2021-04-07 2022-10-13 北京百度网讯科技有限公司 Model training method and apparatus, person re-identification method and apparatus, and electronic device
CN112861825A (en) * 2021-04-07 2021-05-28 北京百度网讯科技有限公司 Model training method, pedestrian re-identification method, device and electronic equipment
CN113792574B (en) * 2021-07-14 2023-12-19 哈尔滨工程大学 Cross-dataset expression recognition method based on metric learning and teacher student model
CN113792574A (en) * 2021-07-14 2021-12-14 哈尔滨工程大学 Cross-data-set expression recognition method based on metric learning and teacher student model
CN113657267B (en) * 2021-08-17 2024-01-12 中国科学院长春光学精密机械与物理研究所 Semi-supervised pedestrian re-identification method and device
CN113657267A (en) * 2021-08-17 2021-11-16 中国科学院长春光学精密机械与物理研究所 Semi-supervised pedestrian re-identification model, method and device
CN113920574B (en) * 2021-12-15 2022-03-18 深圳市视美泰技术股份有限公司 Training method and device for picture quality evaluation model, computer equipment and medium
CN113920574A (en) * 2021-12-15 2022-01-11 深圳市视美泰技术股份有限公司 Training method and device for picture quality evaluation model, computer equipment and medium
CN114548259A (en) * 2022-02-18 2022-05-27 东北大学 PISA fault identification method based on Semi-supervised Semi-KNN model
CN114548259B (en) * 2022-02-18 2023-10-10 东北大学 PISA fault identification method based on Semi-supervised Semi-KNN model

Similar Documents

Publication Publication Date Title
CN111027421A (en) Graph-based direct-push type semi-supervised pedestrian re-identification method
CN111126360B (en) Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model
CN109740413B (en) Pedestrian re-identification method, device, computer equipment and computer storage medium
CN109344787B (en) Specific target tracking method based on face recognition and pedestrian re-recognition
CN110414368B (en) Unsupervised pedestrian re-identification method based on knowledge distillation
CN109934117B (en) Pedestrian re-identification detection method based on generation of countermeasure network
CN108960080B (en) Face recognition method based on active defense image anti-attack
US11263435B2 (en) Method for recognizing face from monitoring video data
CN105701467B (en) A kind of more people's abnormal behaviour recognition methods based on human figure feature
CN110717411A (en) Pedestrian re-identification method based on deep layer feature fusion
CN109711366B (en) Pedestrian re-identification method based on group information loss function
CN109299707A (en) A kind of unsupervised pedestrian recognition methods again based on fuzzy depth cluster
CN111639564B (en) Video pedestrian re-identification method based on multi-attention heterogeneous network
CN110728216A (en) Unsupervised pedestrian re-identification method based on pedestrian attribute adaptive learning
CN111723645A (en) Multi-camera high-precision pedestrian re-identification method for in-phase built-in supervised scene
CN111310662B (en) Flame detection and identification method and system based on integrated deep network
CN109919073B (en) Pedestrian re-identification method with illumination robustness
CN108345900B (en) Pedestrian re-identification method and system based on color texture distribution characteristics
CN111950372A (en) Unsupervised pedestrian re-identification method based on graph convolution network
CN112819065A (en) Unsupervised pedestrian sample mining method and unsupervised pedestrian sample mining system based on multi-clustering information
CN111488760A (en) Few-sample pedestrian re-identification method based on deep multi-example learning
CN112115780A (en) Semi-supervised pedestrian re-identification method based on deep multi-model cooperation
CN113065409A (en) Unsupervised pedestrian re-identification method based on camera distribution difference alignment constraint
CN111695531A (en) Cross-domain pedestrian re-identification method based on heterogeneous convolutional network
CN115049894A (en) Target re-identification method of global structure information embedded network based on graph learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (effective date of abandoning: 2024-02-02)