CN112395997B - Weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning - Google Patents
Weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning
- Publication number
- CN112395997B (application CN202011303629.5A)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- bag
- label
- model
- pseudo
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention provides a weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning. Pedestrian images are first grouped into bags according to shooting time periods, and each bag is assigned a bag category label. The dependency relationships among the pictures in each bag are then captured to generate a reliable pseudo pedestrian category label for each picture, and these pseudo labels serve as supervision information for training the re-identification model. The re-identification model and the graph model are then trained jointly: a linear combination of the graph-model loss and the re-identification loss is used as the total loss function, and the parameters of all network layers are updated with the back-propagation algorithm. The invention achieves leading model performance without heavy manual labeling cost and with almost no increase in computational complexity.
Description
Technical Field
The invention relates to the technical field of machine vision, and in particular to a weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning.
Background
At present, there are three main approaches to the pedestrian re-identification problem: (1) extracting discriminative features; (2) learning a stable metric or subspace for matching; (3) combining the two. However, most implementations require strongly supervised training labels, i.e., manual annotation of every picture in the dataset. There are also unsupervised pedestrian re-identification methods that avoid manual labeling by using local saliency matching or clustering models, but they struggle to model the large appearance differences across camera views and therefore have difficulty reaching high accuracy. In contrast, the weakly supervised pedestrian re-identification method provided by the invention achieves high accuracy without expensive manual labeling.
Weakly supervised learning: although training deep neural networks with weakly supervised methods is challenging, it has been studied for tasks such as image classification, semantic segmentation, and object detection. Like those studies, the invention is based on generating pseudo labels, but weakly supervised pedestrian re-identification has two distinctive characteristics: (1) no representative image exists for each individual pedestrian, and labels are ambiguous because people can change clothes within a short time; (2) the label entropy is higher than in other tasks; for example, the pixels of an image in weakly supervised semantic segmentation are relatively stable, whereas pedestrians in re-identification are far more disordered and irregular. Both characteristics increase the difficulty of weakly supervised pedestrian re-identification.
Learning with uncertain labels: single-sample (one-shot) pedestrian re-identification is the most closely related setting, but there are two differences: (1) single-sample re-identification requires at least one labeled picture instance per pedestrian category, whereas the dataset of the present method needs no exact pedestrian category labels; (2) the invention introduces the bag category label as a constraint to guide the estimation of pseudo pedestrian category labels, making pseudo-label generation more reliable than in single-sample re-identification.
Pedestrian search: combines pedestrian detection and pedestrian re-identification. The invention differs from it in two main respects: (1) the invention focuses only on visual feature matching, since the capability of current person detectors is adequate; (2) the invention benefits from low-cost weak labels, whereas pedestrian search still requires strong labels for every training picture.
Application number 201710487019.7 discloses an image quality scoring method using a deep generative machine learning model: deep machine learning creates a generative model of an expected good-quality image, which is used to score the quality of images from a medical scanner. The deviation of the input image from the generative model serves as an input feature vector for a discriminative model, which may also operate on further feature vectors derived from the input image. However, that patent does not express graph learning directly as a loss function differentiable with respect to the network parameters, so the loss cannot be optimized by stochastic gradient descent, and integrated training of the graph model and the re-identification model is not achieved.
Disclosure of Invention
The invention provides a weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning: a module that automatically generates training labels is added to the pedestrian re-identification deep neural network and trained jointly with it, reducing algorithmic complexity.
In order to achieve the technical effects, the technical scheme of the invention is as follows:
A weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning comprises the following steps:
s1: grouping pedestrian pictures into bags according to shooting time periods and assigning bag category labels;
s2: capturing the dependency relationships among the pictures in each bag to generate a reliable pseudo pedestrian category label for each picture in the bag, used as supervision information for training the pedestrian re-identification model;
s3: performing integrated training of the pedestrian re-identification model and the graph model;
s4: taking a linear combination of the graph-model loss and the re-identification loss as the total loss function and updating the parameters of all network layers with the back-propagation algorithm.
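Step S1 can be sketched on its own before the model is involved. The sketch below groups pictures into bags by fixed-length shooting periods; the fixed-length period rule and the name `group_into_bags` are illustrative assumptions, not part of the patent.

```python
def group_into_bags(pictures, timestamps, period=60.0):
    """S1 sketch: pictures shot inside the same time period share a bag.
    The fixed-length period is an assumed grouping rule."""
    bags = {}
    for pic, t in zip(pictures, timestamps):
        bags.setdefault(int(t // period), []).append(pic)
    # one bag per occupied period, in order of first appearance
    return list(bags.values())

# toy usage: five picture ids shot at various times (seconds)
bags = group_into_bags([0, 1, 2, 3, 4], [3.0, 10.0, 65.0, 70.0, 130.0])
```

In a real system the bag category label would then be assigned per bag (e.g., the set of identities known to appear in that period).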
Further, the specific process of step S1 is:
Let b denote a bag containing p pictures, i.e., b = {x_1, x_2, …, x_j, …, x_p}; let y = {y_1, y_2, …, y_j, …, y_p} be the corresponding pedestrian category labels, and let l denote the bag category label.
Further, the process of step S2 is:
In weakly supervised pedestrian re-identification only the bag category label l is available, so a pseudo pedestrian category label must be estimated for each picture and is represented by a probability vector Y. Assuming the bag with label l contains n pedestrian categories while the whole training set has m pedestrian categories, the bag category label constrains Y, and the probability vector of the pedestrian category label of each picture x_j is:
Y_j[k] = 1/n if category k appears in l, and Y_j[k] = 0 otherwise, for k = 1, …, m (1)
Further, the process of step S3 is:
A directed graph is defined in which each node represents a picture x_i in a bag and each edge represents the relationship between two pictures. The energy function for assigning pedestrian category labels y to the nodes x of the graph is:
E(y|x) = Σ_{i∈U} Φ(y_i|x_i) + Σ_{(i,j)∈V} Ψ(y_i, y_j|x_i, x_j) (2)
where U and V denote the nodes and edges respectively, Φ(y_i|x_i) is a unary term giving the cost of assigning label y_i to picture x_i, and Ψ(y_i, y_j|x_i, x_j) is a pairwise term giving the penalty for the labels assigned to the picture pair (x_i, x_j); minimizing Eq. (2) suppresses the false labels produced by weakly supervised learning.
The unary term in Eq. (2) is defined as:
Φ(y_i|x_i) = -log(Y_i[y_i]), where Y_i = Y_i ⊙ P_i (3)
where P_i is the pedestrian-category probability computed by the neural network for picture x_i, Y_i is the bag constraint expressed by Eq. (1), ⊙ denotes the element-wise product, and [·] denotes vector indexing.
Because the unary terms of different pictures are computed independently, they are unstable on their own and need to be smoothed by a pairwise term:
Ψ(y_i, y_j|x_i, x_j) = ζ(y_i, y_j) · Y_i[y_i] · Y_j[y_j] · exp(-||x_i - x_j||² / (2σ²)) (4)
where the appearance similarity is computed with a Gaussian kernel over RGB colors, the hyperparameter σ controls the kernel bandwidth, and pictures with similar appearance are constrained to share the same label. The label compatibility ζ(y_i, y_j) is expressed with the Potts model:
ζ(y_i, y_j) = 1 if y_i ≠ y_j, and 0 otherwise (5)
Further, the bag category label contains additional information that improves pseudo-label generation: the estimated pseudo labels are corrected to the pedestrian categories with the highest prediction scores inside the bag, which prevents pictures from being assigned to pedestrian categories that do not occur in the bag. The pseudo pedestrian category label of each picture is then obtained by minimizing Eq. (2):
ŷ = argmin_{y ∈ {1,2,3,…,m}^p} E(y|x) (6)
where {1, 2, 3, …, m} denotes all pedestrian categories in the training set.
Further, in step S3, before the integrated training of the pedestrian re-recognition model and the graph model is performed, the graph model needs to be miniaturized, which specifically includes:
obtaining pseudo pedestrian class labels by using an external graph model for supervising training of a pedestrian re-recognition deep neural network, wherein the calculation for obtaining the pseudo labels by minimizing the formula (2) is not tiny, so that the graph model is incompatible with the deep neural network, and therefore, a relaxation formula (2) is needed to be:
the discrete Φ and ψ are serialized:
the difference between the formula (8) and the formula (3) is that in the non-microchargable model, all possible y needs to be input to the energy function, and the y with the lowest energy is taken as the optimal solution; in the micro-image model, directly inputting a picture x into a deep neural network to obtain a prediction of y; the difference between equation (9) and equation (4) is that the cross entropy term- (Y) i P i ) T log(Y j P j ) Approximation of the non-differentiable term ζ (y) in equation (4) i ,y j )Y i Y j 。
Further, in step S4 the optimization objective comprises the graph-model loss L_graph and the classification (re-identification) loss L_cls. L_cls is a softmax cross-entropy loss supervised by the pseudo labels ŷ:
L_cls = -(1/n) Σ_{i=1}^{n} onehot(ŷ_i)^T log(P_i) (10)
where onehot(·) converts ŷ_i into a one-hot vector, n is the number of pictures in a bag, and P_i is the pedestrian-category probability computed by the neural network, a softmax (normalized exponential) of the network output logits z:
P_i[k] = exp(z_k) / Σ_{c=1}^{m} exp(z_c) (11)
where m is the number of pedestrian categories in the training set. The total loss function L is a linear combination of the two losses:
L = w_cls · L_cls + w_graph · L_graph (12)
where w_cls and w_graph are the weights of the two losses, set to 1 and 0.5 respectively.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
The method combines differentiable graph learning with weakly supervised learning: a module that automatically generates training labels is added to the pedestrian re-identification deep neural network and trained jointly with it. Compared with common pedestrian re-identification methods, the method achieves leading model performance without heavy manual labeling cost and with almost no increase in computational complexity.
Drawings
FIG. 1 is a diagram of the graph model that generates pseudo pedestrian category labels for a bag of pictures;
FIG. 2 is a training flow chart of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions;
it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
A weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning comprises the following steps:
1. From supervised pedestrian re-identification to weakly supervised pedestrian re-identification
Let b denote a bag containing p pictures, i.e., b = {x_1, x_2, …, x_j, …, x_p}; let y = {y_1, y_2, …, y_j, …, y_p} be the pedestrian category labels, and let l denote the bag category label. Supervised pedestrian re-identification uses the pedestrian category labels y to supervise the classification predictions of the model. In weakly supervised re-identification only the bag category label l is available, so a pseudo pedestrian category label must first be estimated for each picture and is represented by a probability vector Y. Assuming l contains n pedestrian categories while the whole training set has m pedestrian categories, the bag category label constrains Y, and the probability vector of the pedestrian category label of each picture x_j is:
Y_j[k] = 1/n if category k appears in l, and Y_j[k] = 0 otherwise, for k = 1, …, m (1)
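The bag constraint of Eq. (1) can be sketched numerically. The sketch below assumes the reconstruction above (uniform mass over the n in-bag classes); `bag_constrained_prior` is a hypothetical helper name.

```python
import numpy as np

def bag_constrained_prior(bag_classes, m):
    """Probability vector Y_j of Eq. (1): uniform over the n pedestrian
    classes listed in the bag label l, zero for all other classes."""
    Y = np.zeros(m)
    idx = sorted(bag_classes)
    Y[idx] = 1.0 / len(idx)
    return Y

# a bag whose label contains classes {1, 4} out of m = 6 training classes
Y = bag_constrained_prior({1, 4}, m=6)
```

Every picture in the same bag starts from the same constrained prior; the network predictions then sharpen it per picture.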
2. Weakly supervised pedestrian re-identification based on differentiable graph learning
Graph-model pedestrian re-identification
As shown in FIG. 1, a directed graph is defined in which each node represents a picture x_i in a bag and each edge represents the relationship between two pictures. The energy function for assigning pedestrian category labels y to the nodes x of the graph is:
E(y|x) = Σ_{i∈U} Φ(y_i|x_i) + Σ_{(i,j)∈V} Ψ(y_i, y_j|x_i, x_j) (2)
where U and V denote the nodes and edges respectively, Φ(y_i|x_i) is a unary term giving the cost of assigning label y_i to picture x_i, and Ψ(y_i, y_j|x_i, x_j) is a pairwise term giving the penalty for the labels assigned to the picture pair (x_i, x_j). Minimizing Eq. (2) suppresses the false labels produced by weakly supervised learning.
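Given precomputed unary and pairwise costs, the energy of a labeling is a direct sum, which the sketch below makes concrete. The cost tables and function names are hypothetical illustrations of Eq. (2), not the patent's implementation.

```python
def graph_energy(labels, unary, pairwise, edges):
    """E(y|x) of Eq. (2): unary cost Phi summed over the nodes U plus
    pairwise cost Psi summed over the directed edges V."""
    e = sum(unary[i][labels[i]] for i in range(len(labels)))
    e += sum(pairwise(labels[i], labels[j]) for i, j in edges)
    return e

# two pictures, two candidate labels; Potts-style pairwise penalty
unary = [[0.0, 1.0], [1.0, 0.0]]
potts_pair = lambda yi, yj: 0.5 if yi != yj else 0.0
e_diff = graph_energy([0, 1], unary, potts_pair, edges=[(0, 1)])  # 0.0 + 0.0 + 0.5
e_same = graph_energy([0, 0], unary, potts_pair, edges=[(0, 1)])  # 0.0 + 1.0 + 0.0
```

Here the labeling [0, 1] wins because its lower unary cost outweighs the disagreement penalty.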
Unary term
The unary term in Eq. (2) is defined as:
Φ(y_i|x_i) = -log(Y_i[y_i]), where Y_i = Y_i ⊙ P_i (3)
where P_i is the pedestrian-category probability computed by the neural network for picture x_i, Y_i is the bag constraint expressed by Eq. (1), ⊙ denotes the element-wise product, and [·] denotes vector indexing.
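A numerical sketch of the unary term in Eq. (3) follows. Renormalizing the element-wise product to sum to one, and the `eps` guard, are added assumptions; the text only specifies the product itself.

```python
import numpy as np

def unary_cost(Y_bag, P_net, y, eps=1e-12):
    """Phi(y|x_i) = -log(Y_i[y]) with Y_i = Y_bag ⊙ P_i, per Eq. (3).
    Renormalization of Y_i is an assumption added for illustration."""
    Yi = Y_bag * P_net          # element-wise product: bag prior x network
    Yi = Yi / (Yi.sum() + eps)  # assumed renormalization
    return float(-np.log(Yi[y] + eps))

Y_bag = np.array([0.5, 0.5, 0.0])      # bag constraint from Eq. (1)
P_net = np.array([0.2, 0.6, 0.2])      # network class probabilities
cost = unary_cost(Y_bag, P_net, y=1)   # Yi = [0.25, 0.75, 0.0]
```

Note how the bag prior zeroes out class 2 entirely: labels outside the bag get infinite cost regardless of the network score.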
Pairwise term
Because the unary terms of different pictures are computed independently, they are unstable on their own and need to be smoothed by a pairwise term:
Ψ(y_i, y_j|x_i, x_j) = ζ(y_i, y_j) · Y_i[y_i] · Y_j[y_j] · exp(-||x_i - x_j||² / (2σ²)) (4)
where the appearance similarity is computed with a Gaussian kernel over RGB colors, the hyperparameter σ controls the kernel bandwidth, and pictures with similar appearance are constrained to share the same label; Y_i[y_i] Y_j[y_j] expresses the bag constraint. The label compatibility ζ(y_i, y_j) is expressed with the Potts model:
ζ(y_i, y_j) = 1 if y_i ≠ y_j, and 0 otherwise (5)
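The two ingredients of the pairwise term, Potts label compatibility and Gaussian-kernel appearance similarity, can each be sketched in a few lines. The stand-in feature vectors for RGB appearance are an assumption for illustration.

```python
import numpy as np

def potts(yi, yj):
    """Label compatibility of Eq. (5): penalize only disagreement."""
    return 1.0 if yi != yj else 0.0

def appearance_similarity(xi, xj, sigma=1.0):
    """Gaussian kernel over appearance features; the hyperparameter
    sigma controls the kernel bandwidth."""
    return float(np.exp(-np.sum((xi - xj) ** 2) / (2.0 * sigma ** 2)))

a = np.array([0.9, 0.1, 0.1])  # stand-in RGB features of two pictures
b = np.array([0.9, 0.1, 0.1])
sim = appearance_similarity(a, b)  # identical appearance
```

Multiplying the two (together with the bag-constraint factor) recovers the pairwise penalty of Eq. (4): visually similar pictures pay the full Potts penalty when their labels disagree.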
bag restraint
In effect, the bag category label contains additional information to improve the generation of the pseudo label: correcting the estimated pseudo tag to be the pedestrian classification with the highest predictive score in the bag; causing a partial picture to be assigned to a pedestrian category that is not predicted.
Inference of pseudo pedestrian category labels
The pseudo pedestrian category label of each picture is obtained by minimizing Eq. (2):
ŷ = argmin_{y ∈ {1,2,3,…,m}^p} E(y|x) (6)
where {1, 2, 3, …, m} denotes all pedestrian categories in the training set.
Differentiable graph learning
Existing weakly supervised pedestrian re-identification methods are not trained end-to-end, because an external graph model is needed to obtain pseudo pedestrian category labels for supervising the training of the re-identification deep neural network. The minimization of Eq. (2) that yields the pseudo labels is not differentiable, making the graph model incompatible with the deep neural network; Eq. (2) therefore needs to be relaxed, and the discrete Φ and Ψ are replaced by continuous counterparts in Eqs. (8) and (9). The difference between Eq. (8) and Eq. (3) is that in the non-differentiable model every possible labeling y must be fed into the energy function and the lowest-energy y taken as the optimal solution, whereas in the differentiable graph model the picture x is fed directly into the deep neural network to obtain a prediction of y. The difference between Eq. (9) and Eq. (4) is that the cross-entropy term -(Y_i P_i)^T log(Y_j P_j) approximates the non-differentiable term ζ(y_i, y_j) Y_i Y_j in Eq. (4).
3. Overall neural network structure
FIG. 2 shows the network structure for training and inference, with dashed lines representing the training data flow and solid lines the inference data flow; the graph model participates only in the training phase. The overall structure comprises three main modules:
feature extraction module
As shown in FIG. 2 (a), the last layer of the original ResNet-50 is removed using ResNet-50 as the backbone network, and replaced with a fully connected layer with 512-dimensional output, a batch normalization, a leaky linear rectification function, and a dropout.
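The replacement head just described (512-d fully connected layer, batch normalization, leaky ReLU, dropout) can be sketched in plain NumPy. The input shapes, the 0.01 leaky slope, and inverted dropout are assumptions; in the real model this head sits on top of the ResNet-50 backbone features.

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_head(feats, W, b, gamma, beta, p_drop=0.5, train=False, eps=1e-5):
    """FC -> batch norm -> leaky ReLU -> dropout, as in FIG. 2(a)."""
    z = feats @ W + b                              # fully connected, 512-d out
    mu, var = z.mean(axis=0), z.var(axis=0)        # batch statistics
    z = gamma * (z - mu) / np.sqrt(var + eps) + beta
    z = np.where(z > 0.0, z, 0.01 * z)             # leaky ReLU, assumed slope
    if train:                                      # inverted dropout
        keep = rng.random(z.shape) >= p_drop
        z = z * keep / (1.0 - p_drop)
    return z

feats = rng.normal(size=(4, 8))                    # stand-in backbone features
W = rng.normal(size=(8, 512)) * 0.1
out = feature_head(feats, W, np.zeros(512), np.ones(512), np.zeros(512))
```

In inference mode (`train=False`) the dropout branch is skipped, matching the usual train/test distinction.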
Coarse pedestrian re-identification module
As shown in FIG. 2(b), a fully connected layer whose output dimension equals the number of pedestrian categories is added on top of the feature extraction module, with softmax cross entropy as the loss function. The pedestrian category prediction score serves as a coarse re-identification estimate, representing the pedestrian-category probabilities of the pictures in bag b.
Refined pedestrian re-identification module
As shown in FIG. 2(c), the coarse re-identification scores, the appearance similarity, and the bag constraint are fed into the graph model according to Eqs. (8) and (9); the pseudo labels generated by the graph model can then be used to update the network parameters just like manually annotated real labels.
4. Optimization
Once the pseudo pedestrian category labels are obtained, the gradient of the total loss with respect to the deep-neural-network parameters can be computed and propagated back through all layers of the network, realizing integrated training of all parameters of the weakly supervised model.
Loss function
The optimization objective of the method comprises the graph-model loss L_graph and the classification (re-identification) loss L_cls. L_cls is a softmax cross-entropy loss supervised by the pseudo labels ŷ:
L_cls = -(1/n) Σ_{i=1}^{n} onehot(ŷ_i)^T log(P_i) (10)
where onehot(·) converts ŷ_i into a one-hot vector, n is the number of pictures in a bag, and P_i is the pedestrian-category probability computed by the neural network, a softmax (normalized exponential) of the network output logits z:
P_i[k] = exp(z_k) / Σ_{c=1}^{m} exp(z_c) (11)
where m is the number of pedestrian categories in the training set.
The total loss function L is a linear combination of these two loss functions:
L = w_cls · L_cls + w_graph · L_graph (12)
where w_cls and w_graph are the weights of the two losses, set in this method to 1 and 0.5 respectively.
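The loss computation of Eqs. (10) to (12) can be sketched in NumPy: softmax over the m class logits, cross entropy against one-hot pseudo labels, and the weighted combination. The graph-loss value used here is a stand-in scalar, since computing the real L_graph requires the full graph model.

```python
import numpy as np

def softmax(z):
    """Eq. (11): normalized exponential of the network logits z."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # shift for stability
    return e / e.sum(axis=-1, keepdims=True)

def classification_loss(logits, pseudo_labels):
    """Eq. (10): cross entropy against one-hot pseudo labels, averaged
    over the n pictures of a bag."""
    P = softmax(logits)
    n = len(pseudo_labels)
    return float(-np.mean(np.log(P[np.arange(n), pseudo_labels])))

def total_loss(cls_loss, graph_loss, w_cls=1.0, w_graph=0.5):
    """Eq. (12): linear combination with the weights stated in the text."""
    return w_cls * cls_loss + w_graph * graph_loss

logits = np.array([[2.0, 0.0], [0.0, 2.0]])   # n = 2 pictures, m = 2 classes
L_cls = classification_loss(logits, [0, 1])
L = total_loss(L_cls, graph_loss=0.2)         # 0.2 is a stand-in graph loss
```

With both pictures confidently predicting their pseudo labels, L_cls reduces to log(1 + e^-2) per picture, and the total loss adds half the graph loss on top.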
The same or similar reference numerals correspond to the same or similar components;
the positional relationship depicted in the drawings is for illustrative purposes only and is not to be construed as limiting the present patent;
it is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications of the above teachings will be apparent to those of ordinary skill in the art. It is not necessary here nor is it exhaustive of all embodiments. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the invention are desired to be protected by the following claims.
Claims (8)
1. A weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning, characterized by comprising the following steps:
s1: grouping pedestrian pictures into bags according to shooting time periods and assigning bag category labels;
s2: capturing the dependency relationships among the pictures in each bag to generate a pseudo pedestrian category label for each picture in the bag, used as supervision information for training the pedestrian re-identification model;
s3: performing integrated training of the pedestrian re-identification model and the graph model;
s4: taking a linear combination of the graph-model loss and the re-identification loss as the total loss function and updating the parameters of all network layers with the back-propagation algorithm;
the process of the step S2 is as follows:
if only the bag type label l is available for the weak supervision pedestrian re-identification, estimating a pseudo pedestrian type label for each picture, and representing the pseudo pedestrian type label by a probability vector Y; assuming that the bag under the class label comprises n pedestrian classes, the whole training set has m pedestrian classes, and the bag class label is used for limiting Y, each picture x j The probability vector for a pedestrian category label is:
the process of the step S3 is as follows:
defining a directed graph, each node representing a picture x in a bag i Each edge represents a relationship between pictures, and the energy function of assigning the pedestrian category label y to the node x on the graph is as follows:
wherein U and V represent nodes and edges, respectively, Φ (y i |x i ) Is calculated as picture x i Dispensing label y i Is a term of the cost of (a), ψ (y i ,y j |x i ;x j ) Is calculated as a picture pair (x i ,x j ) Assigning a penalty pair term for the tag, equation (2) eliminates false tags generated by weak supervised learning;
the univariate term in equation (2) is defined as:
Φ(y i |x i )=-log(Y i [y i ]) Wherein y=y i ⊙P i (3)
Wherein P is i Is the neural network as the picture x i The calculated probability of the pedestrian category label represents network output; y is Y i Is the bag constraint expressed by formula (1), and by which is the element-wise product, []Representing a vector index;
because the output of the univariate items of different pictures are independent, the univariate items are unstable, and the smoothing of paired items is needed:
the method comprises the steps of calculating appearance similarity by using a Gaussian kernel based on RGB colors, controlling the size of the Gaussian kernel by using a super parameter sigma, and limiting pictures with similar appearance to have the same label; y is Y i [y i ]Y j [y j ]Indicating that the bag is to be restrained,representing appearance similarity;
label compatibility ζ (y) i ,y j ) Expressed in the glass's model:
2. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning according to claim 1, wherein the specific process of step S1 is:
let b denote a bag containing p pictures, i.e., b = {x_1, x_2, …, x_j, …, x_p}; y = {y_1, y_2, …, y_j, …, y_p} are the pedestrian category labels, and l denotes the bag category label.
3. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning according to claim 1, wherein the bag category label contains additional information that improves pseudo-label generation: the estimated pseudo labels are corrected to the pedestrian categories with the highest prediction scores inside the bag, which prevents pictures from being assigned to pedestrian categories that do not occur in the bag.
4. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning according to claim 3, wherein the pseudo pedestrian category label of each picture is obtained by minimizing Eq. (2):
ŷ = argmin_{y ∈ {1,2,3,…,m}^p} E(y|x) (6)
where {1, 2, 3, …, m} denotes all pedestrian categories in the training set.
5. The weakly supervised training method for a pedestrian re-identification model based on differentiable graph learning according to claim 4, wherein in step S3, before the integrated training of the re-identification model and the graph model, the graph model is made differentiable, specifically:
using an external graph model to obtain pseudo pedestrian category labels for supervising the training of the re-identification deep neural network is problematic because the minimization of Eq. (2) is not differentiable, making the graph model incompatible with the deep neural network; Eq. (2) therefore needs to be relaxed, and the discrete Φ and Ψ are replaced by continuous counterparts in Eqs. (8) and (9); the difference between Eq. (8) and Eq. (3) is that in the non-differentiable model every possible labeling y must be fed into the energy function and the lowest-energy y taken as the optimal solution, whereas in the differentiable graph model the picture x is fed directly into the deep neural network to obtain a prediction of y; the difference between Eq. (9) and Eq. (4) is that the cross-entropy term -(Y_i P_i)^T log(Y_j P_j) approximates the non-differentiable term ζ(y_i, y_j) Y_i Y_j in Eq. (4).
6. The weak supervision training method based on the micro-image learning pedestrian re-recognition model according to claim 5, wherein in the step S4, the image model loses L Drawing of the figure And a loss of classification/re-identification L Classification ,L Classification Is a pseudo tagAs a supervised normalized exponential cross entropy loss function:
wherein the method comprises the steps ofThe representation will->Function converted into a single thermal vector, n representing the number of pictures in a bag, P i The probability representing the pedestrian category calculated by the neural network is a normalized exponential function of the logarithm of the network output z:
where m represents the number of pedestrian categories in the training set; the total loss function L is a linear combination of the two loss functions:
$L = w_{cls} L_{cls} + w_{graph} L_{graph}$ (12)
where $w_{cls}$ and $w_{graph}$ represent the weights of the two losses, respectively.
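The pseudo-label classification loss of claim 6 can be sketched as follows, assuming integer pseudo labels per picture; the helper names (`onehot`, `classification_loss`) are illustrative, not the patent's:

```python
import numpy as np

def softmax(z):
    # numerically stable normalized exponential over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def onehot(labels, m):
    # convert integer pseudo labels into one-hot vectors over m categories
    out = np.zeros((len(labels), m))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def classification_loss(z, pseudo_labels, m):
    """Softmax cross-entropy supervised by pseudo pedestrian-category labels,
    averaged over the n pictures in a bag.

    z: (n, m) network logits; pseudo_labels: length-n integer labels.
    """
    P = softmax(z)                # (n, m) category probabilities
    Y = onehot(pseudo_labels, m)  # (n, m) one-hot pseudo labels
    n = z.shape[0]
    return -float((Y * np.log(P + 1e-12)).sum()) / n
```

The loss is small when the network's highest-probability category matches the pseudo label for each picture in the bag, and grows as predictions disagree with the pseudo labels.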
7. The weakly supervised training method based on the differentiable-graph-learning pedestrian re-identification model according to claim 6, wherein $w_{cls}$ is set to 1.
8. The weakly supervised training method based on the differentiable-graph-learning pedestrian re-identification model according to claim 7, wherein $w_{graph}$ is set to 0.5.
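With the weight values from claims 7 and 8, equation (12) reduces to a simple weighted sum; a minimal sketch (function and parameter names assumed):

```python
def total_loss(loss_cls, loss_graph, w_cls=1.0, w_graph=0.5):
    # L = w_cls * L_cls + w_graph * L_graph, per equation (12);
    # defaults follow claim 7 (w_cls = 1) and claim 8 (w_graph = 0.5)
    return w_cls * loss_cls + w_graph * loss_graph
```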
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011303629.5A CN112395997B (en) | 2020-11-19 | 2020-11-19 | Weak supervision training method based on pedestrian re-recognition model capable of micro-graph learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112395997A CN112395997A (en) | 2021-02-23 |
CN112395997B true CN112395997B (en) | 2023-11-24 |
Family
ID=74605913
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112395997B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112949630B (en) * | 2021-03-01 | 2024-03-19 | 北京交通大学 | Weak supervision target detection method based on frame hierarchical screening |
CN113128410A (en) * | 2021-04-21 | 2021-07-16 | 湖南大学 | Weak supervision pedestrian re-identification method based on track association learning |
CN113688781B (en) * | 2021-09-08 | 2023-09-15 | 北京邮电大学 | Pedestrian re-identification anti-attack method capable of shielding elasticity |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019136946A1 (en) * | 2018-01-15 | 2019-07-18 | 中山大学 | Deep learning-based weakly supervised salient object detection method and system |
CN110942025A (en) * | 2019-11-26 | 2020-03-31 | 河海大学 | Unsupervised cross-domain pedestrian re-identification method based on clustering |
CN111445488A (en) * | 2020-04-22 | 2020-07-24 | 南京大学 | Method for automatically identifying and segmenting salt body through weak supervised learning |
CN111723645A (en) * | 2020-04-24 | 2020-09-29 | 浙江大学 | Multi-camera high-precision pedestrian re-identification method for in-phase built-in supervised scene |
CN111860678A (en) * | 2020-07-29 | 2020-10-30 | 中国矿业大学 | Unsupervised cross-domain pedestrian re-identification method based on clustering |
Non-Patent Citations (1)
Title |
---|
Weakly Supervised Image Semantic Segmentation Based on Deep Convolutional Neural Networks; Zheng Baoyu; Wang Yu; Wu Jinwen; Zhou Quan; Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), No. 05, pp. 5-16 * |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| CB03 | Change of inventor or designer information | Inventors after: Nie Lin; Zhang Jiqi; Lin Jing; Wang Guangrun; Wang Guangcong. Inventors before: Zhang Jiqi; Lin Jing; Nie Lin; Wang Guangrun; Wang Guangcong. |
| GR01 | Patent grant | |